rsync is probably the most common backup solution on Linux, because it is very fast and works incrementally (it transfers only the differences between two versions of a file), which is especially useful for network backups.
However, synchronizing a folder to a backup device is far from a real backup, because it does not keep old versions of files. This problem can be solved with rsync itself, as described in Mike Rubel's well-known article Easy Automated Snapshot-Style Backups with Linux and Rsync.
rsnapshot offers this functionality, using rsync and perl:
- based on the configured intervals, it creates snapshot folders such as daily.0, daily.1, daily.2, ..., weekly.0, weekly.1, weekly.2, ... (for example, keeping daily backups for 7 days, weekly backups for 4 weeks and monthly backups for six months)
- it uses hard links (which must be supported by the file system) for files that have not changed between snapshots, in order to consume less disk space
- it provides a very friendly interface: all files in a snapshot are available with just a file manager (making restore incredibly easy compared to most other backup solutions)
- files that have changed are saved again in full (rsnapshot does not store the differences within a file). From this perspective, it can be considered a differential backup solution. Every snapshot is a complete tree, so deleting one snapshot does not break the others. However, an unchanged file exists only once on disk, so if that single copy is damaged, every snapshot that links to it is affected
- it is simple, fast and reliable
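The hard-link idea behind these snapshots can be tried out in a scratch directory. The following is a minimal sketch, not rsnapshot's actual code: it assumes GNU cp and stat, and all directory and file names are hypothetical. It rotates daily.0 to daily.1, rebuilds daily.0 as a hard-link copy, and then replaces one changed file.

```shell
#!/usr/bin/env bash
# Scratch area; every name below is made up for this illustration.
work=$(mktemp -d)
mkdir -p "$work/source" "$work/snap"
echo "version 1" > "$work/source/changed.txt"
echo "stable" > "$work/source/unchanged.txt"

# First snapshot: a plain copy of the source.
cp -a "$work/source" "$work/snap/daily.0"

# Rotate, then rebuild daily.0 as hard links to daily.1 (GNU cp -l).
mv "$work/snap/daily.0" "$work/snap/daily.1"
cp -al "$work/snap/daily.1" "$work/snap/daily.0"

# The source changes; the new version is written as a NEW file in daily.0.
echo "version 2" > "$work/source/changed.txt"
rm "$work/snap/daily.0/changed.txt"   # removes one link, the data stays in daily.1
cp -a "$work/source/changed.txt" "$work/snap/daily.0/changed.txt"

# Unchanged files share one inode; the changed file exists in both versions.
inode_new=$(stat -c %i "$work/snap/daily.0/unchanged.txt")
inode_old=$(stat -c %i "$work/snap/daily.1/unchanged.txt")
old_content=$(cat "$work/snap/daily.1/changed.txt")
new_content=$(cat "$work/snap/daily.0/changed.txt")
echo "unchanged.txt inodes: $inode_new $inode_old"
echo "daily.1: $old_content / daily.0: $new_content"
```

Both inode numbers printed for unchanged.txt are the same, which is why an unchanged file costs disk space only once no matter how many snapshots contain it.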
You cannot use Amazon S3 (or similar services) with rsnapshot. This is a remarkable limitation. In this case you can use Duplicity.
Disk usage
To determine the actual space occupied by the files in an rsnapshot tree, use du (disk usage) as follows:
du -csh /store/backup_athena/rsnapshot/*
The result is:
254G /store/backup_athena/rsnapshot/daily.0
26G /store/backup_athena/rsnapshot/daily.1
18G /store/backup_athena/rsnapshot/daily.10
494M /store/backup_athena/rsnapshot/daily.11
643M /store/backup_athena/rsnapshot/daily.12
545M /store/backup_athena/rsnapshot/daily.13
9.5G /store/backup_athena/rsnapshot/daily.2
9.3G /store/backup_athena/rsnapshot/daily.3
9.4G /store/backup_athena/rsnapshot/daily.4
9.5G /store/backup_athena/rsnapshot/daily.5
728M /store/backup_athena/rsnapshot/daily.6
646M /store/backup_athena/rsnapshot/daily.7
761M /store/backup_athena/rsnapshot/daily.8
561M /store/backup_athena/rsnapshot/daily.9
40G /store/backup_athena/rsnapshot/monthly.0
709M /store/backup_athena/rsnapshot/weekly.0
718M /store/backup_athena/rsnapshot/weekly.1
1.5G /store/backup_athena/rsnapshot/weekly.2
382G total
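The daily.1 folder contains all the unchanged files of daily.0 as hard links, plus some more. Because du counts a hard-linked file only once per run, daily.1 is charged only for the 26G that were not already counted under daily.0.
However, if you run du on that folder alone, as follows: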
du -csh /store/backup_athena/rsnapshot/daily.1
you will get:
254G /store/backup_athena/rsnapshot/daily.1
254G total
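The same effect can be reproduced on a small scale. The sketch below assumes GNU du and uses hypothetical file names: one file is hard-linked into two folders, and du is run once on both folders and once on the second folder alone.

```shell
#!/usr/bin/env bash
work=$(mktemp -d)
mkdir "$work/daily.0" "$work/daily.1"

# 1 MiB of random data in daily.0, hard-linked (not copied) into daily.1.
head -c 1048576 /dev/urandom > "$work/daily.0/data.bin"
ln "$work/daily.0/data.bin" "$work/daily.1/data.bin"

# Both folders in one du run: data.bin is charged to daily.0 only.
joint_kb=$(du -sk "$work/daily.0" "$work/daily.1" | awk 'NR==2 {print $1}')

# daily.1 on its own: data.bin is counted in full.
alone_kb=$(du -sk "$work/daily.1" | awk '{print $1}')

echo "daily.1 in a joint run: ${joint_kb}K, alone: ${alone_kb}K"
```

The exact numbers depend on the file system, but the size reported for daily.1 alone should be roughly 1 MiB larger than the size it gets in the joint run.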
Install rsnapshot
Debian/Ubuntu
sudo apt-get install rsnapshot
Redhat/Fedora
yum install rsnapshot
Archlinux
pacman -S rsnapshot
Configure rsnapshot
CAUTION: in rsnapshot.conf, parameters and their values must be separated by a tab (e.g. cmd_cp → /bin/cp), not by spaces.
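A quick way to spot lines that were typed with spaces is to list every non-comment, non-empty line that contains no tab at all. This is only a rough sketch with a hypothetical sample file; it does not replace rsnapshot's own syntax check, and it cannot catch a line that mixes tabs and spaces.

```shell
#!/usr/bin/env bash
# Hypothetical sample config: line 1 uses a tab, line 2 uses spaces by mistake.
conf=$(mktemp)
printf 'snapshot_root\t/store/rsnapshot/\n' > "$conf"
printf 'retain daily 14\n' >> "$conf"
printf '# a comment line\n' >> "$conf"

tab=$(printf '\t')
# Drop comments and blank lines, then keep only the lines without any tab.
bad_lines=$(grep -nEv '^[[:space:]]*(#|$)' "$conf" | grep -v "$tab" || true)
echo "$bad_lines"
```

For the sample file this prints the second line together with its line number. To check a real configuration, point conf at /etc/rsnapshot.conf instead of the temporary file.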
The basic settings are:
- snapshot_root: the location on disk where the backups are saved (the snapshot tree)
- retain (formerly interval): how many snapshots of each level are kept
- backup: which files are backed up
Edit configuration file:
nano /etc/rsnapshot.conf
configure (using your own values):
#################################################
# rsnapshot.conf - rsnapshot configuration file #
#################################################
# #
# PLEASE BE AWARE OF THE FOLLOWING RULE: #
# #
# This file requires tabs between elements #
# #
#################################################
#######################
# CONFIG FILE VERSION #
#######################
config_version 1.2
###########################
# SNAPSHOT ROOT DIRECTORY #
###########################
# All snapshots will be stored under this root directory.
#
snapshot_root /store/rsnapshot/
# If no_create_root is enabled, rsnapshot will not automatically create the
# snapshot_root directory. This is particularly useful if you are backing
# up to removable media, such as a FireWire or USB drive.
#
no_create_root 1
#########################################
# BACKUP INTERVALS #
# Must be unique and in ascending order #
# i.e. hourly, daily, weekly, etc. #
#########################################
#
# Please note that 'interval' directive is
# a deprecated alias for 'retain'
# so, instead of:
#
##interval hourly 6
#interval daily 14
#interval weekly 4
#interval monthly 6
#
# it is better to use:
#
#retain hourly 6
retain daily 14
retain weekly 4
retain monthly 6
############################################
# GLOBAL OPTIONS #
# All are optional, with sensible defaults #
############################################
# Verbose level, 1 through 5.
# 1 Quiet Print fatal errors only
# 2 Default Print errors and warnings only
# 3 Verbose Show equivalent shell commands being executed
# 4 Extra Verbose Show extra verbose information
# 5 Debug mode Everything
#
verbose 4
# Same as "verbose" above, but controls the amount of data sent to the
# logfile, if one is being used. The default is 3.
# If you want the rsync output, you have to set it to 4
#
loglevel 5
# If you enable this, data will be written to the file you specify. The
# amount of data written is controlled by the "loglevel" parameter.
#
logfile /var/log/rsnapshot.log
###############################
### BACKUP POINTS / SCRIPTS ###
###############################
# LOCALHOST
backup /home/ .
backup /etc/ .
backup /data/ .
...
Exclude files
Please read rsync documentation for:
--exclude=PATTERN exclude files matching PATTERN
--exclude-from=FILE read exclude patterns from FILE
You may use a file which contains all your exclude patterns:
############################################
# GLOBAL OPTIONS #
# All are optional, with sensible defaults #
############################################
# The include_file and exclude_file parameters, if enabled, simply get
# passed directly to rsync. Please look up the --include-from and
# --exclude-from options in the rsync man page for more details.
#
exclude_file /path/to/exclude/file
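The exclude file itself is a plain list of rsync patterns, one per line. As a minimal sketch, the commands below write such a file; the file location and the three patterns are hypothetical examples, not values from a real setup.

```shell
#!/usr/bin/env bash
# Hypothetical location; use the path given to exclude_file in rsnapshot.conf.
exclude_file=$(mktemp)

cat > "$exclude_file" <<'EOF'
# lines starting with # are treated as comments by rsync
lost+found/
.Trash-*/
*.tmp
EOF

# Count the active (non-comment) patterns.
pattern_count=$(grep -cv '^#' "$exclude_file")
echo "$pattern_count patterns written to $exclude_file"
```

A trailing slash limits a pattern to directories, and a pattern without a leading slash matches at any depth of the backed-up tree.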
However, you may also define exclude patterns for each backup point. Here is an example:
###############################
### BACKUP POINTS / SCRIPTS ###
###############################
backup /etc/ ./ exclude=mtab
backup /data/ ./ exclude=data/.Trash-1000,exclude=data/lost+found
Remote rsnapshot backups
It is not a good idea to mount remote shares with samba or cifs. Rsnapshot will not work properly (rsync cannot create hard links in this case). However, you may use nfs to mount remote shares.
However, the common case is to use ssh for remote rsnapshot backups, using the root account. You have to use RSA key authentication (in most cases without a passphrase, so that backups can run unattended). In untrusted networks, DO NOT expose the root account; use a dedicated backup user with elevated privileges instead (for example, modify the sudoers file to let this user run the rsync command without a password).
You cannot set the snapshot_root to a remote SSH path. In other words, you cannot push backups to a remote server. You can pull them locally from a remote server.
Uncomment cmd_ssh in rsnapshot.conf
#################################
# EXTERNAL PROGRAM DEPENDENCIES #
#################################
# Uncomment this to enable remote ssh backups over rsync.
#
cmd_ssh /usr/bin/ssh
Here is an example of backup points
###############################
### BACKUP POINTS / SCRIPTS ###
###############################
backup root@192.168.1.51:/etc/ my-remote1/
backup root@192.168.1.51:/data/ my-remote1/ exclude=data/.Trash-1000,exclude=data/lost+found
Check settings
Use:
rsnapshot configtest
If there are no mistakes, the output is:
Syntax OK
Dry run (testing)
rsnapshot -t daily
Run rsnapshot
To run rsnapshot use:
rsnapshot hourly
or
rsnapshot daily
or
rsnapshot weekly
or
rsnapshot monthly
or a combination of the above.
These commands can be called
- via a script (better suited to workstations, which are not always switched on) or
- via a cron job (better suited to servers, because they run continuously)
Script example
The following script runs:
- rsnapshot daily every time it is called
- rsnapshot weekly for the first time once the daily backups are complete (based on the rdaily interval), and then on every 7th run
- rsnapshot monthly for the first time once the daily and weekly backups are complete (based on the rdaily and rweekly intervals), and then on every 30th run
Each time it is executed, it increments a counter stored in the rsnapshot_counter file.
Script code – MIT License
#!/usr/bin/env bash
# parameters (here set to the "retain daily" and "retain weekly" values used in rsnapshot.conf above)
rdaily=14
rweekly=4
rsnapshot_counter="/root/scripts/rsnapshot_counter"
# create the rsnapshot counter file if it does not exist
if [ ! -f "$rsnapshot_counter" ]; then echo "0" > "$rsnapshot_counter"; fi
# increase the counter
rc=$(cat "$rsnapshot_counter")
rc=$((rc + 1))
echo "$rc" > "$rsnapshot_counter"
# check for weekly intervals
tmp1=$((rc - rdaily))
if [ "$tmp1" -ge 0 ]; then week_limit=$((tmp1 % 7)); else week_limit=-1; fi
# check for monthly intervals
# (shell arithmetic is used instead of expr, where an unquoted * is expanded as a file name pattern)
tmp2=$((rweekly - 1))
tmp3=$((tmp2 * 7))
tmp4=$((tmp1 - tmp3))
if [ "$tmp4" -ge 0 ]; then month_limit=$((tmp4 % 30)); else month_limit=-1; fi
# run rsnapshot
rsnapshot daily
if [ "$week_limit" -eq 0 ]; then rsnapshot weekly; fi
if [ "$month_limit" -eq 0 ]; then rsnapshot monthly; fi
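To see on which runs the weekly and monthly levels are triggered, the counter arithmetic can be simulated without calling rsnapshot. This is a sketch under the same rdaily=14 and rweekly=4 values; the function name levels_for_run is hypothetical and only mirrors the conditions of the script.

```shell
#!/usr/bin/env bash
rdaily=14
rweekly=4

# Print which rsnapshot levels the script would run for a given counter value.
levels_for_run() {
  local rc=$1
  local levels="daily"
  local since_daily=$((rc - rdaily))
  local since_weekly=$((since_daily - (rweekly - 1) * 7))
  if [ "$since_daily" -ge 0 ] && [ $((since_daily % 7)) -eq 0 ]; then
    levels="$levels weekly"
  fi
  if [ "$since_weekly" -ge 0 ] && [ $((since_weekly % 30)) -eq 0 ]; then
    levels="$levels monthly"
  fi
  echo "$levels"
}

for rc in 1 14 15 21 35; do
  echo "run $rc: $(levels_for_run "$rc")"
done
```

With these values the first weekly snapshot is taken on run 14 and the first monthly snapshot on run 35, which is 14 daily runs plus three further weeks.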
Cron job example
As root, create a cron job:
nano /etc/cron.d/rsnapshot
you may use
0 */4 * * * root /usr/bin/rsnapshot hourly
30 3 * * * root /usr/bin/rsnapshot daily
0 3 * * 1 root /usr/bin/rsnapshot weekly
30 2 1 * * root /usr/bin/rsnapshot monthly
These settings perform:
- rsnapshot hourly every four hours (starting at 00:00)
- rsnapshot daily every day at 3:30
- rsnapshot weekly every Monday at 03:00
- rsnapshot monthly every 1st of the month at 02:30