The rsbackup Client is the initiator in all actions pertaining to a backup in the rsBackup system. While the main purpose is to place files on the backup server, it is expandable through modules to perform other actions, the most common of which is to act as a collector of data from other, internal machines. In this mode, it can act as a backup concentrator, either copying data from those machines to its local hard drive, or by mounting remote shares in the trees to be backed up.

== Overview ==

# If defined, rsbackup Client executes any initialization scripts set up by the administrator.
# rsbackup Client sends a "prepare" command to the server and waits for it to complete. This allows the server to perform its own initialization.
# rsbackup Client initiates an rsync session with the server, backing up any modified local files. The output is saved in a log file.
# rsbackup Client sends a "cleanup" command to the server and waits for it to complete. This allows the server to perform its own cleanup.
# If defined, rsbackup Client executes any local cleanup script set up by the administrator.
# A report containing a summary of the rsync command and the output of all of the previous steps is sent to the administrative e-mail account. The log file is compressed and included as an attachment.

== Before Installation ==

Prior to installation, you will need to set up the server side portion with the administrator of the target backup device. Send her the following:

* ssh public key for root
* Public IP Address of server
* ClientName value
* machine name

=== ssh public key ===
The ssh key pair for root on your server must not be protected by a passphrase if you want automated backups. You can create the public/private key pair with the following command (as root):
 ssh-keygen -t rsa -b 4096
and accept the defaults for all prompts. Feel free to use a different key type (such as dsa) or a larger bit size, so long as you end up with a public/private key pair. In the above example, you would send /root/.ssh/id_rsa.pub to the administrator.

=== Public IP Address of server ===
If you are behind a firewall, determine the correct IP address. One way of doing this is
 lynx http://www.whatismyip.com/

=== ClientName and machine name ===
The ClientName and machine name must be the same on the server and the client. This is your unique identifier to the server. ClientName should be purely alphanumeric; although the server may accept other characters, it should not contain spaces or Linux shell characters. Machine name can be any valid DNS name, i.e., alphanumerics, dashes and periods.
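The naming rules above can be checked before you send the values to the administrator. The following is only a sketch: the function names are hypothetical and are not part of rsbackup, and the machine name check assumes a name should not begin or end with a dash or period.

```shell
#!/bin/bash
# Hypothetical helpers; rsbackup itself does not ship these functions.

# Succeeds only if the client name is non-empty and purely alphanumeric.
valid_client_name() {
    [[ "$1" =~ ^[A-Za-z0-9]+$ ]]
}

# Succeeds only if the machine name uses letters, digits, dashes and periods,
# and does not begin or end with a dash or period (an assumption).
valid_machine_name() {
    [[ "$1" =~ ^[A-Za-z0-9]([A-Za-z0-9.-]*[A-Za-z0-9])?$ ]]
}

for name in "ClientName" "Client Name"; do
    if valid_client_name "$name"; then
        echo "ok:  '$name'"
    else
        echo "bad: '$name'"
    fi
done
```

The second name in the loop is rejected because it contains a space.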

Get the above information to the administrator. E-mail is fine as none of the information is a security risk (well, minor). Give the administrator time to set up the server.

== Installation ==

Add the following lines to /etc/apt/sources.list
 #
 # Daily Data Repository
 #
 deb     http://debian.dailydata.net/debian_repository /

 apt-get update && apt-get install rsbackup
Several dependencies may be installed, the main ones being rsync, ssh and the rsbackup library. For non-Debian based systems, there is no installer.

== Configuration ==

Configuration for a base system is fairly straightforward. You will want to edit two files: /etc/rsbackup/rsbackup.conf and, possibly, /etc/cron.d/rsbackup.

=== /etc/rsbackup/rsbackup.conf ===
Change the following lines. They are listed in the order found in the configuration file.
Remove the pound sign from the start of this line to uncomment it. This puts your installation in testing mode.
 # $CONTROL{'--dry-run'} = sub { return 1; } ;
The e-mail address that should receive the reports.
 $MAIL_TO = 'backups@example.com';
Following values are provided by the rsbackup server administrator.
 $BACKUP_SERVER = 'root@backup.example.com'; # full server name/rootPath to back up to (or IP)
 $PATH_ON_SERVER = '/home/backups';
These two values are agreed upon by you and the server administrator. They must match the values on the rsbackup server
 $CLIENT_NAME = 'ClientName';
 $MY_NAME = 'server.example.com';         # used to determine target subdir for storage.
List of local directories to back up. Any valid Perl list that resolves to root directories may be used here. Attempting to back up anything in /var may result in bad backups (for example, don't try to back up /var/lib/mysql unless you really know what you are doing).
 @DIRS_TO_BACKUP = ('/etc', '/root', '/home');        # which directories on source are to be backed up
Where you want the backup logs to go. rsbackup Client, by itself, does not clean up backup logs. In this case, they are stored in /root, which is backed up, so the log files are being backed up as well.
 $LOGDIR = '/root/backup_logs';                  # absolute path for log directory. No trailing slash needed.

==== Optional ====
If the server administrator requires it, you should also change one or both of the following two lines under @OPTIONS.
Set the maximum speed of the backup in kB/s. Your administrator may require a cap on the speed.
 '--bwlimit=64',           # limit backup speed to 64KB/s (which is 512kilobits/sec)
If the server uses a non-standard port for ssh (recommended), replace the number 22 below with the 1-5 digit number provided by the administrator
 '--rsh="ssh -p 22"'
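If the administrator gives you the cap as a link speed in kilobits per second, divide by 8 to get the value for --bwlimit. A minimal sketch of that arithmetic (the function name is hypothetical):

```shell
#!/bin/bash
# Convert a speed in kilobits per second to the kilobytes per second
# figure used with --bwlimit (8 bits per byte, integer division).
kbits_to_bwlimit() {
    echo $(( $1 / 8 ))
}

echo "'--bwlimit=$(kbits_to_bwlimit 512)',"
```

For a 512 kilobit cap this prints the same value as the default line shown above.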

=== /etc/cron.d/rsbackup (optional) ===
The defaults are to run rsbackup at 1am, and mail errors to root. Additionally, rsbackup jobs may be killed after a certain period if necessary.

=== /etc/rsync.exclude ===

/etc/rsync.exclude is a list of files which will be ignored by rsync. Common patterns like *.tmp, *.bak and *~ can be safely excluded. See the rsync manual for more about exclusions. The file is included in the rsync command line via --exclude-from=/etc/rsync.exclude

== Testing install ==

First, if you have created a new public key, you must initialize the ssh system. As root, enter the following command, changing the port and server name
 ssh -p 22 root@backup.example.com
You will get a message stating the server is unknown, and asking permission to add that server's key to known_hosts. Agree. You will now get an error message from the server stating that you are not authorized to log in (and the server administrator will get an e-mail advising of a possible cracking attempt).

Now, for the real test. I suggest you edit @DIRS_TO_BACKUP in the configuration file to only test the backup of one small directory, say /etc, for this. To do this, simply change those lines as follows:
 # @DIRS_TO_BACKUP = ('/etc', '/root', '/home');        # which directories on source are to be backed up
 @DIRS_TO_BACKUP = ('/etc');        # used for testing
Note the pound sign at the beginning of the "real" line, which comments that line out during this test. Now execute:
 rsbackup
If all goes well, you will receive nothing on the screen until the backup is complete. When the backup is complete, you will see a compressed log file in your LOGDIR directory. Open that and view the contents. You should see a valid rsbackup log. If you have any errors, repair them, then proceed to the next step.
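One way to view the most recent log without unpacking it by hand is sketched below. It assumes the logs are gzip files with a .gz suffix, which this page does not state explicitly, and the function name is hypothetical.

```shell
#!/bin/bash
# Print the newest compressed log in a directory to standard output.
# Assumption: logs are gzip-compressed and end in .gz.
show_latest_log() {
    local logdir="$1" newest
    # ls -t sorts by modification time, newest first
    newest=$(ls -t "$logdir"/*.gz 2>/dev/null | head -n 1)
    if [ -z "$newest" ]; then
        echo "no compressed logs found in $logdir" >&2
        return 1
    fi
    gzip -dc "$newest"
}

# Example (not run here): show_latest_log /root/backup_logs | less
```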

== Production Mode ==

Change the lines you modified above, and place a comment in front of the --dry-run line to place the machine in a production mode. See the following example:
Comment out the --dry-run for production
 # $CONTROL{'--dry-run'} = sub { return 1; } ;
Uncomment the real @DIRS_TO_BACKUP and either comment the other line or remove it completely
 @DIRS_TO_BACKUP = ('/etc', '/root', '/home');        # which directories on source are to be backed up
 # @DIRS_TO_BACKUP = ('/etc');        # used for testing

You can now do an initial seed backup, or simply wait until the cron job fires off the service. If you want to do an initial seed backup, it is easier to do as a background job. Move to a convenient location as you will have the screen output (ie, errors) sent to a file nohup.out in that directory, and execute the command:
 nohup rsbackup &
After a few seconds, control of your keyboard will return to you and you can move into the log directory. You should see the log file starting after the server has completed any initialization. You can then follow the backup by tailing the log file. Or, simply log out and check the results later (you will receive the backup report in e-mail).

== Expanding Functionality via Modules ==

There are various modules included with rsbackup in /usr/share/doc/rsbackup/examples. You can use these to perform pre and post backup actions such as making hot backups of databases, svn repositories, backing up other machines to your server, or mounting network shares for backup. The scripts in this area may be used for that purpose or you may create your own scripts. In rsbackup.conf, find the lines
 $INITIALIZATION_SCRIPT = '/etc/rsbackup/initialize';     # optional script to execute immediately after configuration file read
 $CLEANUP_SCRIPT = '/etc/rsbackup/cleanup';      # optional script to clean up when backup done
and you will see they point to the fully qualified path/filename of an executable that will be executed before and after the backup (respectively). It is RECOMMENDED, but not required, that you point them to scripts named /etc/rsbackup/initialize and /etc/rsbackup/cleanup (the defaults), then have those scripts call others in /etc/rsbackup/modules. The scripts are executed, and their output is returned to rsbackup and included in the log reports. If a script returns anything other than 0, it is treated as an error. In the case of initialize, an error stops the entire backup procedure except for reporting.
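Because a non-zero return from initialize stops the backup, it helps to be deliberate about the exit status when initialize calls several modules. The following is a sketch of one possible layout, not the script shipped with rsbackup; run_modules is a hypothetical helper that runs every module it is given and fails if any of them failed.

```shell
#!/bin/bash
# Hypothetical layout for /etc/rsbackup/initialize: run each module in turn
# and report failure to the caller through the exit status.
run_modules() {
    local failures=0 module
    for module in "$@"; do
        echo "Running $module at $(date)"
        if ! "$module"; then
            echo "ERROR: $module exited with a non-zero status"
            failures=$((failures + 1))
        fi
    done
    # Return zero only when every module succeeded.
    [ "$failures" -eq 0 ]
}

# In a real initialize script the last lines might be:
#   run_modules /etc/rsbackup/modules/backup_mysql
#   exit $?
```

Messages are written to standard output so that they appear alongside the module output. Whether a failed module should abort the whole backup is a policy decision; a variant could log the failure and still exit 0.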

Again, the following are recommended, but not required. For all of the following examples, it is assumed we are working with initialize and cleanup from the default install.

=== Backing up mysql databases ===

Originally, we wrote our own backupdb script, but found one specific to MySQL that is much better, so for pure MySQL installations, we recommend it. It allows you to keep dumps for as long as you want, and to specifically exclude databases. It does require the MySQL root password to be stored in it in the clear, so this file should be owned by root and have permissions 700.
 cp -av /usr/share/doc/rsbackup/examples/backup_mysql_initialize /etc/rsbackup/modules/backup_mysql
 chown root:root /etc/rsbackup/modules/backup_mysql
 chmod 700 /etc/rsbackup/modules/backup_mysql

Edit this file for your particular implementation. I have modified it slightly by changing the date format used for the backup file names (I prefer YYYY-MM-DD for sorting), but other than that it is pretty stock.

Once you have edited it, do a test run, as root:
 /etc/rsbackup/modules/backup_mysql
You should see backups of all your databases in the $DEST/mysql location. If not, fix whatever problems you have and retest. Once done, add the following line to /etc/rsbackup/initialize
 /etc/rsbackup/modules/backup_mysql

Now, when rsbackup is called, it will first back up the databases, then perform a remote backup.

Similarly, you can use the file backup_svn to back up SVN repositories, and backup_postgres.initialize to back up postgres databases.

NOTE: This is very inefficient since each new daily backup creates a new file for every database. While most small databases can be backed up this way with little impact, a better option is to dump the databases as simple ASCII dumps (i.e., do not compress them) and overwrite the previous backup. Because rsync efficiently copies files that have changed only slightly, this results in smaller transfers, at the cost of disk space and the loss of multiple copies of the backup. One way of doing this is shown by modifying the backup_postgres.initialize module as follows:
Remove the $NOW. from the POSTGRES_BACKUP_NAME and do not do the compression.
 # POSTGRES_BACKUP_NAME=$NOW.posgres.dmp # old way
 # su postgres -c 'pg_dumpall -d -c' | $GZIP -9 > $DBSAVEPATH/$POSTGRES_BACKUP_NAME.gz
 # $CHOWN root:root $DBSAVEPATH/$POSTGRES_BACKUP_NAME.gz
 # $CHMOD 600 $DBSAVEPATH/$POSTGRES_BACKUP_NAME.gz
 # more efficient, less ability to recover from previous backups
 POSTGRES_BACKUP_NAME=posgres.dmp
 su postgres -c 'pg_dumpall -d -c'  > $DBSAVEPATH/$POSTGRES_BACKUP_NAME
 $CHOWN root:root $DBSAVEPATH/$POSTGRES_BACKUP_NAME
 $CHMOD 600 $DBSAVEPATH/$POSTGRES_BACKUP_NAME

=== Backing up other servers ===
==== Creating a local backup on this server ====
Use the getServer script from the examples directory to create a local backup on your server, which is then backed up to the remote server.
 cp -av /usr/share/doc/rsbackup/examples/getServer /etc/rsbackup/modules/getServer
 chown root:root /etc/rsbackup/modules/getServer
 chmod 700 /etc/rsbackup/modules/getServer

This module can be run in two ways: either by mounting a remote network share (samba, nfs, sshfs) on the local tree and copying it or, more efficiently, by using rsync. It can also send a separate e-mail report to an additional user with any results. In both cases, for unattended backups, the root user on this machine must be able to perform the action on the remote one without being prompted.

Also, you can create per-machine exclude files. Our standard practice is to create those in /etc/rsbackup/excludes. The format is an rsync exclude file, containing a list of files and directories which will be ignored (they can also be specifically included). An example would be:

 - RECYCLER/
 - System Volume Information/
 # some common suffixes for backup files
 - *~
 - *.BAK
 - *.bak
 - *.tmp
 - *.TMP
 - /home/versions

In this case, a RECYCLER directory anywhere in the subtree will not be backed up, and neither will a directory named System Volume Information (spaces are permitted here). /home/versions is an absolute path to be excluded. Also, any file ending in ~, .BAK, .bak, .tmp or .TMP will be ignored.

==== Backup by Mounting External Shares ====
You must create the mount definition in /etc/fstab. An example would be:
 //172.16.0.3/private$  /media/private  cifs ro,noauto,credentials=/root/private.credentials  0       0
Note that we have entered the CIFS (Windows) share by IP address, mounting it read-only and noauto. Also, the credentials are taken from the file /root/private.credentials, which is simply a list of the username and password required by the remote Windows server. It is similar to this:
 username=backup
 password=backMeUp
where backup is the login name the remote server accepts and backMeUp is that user's password. However you do it, if root can mount the share simply by issuing
 mount /media/private
with no username or password, it will work. To tell rsbackup to back this machine up BEFORE beginning its remote backup, add the following lines to /etc/rsbackup/initialize
 BACKUP_SCRIPT=/etc/rsbackup/modules/getServer
 # following can be set to --testing=[1234] for various levels of testing
 # empty means normal operations
 TESTING=''
 MAILTO="--mailto='local_backup_admin@example.com'"
 MAILLOG='--maillog'
 echo Beginning internal backup at `date`
 echo Backing up Windows Servers
 $BACKUP_SCRIPT --server=private --connection=smb --parameters=' --exclude-from=/etc/rsbackup/excludes/private.exclude --delete-excluded' $TESTING $MAILTO $MAILLOG

==== Backup via rsync ====

This is much more efficient as it allows rsync to perform compression on the transfers, and allows the remote machine to share the comparison activities. It requires an installation of rsync on the networked machine, and the ability for root to log in via ssh without a password (unattended). Set this up by adding the following lines to the above script:
 echo server.example.local
 $BACKUP_SCRIPT --server=server.example.local --connection=rsync --directories='/etc,/root,/home' --parameters=' --exclude-from=/etc/rsbackup/excludes/server.example.local.exclude' $TESTING $MAILTO $MAIL

=== Mounting remote shares locally ===
If you do not want a local copy of another share, you can simply mount it, prior to performing your backup, in any tree that is backed up. One option is to mount all remote shares under /media, then include /media in the list. Two small scripts, mountDrive and unmountDrive, are included. To use these, set up the remote shares as described above and add the following line(s) to initialize, one for each remote share:
 /etc/rsbackup/modules/mountDrive /media/servername
and modify /etc/rsbackup/rsbackup.conf to include the media tree:
 @DIRS_TO_BACKUP = ('/etc', '/root', '/home', '/media');        # which directories on source are to be backed up
Since you do not want to leave this share mounted after the backup is finished, you should add the following line(s) to /etc/rsbackup/cleanup
 /etc/rsbackup/modules/unmountDrive /media/servername
OR, you could add the following block to simply unmount all mounted shares
 for share in /media/* ; do /etc/rsbackup/modules/unmountDrive "$share" ; done

== Customizations - Real world studies of customizations ==

=== User wants additional safeguards of backup copies on local USB drives ===
In this case, the client wanted to back up not only to the local machine (specifically created for this purpose) and the remote backup server, but also to external USB drives, used in rotation, to make copies the owner could take home. The idea was they would purchase three external USB drives and swap them each day, resulting in one copy at the owner's house, one in her possession, and one attached to the backup concentrator. The client also wanted a separate report of this backup sent to them.

First, I took their three USB drives and formatted each of them, with the same label on each (backup; I'm so imaginative).
 mke2fs -jm 0 -L backup /dev/sdxx
Then, I created a directory in /media named backup
 mkdir /media/backup
and set up an entry in fstab
 LABEL=backup  /media/backup  ext3  defaults 0  0
Finally, I created a bash script that would do what I wanted, and put it into /etc/rsbackup/modules
 #! /bin/bash
 USBLOG=/root/usbbackup.log
 MAILTO=backups@example.com
 if /etc/rsbackup/modules/mountDrive /media/backup
 then
    rm -f $USBLOG.gz
    rsync --verbose --exclude-from=/etc/rsync.exclude --owner --group --checksum --recursive --delete --delete-excluded --stats /home /etc /root /media/backup/ 1>$USBLOG 2>&1
    gzip $USBLOG
    mpack -a -s "Backup Report `date`" $USBLOG.gz $MAILTO
    umount /media/backup
 fi
Finally, I installed mpack on the server and tested the backup.
 /etc/rsbackup/modules/backup_usb
Once I had the bugs out, I added this to /etc/rsbackup/cleanup.

=== User wanted multiple versions on their server ===

The user in this case was using a separate machine as a local backup (I call it a backup concentrator), but did not want those versions sent to the remote server. They only wanted the current backup on their server.

cp -al seemed the best command for this. If you are not familiar with this command, it replicates a directory tree but, instead of copying the files, it creates a hard link to each file in the target, pointing back to the original file's allocation.

Some detail about this if you are not familiar with it. A "file" on a Unix server is simply a "pointer" (a directory entry referring to an inode) to the space on the physical drive occupied by the file's contents. Creating a file is a matter of allocating space on the disk, then pointing the file entry to the allocated space. This is a very simplistic description, and not wholly accurate.

When you create a hard link, you simply "point" the new file to the space allocated to an existing file entry. Thus, you can create a file, 'a', then create a hard link to that file named 'b', and the two are indistinguishable from each other. They are literally the same thing. Edit 'a' and view 'b' and you will see the changes. When you remove a file (also called unlinking), you are removing the information stating that a "file" owns some disk space. So, in the above case, performing 'rm a' would release 'a' from the file tree, but since 'b' still points there, the space is not freed and 'b' still has the contents.

The nice thing about this is the way rsync works. When rsync updates a file, it creates a temporary file, completes the copy of that file from its source, then unlinks the original file and renames the temporary file to the original name. Sounds round-about, but it is actually very good as it allows rsync to always leave the remote copy in a usable state. If 'b' points to 'a', and then 'a' is updated via rsync, 'b' still points to the contents of 'a' before 'a' was updated.
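The behaviour described above can be reproduced without rsync by imitating its "write a temporary file, then rename" update by hand. This sketch works in a scratch directory created with mktemp; the file names a and b match the example in the text, and a.tmp is an arbitrary name for the temporary file.

```shell
#!/bin/bash
# Show that a hard link keeps the old contents when the original name
# is replaced by a freshly written file via rename.
workdir=$(mktemp -d) || exit 1

echo "version 1" > "$workdir/a"
ln "$workdir/a" "$workdir/b"          # 'b' is a second name for the same data

echo "version 2" > "$workdir/a.tmp"   # new contents go to a temporary file
mv "$workdir/a.tmp" "$workdir/a"      # the rename replaces the name 'a' only

a_contents=$(cat "$workdir/a")
b_contents=$(cat "$workdir/b")
echo "a: $a_contents"
echo "b: $b_contents"

rm -rf "$workdir"
```

After the rename, a holds "version 2" while b still holds "version 1". Had a been edited in place instead (for example with echo "version 2" > a), both names would show the new contents.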

Taking advantage of this, I used getServer to bring all remote shares into /home/backups, and set up the appropriate lines in /etc/rsbackup/initialize. I then created a script for /etc/rsbackup/cleanup named makeVersion (see it in /usr/share/doc/rsbackup-client/examples) which creates a hard-linked copy of /home/backups under /home/versions/. makeVersion creates a directory under /home/versions whose name consists solely of the date in YYYY-MM-DD format. That allows me to rapidly identify "old" copies which can then be removed.
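The core of such a versioning step can be sketched in a few lines. This is an illustration written for this page, not the makeVersion script from the examples directory, which may work differently; make_version and its two arguments are hypothetical. It relies on GNU cp and requires both directories to be on the same filesystem, since hard links cannot cross filesystems.

```shell
#!/bin/bash
# Hypothetical stand-in for makeVersion; the shipped script may differ.
# Usage: make_version SOURCE_DIR VERSIONS_DIR
make_version() {
    local source_dir="$1" versions_dir="$2" target
    target="$versions_dir/$(date +%Y-%m-%d)"
    # Refuse to run twice on the same day; cp would otherwise nest the
    # copy inside the existing directory.
    if [ -e "$target" ]; then
        echo "version $target already exists, skipping" >&2
        return 1
    fi
    mkdir -p "$versions_dir" || return 1
    # -a preserves attributes, -l creates hard links instead of copying data
    cp -al "$source_dir" "$target"
}

# Example (not run here): make_version /home/backups /home/versions
```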

The result is a state-aware directory tree for each given day (this happens daily). In any "date tree", if a file was deleted or modified, it still exists in the previous date tree in its original form. A file that was added obviously does not exist in earlier trees, and a file that was not modified is there, but only at the expense of a file entry, not a copy of the original file taking up additional space.

The nice thing about this is the efficiency. When a replica has been created, the replica occupies (on their system) around 0.2% of the original space. As items are modified, new space is allocated, so this increases. Removing older trees on a regular basis can, however, keep this to a manageable number.
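Because the version directories are named YYYY-MM-DD, finding the old ones only needs a string comparison. The sketch below lists candidates without deleting anything; list_old_versions is a hypothetical helper and the cutoff date is whatever retention period you choose.

```shell
#!/bin/bash
# Print version directories whose YYYY-MM-DD name sorts before the cutoff.
# Usage: list_old_versions VERSIONS_DIR CUTOFF   (CUTOFF as YYYY-MM-DD)
list_old_versions() {
    local versions_dir="$1" cutoff="$2" dir name
    for dir in "$versions_dir"/*/; do
        [ -d "$dir" ] || continue
        name=$(basename "$dir")
        # Only consider names that look like dates; skip anything else.
        [[ "$name" =~ ^[0-9]{4}-[0-9]{2}-[0-9]{2}$ ]] || continue
        # Dates in this format compare correctly as plain strings.
        if [[ "$name" < "$cutoff" ]]; then
            echo "${dir%/}"
        fi
    done
}

# Example (not run here, uses GNU date):
#   list_old_versions /home/versions "$(date -d '30 days ago' +%Y-%m-%d)"
```

Review the printed list before passing it to rm -rf. Removing a date tree only frees the space of files that no other tree still links to.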

==== CAVEATS ====
cp -al builds an in-memory tree for all files during its copy. This results in a memory usage directly proportional to the number of files/directories used. In a 1 million file tree, the memory required is approximately 750M. If you don't want to allocate this much memory, do the copy in smaller increments. For example, when using the getServer script, each remote share appears in a directory under /home/backups, so doing something like
 for share in /home/backups/* ; do makeVersion "$share" ; done # illustrative only
would mean you only need to allocate memory for the largest share you have backed up.
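To get a feel for how much memory a given tree might need, the figure quoted in this section (approximately 750M for 1 million files) can be scaled linearly. This is only a rough sketch under that assumption; the function names are hypothetical and actual usage will vary with the cp version and the path lengths involved.

```shell
#!/bin/bash
# Scale the "750 MB per million entries" figure linearly (integer MB).
mb_for_entries() {
    echo $(( $1 * 750 / 1000000 ))
}

# Count files and directories under a tree and estimate memory for cp -al.
estimate_cp_memory_mb() {
    local entries
    entries=$(find "$1" | wc -l)
    mb_for_entries "$entries"
}

# Example (not run here):
#   for share in /home/backups/* ; do
#       echo "$share: about $(estimate_cp_memory_mb "$share") MB"
#   done
```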

As version trees are separated from the current tree by more and more edits, the space allocated solely to the version tree grows, and can become significant. Because rsync cannot recognize things like file/directory tree renames, it can grow very rapidly if someone renames a root directory containing many files. In this case, the data IS stored multiple times.