There are two kinds of data in computing: the sort that's already lost and the stuff that isn't - yet.
You can spend a fortune on a storage medium that's anti-scratch, dust-resistant, heat-proof and contains no moving parts, but it'll all come to naught eventually if you haven't also invested effort in backing your data up.
Although it isn't particularly time consuming, backing up data requires careful thought and preparation, and involves more than just zipping files into a tarball. This means it's often neglected.
Note that an archive isn't a backup and it's important to know the difference between the two. An archive is a primary copy of data that's put away for future use. A backup, on the other hand, is a secondary copy that you call upon to recover your important files and information from data loss disasters.
So no matter what kind of user you are, or how you use your Linux distribution, this article has got something for you. Most of the backup tools discussed here only require a bit of thought and a little time to set up. Best of all, unless you've got terabytes of data, you can safely file it for little or no cost both on and offline.
We'll also discuss ways to organise and store your data more efficiently so that it's easily accessible and simple to back up. You need never lose data again.
A primer to the thought process behind making your data safe
Preparing for a backup involves careful consideration. For starters, where do you store your data? Keeping it on another partition of the same disk isn't advisable - what if the whole disk fails? A copy on another disk is one solution.
To protect your data against physical disasters, such as fires, floods and theft, keep the backup as far away from the original as possible, perhaps on the cloud.
Each method has its advantages: hard disks offer the best price-to-space ratio and are also a convenient and readily available option, Flash drives offer portability, optical media is easily distributed, and online storage is globally accessible.
The kind of data also influences the choice of storage medium. A DVD might be useful for holiday snapshots, but is of limited use to a pro photographer. If you'll be backing up large quantities of data, it's advisable to get multiple, high-capacity hard disks. Or you might want to invest in a NAS (network attached storage) box.
Another option would be to create your own cloud by attaching USB disks to network accessible devices such as the PogoPlug or TonidoPlug. Figure out which of these options best suits your needs.
What to back up?
Depending on the size of your home directory, backing it up completely could be overkill. Here are the essentials:
Your documents and files
~/Documents, ~/Downloads, ~/Desktop
Most modern distros keep the files you've created or downloaded under these directories. Don't forget to check /home for any important documents.
Your email data (Evolution/Thunderbird/Kmail)
~/.evolution, ~/.thunderbird, ~/.kde/share/apps/kmail
Depending on your client, one of these should contain your emails, plus their attachments, your address book and so on.
Other apps' data
Other apps create their own data repositories to store files. Most prompt you for the location, while some create their own. Check under their Preferences to search these out.
Installed software
/var/cache/apt, /var/cache/yum
If there's a piece of software that's crucial to you and you don't want to spend time downloading it again, back it up.
Personal settings
.bashrc, .profile, .gnupg/, .local/, .openoffice/, .mozilla/
These are some of the essential hidden directories that store user settings. Back them up for every user in your installation. Be vigilant, though. Some contain Cache directories, such as Firefox (under ~/.mozilla/firefox/whvmajqx.default/Cache for us), which needlessly add to the backup's size.
System settings
/etc, /var/spool/cron/, /var/spool/mail, /boot
Pay close attention to these directories if you're backing up your entire installation. You'll find system settings in /etc. Although it's got a large number of files, it isn't very bulky. /var is different: it contains cache directories for several apps, which you can leave out, plus /var/spool/mail, which houses the user mail files, and /var/spool/cron, which has the settings for cron. You should back up both of the latter.
If you've made changes elsewhere in the system, consider backing up those files under /usr/ and /usr/local/.
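As a rough illustration of the list above, here's a minimal Python sketch that rolls a set of directories into a compressed tarball while leaving cache directories out. The names in SKIP_NAMES and the helper names are our own assumptions for this example, not something any particular tool prescribes, so adjust them to match what's actually on your disk.

```python
import tarfile
from pathlib import Path

# Directory names treated as disposable caches (an assumption for this sketch).
SKIP_NAMES = {"Cache", ".cache"}


def skip_caches(member):
    """Return None to drop a cache directory and everything beneath it."""
    if SKIP_NAMES.intersection(Path(member.name).parts):
        return None
    return member


def make_archive(sources, archive_path):
    """Write a gzip-compressed tarball of every existing path in sources."""
    with tarfile.open(archive_path, "w:gz") as tar:
        for src in sources:
            src = Path(src).expanduser()
            if src.exists():
                # Store each source under its own name rather than its full path.
                tar.add(src, arcname=src.name, filter=skip_caches)
    return archive_path
```

Calling make_archive(["~/Documents", "~/.mozilla"], "backup.tar.gz") would then produce a single archive, minus any Cache directories. Remember that a tarball sitting on the same disk still isn't a backup until it's been copied somewhere else.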
Data considerations
Now we know what to back up, so let's consider how to go about it. Do you want to back up manually or automatically based on a schedule? The correct frequency varies based on the kind and value of data being safeguarded.
Depending on the size of the files, it might not be a good idea to back them up completely every day either. Many backup tools enable you to do incremental backups - only creating copies of files that have changed since the last backup.
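To make the idea of an incremental backup concrete, here's a small Python sketch that picks out only the files modified since the previous run. It relies purely on modification times, which is a simplifying assumption on our part. Real backup tools may detect changes in more robust ways, and changed_since is a name we've made up for the example.

```python
import os
from pathlib import Path


def changed_since(root, last_backup_time):
    """Yield files under root modified after last_backup_time (epoch seconds)."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            # Only files touched after the previous backup need copying again.
            if path.stat().st_mtime > last_backup_time:
                yield path
```

You'd record the time of each successful run and pass it in the next time around, so that only the files returned need to be copied.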
Will you manipulate the data before safeguarding it? If you're backing up large quantities of data, it's advisable to compress it. If the data's sensitive, you can encrypt it too. Remember that both add to backup overheads.
Finally, to ensure the data's integrity, checksum and validate it regularly.
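One way to approach checksumming is to record a SHA-256 digest for every file when the backup is made, then recompute the digests later and compare. The Python sketch below does just that. The function names and the idea of keeping the digests in a simple dictionary are our own assumptions for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its checksum."""
    root = Path(root)
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify(root, manifest):
    """Return the relative paths that are missing or no longer match."""
    current = build_manifest(root)
    return sorted(
        name for name, digest in manifest.items() if current.get(name) != digest
    )
```

Build the manifest straight after a backup, store it alongside the data, and run verify on a schedule. An empty list means every file still matches its recorded checksum.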
Step-by-step: Crontab entries from a GUI
1. Create your crontab
Despite its simplicity, automating tasks with Cron can be tricky if you're not used to it. Corntab (www.corntab.com) is a browser-based visual front-end that helps you cook up an appropriate crontab entry.
2. Email it
The Corntab interface has sliders and check boxes to help you pick both the time (in minutes, hours, days of the month, months and days of the week) and command that you wish to schedule with Cron.
3. Paste into crontab
When you're done, copy or email the crontab entry, and paste it into crontab from the command line with the crontab -e command. When you save and exit the crontab editor, the new entry will be activated.
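If you'd like to see what such an entry looks like, it's simply five time fields (minute, hour, day of the month, month and day of the week) followed by the command. The short Python sketch below assembles one. The crontab_entry helper and the backup script path are hypothetical and are there purely for illustration.

```python
def crontab_entry(command, minute="*", hour="*", day_of_month="*",
                  month="*", day_of_week="*"):
    """Build a crontab line: five time fields, then the command to run."""
    fields = [minute, hour, day_of_month, month, day_of_week]
    return " ".join(str(field) for field in fields) + " " + command


# Hypothetical script path; run it at 02:30 every Sunday (day of week 0).
entry = crontab_entry("/home/user/bin/backup.sh", minute=30, hour=2, day_of_week=0)
print(entry)  # 30 2 * * 0 /home/user/bin/backup.sh
```

An asterisk in a field means "every", so leaving all five at their defaults would run the command once a minute.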
Protect your data easily with these no-fuss tools for beginners
Déjà Dup
Aren't yet used to the ways of a backup tool? Then Déjà Dup is for you. It has a minimal interface so as to not overwhelm new users, yet it's based on the powerful command line-based Duplicity and integrates nicely with Gnome.
Pulled from the repositories, Déjà Dup installs under Applications > System Tools. Before you use it, you'll need to set its Preferences. Start by pointing it towards the location where you want to house your backups. This can be a local hard disk, a remote location via SSH, or Amazon's S3 web storage.
Then specify the list of directories you want to include in and exclude from the backup. By separating these two, Déjà Dup gives you the flexibility to include a large directory - for instance, /home - in your backup, while specifying parts to leave out, such as .cache/.
By default, Déjà Dup encrypts your backups, but you can ask it not to do so by unchecking the Encrypt Backup Files box. Next to it is a pull-down menu that enables you to schedule regular backups.
When you're done, click the Backup icon to invoke the process. If you've opted to encrypt the data, Déjà Dup now prompts you for a password. It then provides a summary list of the directories involved and begins.
This initial backup may take some time, but subsequent backups are incremental - dealing only with what's changed - and thus much faster.
When restoring backups, Déjà Dup enables you to restore them to their original location or under a specific directory. Since the backup's directory contains encrypted material, you'll be prompted for your password again.
Finally, you're presented with a time-stamped list of backups to restore. That's all there is to it.
Déjà Dup is ideal for backing up files under a user's /home directory, but you might run into authorisation issues with system files. Also Déjà Dup doesn't allow you to create backup sets. So if you wish to back up a different directory, you'll have to modify the Preferences.
Similarly, in order to restore from different locations, you'll have to change the location first under Preferences.
LuckyBackup
While Déjà Dup is suitable for most users, if you want something that's able to handle multiple backup schemes, then use LuckyBackup.
Among its strong points is that it supports multiple profiles, enabling you to manage different backup sets. A default profile is created when you first launch the app and, like all profiles, must have a task attached - either to perform a backup or restore data from one.
Tasks can be one of three types: you can select to back up just the contents of a directory, replicate the entire source directory as is, or you can synchronise the source and destination, which is handy when you need to keep files found under two directories in sync.
When the synchronisation task is executed, LuckyBackup checks for the newest version of each file under both the source and destination directories and copies it to the other. So newly created files in one location are replicated in the other. The only drawback is that if you have deliberately deleted a file or folder in one location but not its counterpart, it will be automatically recreated.
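The newest-wins behaviour, and the recreation side effect that comes with it, can be modelled in a few lines of Python. This is only a sketch of the logic described above, not LuckyBackup's actual implementation. Each directory is represented as a dictionary mapping file names to modification times, and plan_sync is a name we've invented.

```python
def plan_sync(source, destination):
    """Decide which files to copy in each direction.

    source and destination map file names to modification times.
    Returns (copy_to_destination, copy_to_source).
    """
    to_destination, to_source = [], []
    for name in sorted(set(source) | set(destination)):
        src_time = source.get(name)
        dst_time = destination.get(name)
        if dst_time is None or (src_time is not None and src_time > dst_time):
            # Missing from, or older in, the destination.
            to_destination.append(name)
        elif src_time is None or dst_time > src_time:
            # Missing from, or older in, the source.
            to_source.append(name)
    return to_destination, to_source


source = {"a.txt": 10, "b.txt": 5}
destination = {"b.txt": 9, "c.txt": 1}
print(plan_sync(source, destination))  # (['a.txt'], ['b.txt', 'c.txt'])
```

Note what happens to c.txt: if you'd deliberately deleted it from the source, the plan still copies it back, because an absent file is indistinguishable from one that was never there.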
Elsewhere, the Advanced button expands the New Task dialogue to give you fine control over the files to include in, and exclude from, the backup. If you'll be backing up to a remote directory, specify your connection details under the Remote tab.
Power users will appreciate the convenience of the Also Execute tab, which enables you to specify a list of commands to execute before and after the backup.
When you're done creating a backup, click the Validate button to ensure your settings are good to go. With all your tasks for multiple locations set up, it's time to schedule them. Head over to Profile > Schedule, and click Add. Now select the profile to schedule and customise its run time.
Finally, click the CrontIT! button, which automatically creates a Cron job for the backup. To manually run a backup, select the task to execute and click Start. You might also want to check the Simulator box to simulate the backup and ensure it will run properly.
The process of restoring a backup in LuckyBackup is just a backup task with the directories reversed. Also remember to uncheck the Skip Newer Destination Files box under the Command Options tab in the Advanced view.
Finally, execute the restore task as usual and your backed up data will be reinstated in its original place.
Enterprise solutions
BackupPC
If you manage a computer lab or work in an enterprise setting, backing up individual computers using the tools we've covered so far would be a chore. When you have a bunch of machines to take care of, it's best to rely on BackupPC. Be warned, however, that it's not for the faint of heart, despite its web-based interface and extensive documentation.
While it can be used on individual machines, it's best called upon when you want to safeguard data on multiple computers. Not only that, but it will work across Linux, Mac, or Windows, and is well suited for environments that have a mix of different OSes.
It has impressive features too, including pooling. This reduces backup sizes by saving only one copy of identical files that exist on many computers. For example, if you have the same distro running on all computers, BackupPC will only keep one copy of the system files.
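To see why pooling saves space, consider this Python sketch, which stores each distinct file content once, keyed by its SHA-256 digest, and keeps a separate index recording which host and path refer to it. It's a simplified in-memory model under our own assumptions. BackupPC's real pool works on disk and differs in its details, and pool_files is a name made up for the example.

```python
import hashlib


def pool_files(hosts):
    """Deduplicate identical files across hosts.

    hosts maps a host name to {path: content as bytes}.
    Returns (pool, index): pool holds each distinct content once,
    index maps (host, path) to the digest of its content.
    """
    pool = {}
    index = {}
    for host, files in hosts.items():
        for path, content in files.items():
            digest = hashlib.sha256(content).hexdigest()
            pool.setdefault(digest, content)  # store only the first copy
            index[(host, path)] = digest
    return pool, index
```

With two hosts that share an identical system file, the index gains an entry for each host, but the pool holds the content only once.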
Install and configure
You can install BackupPC from your distro's repository, or get the latest version via the tarball.
Before you extract and install it, make sure you have the following Perl modules: Compress::Zlib, Archive::Zip, XML::RSS, Net::FTP and File::RsyncP.
You can install them using CPAN a la: perl -MCPAN -e 'install Compress::Zlib'
With the various libraries in place, you should download the tarball, untar it and then enter the following: perl configure.pl
When you run configure.pl, you'll be prompted for the full paths of various executables and for configuration information such as the BackupPC user, the data directory and so on. By default, the configuration files will be stored in /etc/backuppc.
Once it's set up, you can start the program with /etc/init.d/backuppc start
The basic BackupPC configuration can be edited via the app's web interface, which you'll find by pointing your browser towards localhost/backuppc. Log in with the username and password you specified when configuring BackupPC.
The interface also lets you browse the various hosts as well as initiate backup and restore operations. You can edit basic configuration settings from the Edit Config menu. Use the Add button under the Edit Hosts section to include a client to back up.
In order to set up individual clients, you'll have to manually edit their configuration files, and provide details depending on the method used for backing up (BackupPC supports SMB, TAR, Rsync and FTP).
An /etc example
For example, the following backs up the /etc directory on localhost using TAR: $Conf{XferMethod} = 'tar'; $Conf{TarShareName} = ['/etc']; $Conf{TarClientCmd} = '/usr/bin/env LC_ALL=C $tarPath -c -v -f - -C $shareName';
To begin the backup, head to the web interface, select a host and then click Start Full Backup. The Status page will show you which backups are running. Alternatively, if a full backup already exists, you can run an incremental backup on top of it.
With backup data in place, BackupPC enables you to view and restore individual files, or complete filesystems. You can either download the backed up files as zipped archives, or directly restore them into their original computer.
There's far more to BackupPC than we can touch on here; it's the most comprehensive program in this feature. As such, you'll need to spend time browsing its documentation and adapting it to your network to make full use of it. Our in-depth tutorial in LXF125 may also help if you have access to it.