17 September 2008

Backing up your data...

After yesterday's "rsync update" post, I decided to do some serious reading on the subject and here are the interesting pages that I found.

First of all, even before considering which method you'll choose, what matters is the reliability of the backed-up data. In other words, how much of your data and its metadata actually survives the process?

I found a terrific post on the subject on the http://blog.plasticsfuture.org/ blog. The author tries pretty much all the free solutions that existed at the time of his writing (Mac OS X 10.4.6) and checks how the data is handled by the various tools. The result is "The State of Backup and Cloning Tools under Mac OS X". It is quite technical, you've been warned. The results are appalling, at least when seen from his perspective.

This post and its follow-up, "Mac Backup Software Harmful", seem to have caused quite a stir in the OSX backup software world, and it seems a number of the tools discussed have been fixed to a degree. For a more relaxing but still related read, check "File Creation Dates on Mac OS X: Clash of the Cultures" from the same author, about the preservation of a file's creation date. The post is very interesting because it shows two totally different approaches to what a "creation date" is supposed to mean, depending on different ideas of what a file is.

In reaction to the two original plasticsfuture posts comes "Introducing Backup Bouncer", where the author introduces a test suite to easily compare the original data to the backup.
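To give an idea of what "comparing the original data to the backup" can mean in practice, here is a minimal Python sketch. It is not how Backup Bouncer works; the function name compare_metadata is made up for this illustration, and it only looks at the basic Unix attributes (size, permissions, modification time, owner). The Mac-specific metadata discussed in the articles above (resource forks, extended attributes, creation dates and so on) is deliberately left out.

```python
import os
import stat


def compare_metadata(original, backup):
    """Return the names of the basic attributes that differ between two files.

    Hypothetical helper for illustration only: it checks size, permission
    bits, modification time (to the second) and owner, nothing Mac-specific.
    """
    a = os.stat(original, follow_symlinks=False)
    b = os.stat(backup, follow_symlinks=False)

    differences = []
    if a.st_size != b.st_size:
        differences.append("size")
    if stat.S_IMODE(a.st_mode) != stat.S_IMODE(b.st_mode):
        differences.append("permissions")
    if int(a.st_mtime) != int(b.st_mtime):
        differences.append("modification time")
    if (a.st_uid, a.st_gid) != (b.st_uid, b.st_gid):
        differences.append("owner")
    return differences
```

An empty list means the copy kept these four attributes. A copy made by a tool that ignores metadata would typically come back with at least "modification time" in the list.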

When you've read all this you should know quite a bit more about the issues at hand.

For a totally different approach, but still in reaction to the plasticsfuture articles, inik.net has a more pragmatic article: "Ensuring trouble-free backups from your Mac to not-a-Mac" followed by "File copying/synchronization software and your metadata (and data!)".

By the way, before you reach this point, you may want to know what you are actually backing up... What is a file in the OSX world and its underlying Unixy universe? For that, you may want to check Google Books and their limited preview of "A Practical Guide to Unix for Mac OS X Users", by Mark G. Sobell and Peter Seebach. Check the Table of Contents, "The Mac OSX File System", and browse down to page 99 (if you have a better reference available on the net, leave a comment).

Now, here are two pages that give a good summary of the situation, in terms that most of us will understand. The first is Take Control of Mac OS X Backups: The Online Appendixes from TidBits, and the second is Mac OS X Backups (can't get much simpler than that...) from "Seth's Unix Tips". You may want to read his take on Unix files too.

The conclusion of all this is that, depending on your needs, you'll have to make a choice. Don't forget that Leopard comes with Time Machine, which creates hourly incremental backups of the data you specify. If you need something different, here is my short list:

GUI application

SuperDuper, to "backup and clone your drives". The application is not free software but comes with a limited version that won't cost you anything.

rsync based command line applications

Configuring Mac OS X for Unattended Backup Using rsync
Easy Automated Snapshot-Style Backups with Linux and Rsync
rsnapshot (based on the previous link)
LBackup (similar to rsnapshot), check the "Alternatives to LBackup" section for more links, as well as its rsync tips page.



Is that all there is to backing up one's data?

There must be a number of rules available from somewhere on the web regarding data backup...

The few I have in mind are:

  • Back up regularly. Time Machine does that every hour and that is a good thing. If you don't have an automated solution, do it manually every single day. But you do have an automated solution on your Mac. It is a command line utility called cron, and if the command line is really too much for you, check cron's GUI wrapper "Cronnix": you'll still need to understand a few things but you won't have to play with Terminal.app.

  • Check the integrity of your data regularly (like once a week), by simply taking a look at it and opening a few files at random to see if they correspond to what you expect them to be, etc.

  • Test the recovery process once in a while. Backed up data is useless if it can't be recovered. This test is to make sure you remember the method and to ensure that the restored data is in a useable state.

  • Do your backup on external media (pretty obvious), and if possible on media that is physically removed from your data source (the computer). Most modern external HDs can be linked to your machine with relatively long cables. You can also use wireless connections to access that disk, etc.

  • If you have extra cash, get a backup computer to make sure that you can use the restored data while your main computer is being fixed. After all, a major computer failure is very likely to be the main reason why you'll need to restore your data. What if you have data to restore but no computer to restore it to?
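
The "open a few files at random" rule above can also be partly automated. Here is a rough Python sketch, under the assumption that the backup is a plain copy of the source folder tree (same relative paths). The names file_digest and spot_check are invented for this example. Keep in mind that a file you edited after the last backup will legitimately show up as different.

```python
import hashlib
import os
import random


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def spot_check(source_root, backup_root, sample_size=10, seed=None):
    """Compare a random sample of source files with their backup copies.

    Returns the relative paths that are missing from the backup or whose
    content differs. Assumes the backup mirrors the source folder layout.
    """
    relative_paths = []
    for folder, _subfolders, names in os.walk(source_root):
        for name in names:
            full_path = os.path.join(folder, name)
            relative_paths.append(os.path.relpath(full_path, source_root))

    rng = random.Random(seed)
    sample = rng.sample(relative_paths, min(sample_size, len(relative_paths)))

    problems = []
    for relative in sorted(sample):
        original = os.path.join(source_root, relative)
        copy = os.path.join(backup_root, relative)
        if not os.path.isfile(copy) or file_digest(original) != file_digest(copy):
            problems.append(relative)
    return problems
```

This only tells you that the bytes match. It does not replace actually opening a document now and then, and it says nothing about whether you would be able to restore the whole thing.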

I think that is pretty much everything I have to say on the subject for today. You may want to take a look at this old post where I discuss what happens when Time Machine saves your application folders and Spotlight indexes all that...

16 September 2008

Software updates !

Free software

NeoOffice has turned 2.2.5

NeoOffice is a free software replacement for Microsoft Office and other similar office suites. It is based on OpenOffice.org.

VirtualBox has turned 2.0.2

VirtualBox is a free software replacement for Parallels and VMWare and other similar virtualization software.

rsync has turned 3.0.4

rsync is a command line utility to back up your files, locally or to a remote system. It does smart incremental backups so that you don't have to copy huge file sets when only a few files have been modified. rsync finds the modified parts and transfers only those parts to the existing backup.
rsync 2.6.9 is installed by default on OSX 10.5.
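
To illustrate the "incremental" idea, here is a very simplified Python sketch. It is not rsync's algorithm: rsync can send only the changed parts of a file, while this sketch works at the level of whole files and simply skips a file when its size and modification time already match the copy in the backup. The function names needs_copy and incremental_copy are made up for the example.

```python
import os
import shutil


def needs_copy(source, destination):
    """True if the destination is missing or differs in size or mtime."""
    if not os.path.exists(destination):
        return True
    s = os.stat(source)
    d = os.stat(destination)
    return s.st_size != d.st_size or int(s.st_mtime) != int(d.st_mtime)


def incremental_copy(source_root, backup_root):
    """Copy only new or changed files; return their relative paths, sorted."""
    copied = []
    for folder, _subfolders, names in os.walk(source_root):
        relative_folder = os.path.relpath(folder, source_root)
        target_folder = os.path.join(backup_root, relative_folder)
        os.makedirs(target_folder, exist_ok=True)
        for name in names:
            source = os.path.join(folder, name)
            destination = os.path.join(target_folder, name)
            if needs_copy(source, destination):
                # copy2 also carries over the modification time, which is
                # what lets the next run recognise the file as unchanged.
                shutil.copy2(source, destination)
                copied.append(os.path.normpath(os.path.join(relative_folder, name)))
    return sorted(copied)
```

Running incremental_copy twice in a row on an unchanged folder should copy everything the first time and nothing the second time. Unlike a real backup tool, this sketch never deletes anything from the backup and ignores Mac-specific metadata.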