I'll start by saying what we have in place currently:

  • On-site file server (Mac OS X Server) used by GFX designers, holding about 1TB of working data.
  • Offsite server with 2TB of available storage (CentOS 6).
  • The Mac OS X server rsyncs data to the offsite server every 6 hours (rsync -avz --delete --progress -e ssh ...)
  • The Mac OS X server does a full data backup to LTO4 tape on a 10-day recycle (Mon-Fri for 2 weeks)
  • rsync pushes about 60GB of file changes a day.

The problem:

  • The onsite tape backup is failing because 1TB of graphics files doesn't compress well enough to fit onto an 800GB LTO4 tape.
  • A full backup is incredibly slow.
  • It's a pain in the backside getting people to remember to change the tape, and it often gets forgotten.
  • etc

The quick solution:

  • Buy LTO5 Drive and tapes. However this has been turned down because of the cost...

What I would like:

  • Something that works the same way rsync does: only changed data is sent over the wire, it can be scheduled to run multiple times a day, and the data is compressed and sent over SSH.
  • Something that keeps a 14-day retention but doesn't keep duplicate data.
  • So as an example, if I have 1TB of working data and 60GB of changes are made each day, then I expect around 1.84TB of data (1TB + 14 × 60GB) to be stored on the offsite server.
  • To work with the Mac OS X server and CentOS 6 server.
  • Not cost an arm and a leg. Must be a cheaper solution than buying an LTO5 drive with tapes (around £1500).
  • Be able to be set up to run autonomously.
  • Have some sort of control panel that will allow an admin to easily restore a file/folder.
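
The storage estimate in the list above can be checked with a few lines of Python. This is only a rough sketch: the function name is made up for illustration, it assumes 1TB = 1000GB, and it assumes each day's 60GB of changes is kept as new data for the whole 14-day window with nothing shared between days.

```python
def estimate_offsite_storage_gb(working_gb: int, daily_change_gb: int,
                                retention_days: int) -> int:
    """One full copy plus one day's worth of changes per retained day."""
    return working_gb + daily_change_gb * retention_days


if __name__ == "__main__":
    total_gb = estimate_offsite_storage_gb(1000, 60, 14)
    print(f"{total_gb} GB (~{total_gb / 1000:.2f} TB)")  # 1840 GB (~1.84 TB)
```

On those assumptions the result fits on the 2TB offsite server, with roughly 160GB of headroom.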

Any recommendations?

  • Maybe you can configure sparkleshare.org to do exactly what you want. But I don't know it well enough. Commented Jan 5, 2012 at 13:00
  • I've had to deal with convincing management to purchase a new backup system. The key is to not focus on the cost of the new system. Instead, focus on efficiency and reliability. These are the critical aspects when choosing a data restoration system. The more complicated the process, the more likely you are to end up with bad backups to restore from.
    – Justin
    Commented Jan 5, 2012 at 21:34

5 Answers


Try rsnapshot (rsnapshot.org). It does exactly what you're after: it's in the RPMForge yum repos (so is packaged for CentOS), operates over rsync via SSH, and keeps a configurable number of incremental backups.

edit: You could implement a recovery front end by exporting the rsnapshot datastore over NFS or Samba (or a webserver/anything else) and let users pick up old copies of their files themselves.
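
To show why a tool like this can keep many incremental backups without storing duplicate data, here is a minimal Python sketch of the hard-link snapshot idea that rsnapshot is generally described as using. It is not rsnapshot's own code: `take_snapshot`, `_unchanged` and the directory names are hypothetical, "unchanged" is assumed to mean same size and same modification time, and symlinks, permissions and resource forks are ignored.

```python
import os
import shutil
from pathlib import Path
from typing import Optional


def _unchanged(current: Path, older: Path) -> bool:
    """Treat a file as unchanged if size and whole-second mtime both match."""
    a, b = current.stat(), older.stat()
    return a.st_size == b.st_size and int(a.st_mtime) == int(b.st_mtime)


def take_snapshot(source: Path, snapshot: Path,
                  previous: Optional[Path] = None) -> None:
    """Copy `source` into `snapshot`, hard-linking files unchanged since `previous`."""
    snapshot.mkdir(parents=True, exist_ok=True)
    for src in source.rglob("*"):
        rel = src.relative_to(source)
        dest = snapshot / rel
        if src.is_dir():
            dest.mkdir(parents=True, exist_ok=True)
            continue
        dest.parent.mkdir(parents=True, exist_ok=True)
        old = previous / rel if previous is not None else None
        if old is not None and old.is_file() and _unchanged(src, old):
            os.link(old, dest)       # same data blocks, no extra space used
        else:
            shutil.copy2(src, dest)  # new or modified file: store a real copy
```

Calling `take_snapshot(src, root / "daily.0", previous=root / "daily.1")` gives a directory that looks like a full backup, yet only files that changed since `daily.1` take up new space. Because every snapshot is a plain directory tree, it can be exported over NFS or Samba as described in the edit above. Hard links only work when both snapshots live on the same filesystem.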

  • I like this. The only thing it's missing is a GUI for easy restoration. If nothing else comes up then I'll definitely test this out.
    – Scott
    Commented Jan 5, 2012 at 13:17

I use rsnapshot to handle approximately the same data volumes you're talking about and it works quite nicely. As has already been pointed out it doesn't have a fancy front-end but it does a great job at snapshot retention and minimizing file storage space.

For GUI-based tools, consider looking into CrashPlan at http://www.crashplan.com. There are several cost levels (Home, Pro, etc.), one of which may suit your needs. I believe it's Java-based but IIRC it came with its own JRE. I use it for home backups, and I briefly tested the Pro (server-based) version but other things distracted me from giving it a full evaluation. But it looked promising.

One thing to watch out for regardless of which solution you use is handling (or ignoring) of resource fork data. Your OSX server deals with resource forks transparently, but you may lose the resource forks if you are using applications and/or filesystems that are not aware of them and thus would discard them. Maybe it doesn't matter in your environment, but it's worth pointing out that the data can get dropped pretty easily and without warning.

  • Nice catch on the resource fork thing! I never realised this and just checked my backups to find any files using resource fork data are 0kb files. Luckily they are only font files so not end of the world. Looking at crash plan now.
    – Scott
    Commented Jan 5, 2012 at 18:06
  • whoopis.com/howtos/rsync-hfs-howto nice article getting rsync to work with resource forks. Maybe there is a rsnapshot version too
    – Scott
    Commented Jan 5, 2012 at 18:15
  • @Brady: Before you decide that missing or zero-size resource forks are a non-issue for fonts, you might want to do a test restore and see if the font is still usable. Fonts are notorious for storing data that's more critical than you might expect in the resource forks. I've had many fonts become un-usable after being handled by utilities that were not resource-fork-aware (e.g. scp, rsync, etc.) Good luck!
    – jon
    Commented Jan 6, 2012 at 14:56
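
Following on from the comments above, one quick way to look for files that may have lost their resource forks is to list the zero-byte files in the backup. This is a rough Python sketch with a hypothetical helper name. It assumes affected files show up as 0-byte entries, as reported above, so it only produces candidates: some files are legitimately empty, and a test restore is still the real check.

```python
import os
from pathlib import Path
from typing import Iterator


def zero_byte_files(backup_root: Path) -> Iterator[Path]:
    """Yield regular files under `backup_root` whose size is 0 bytes."""
    for dirpath, _dirnames, filenames in os.walk(backup_root):
        for name in filenames:
            path = Path(dirpath) / name
            # Skip symlinks so we only report real files stored in the backup.
            if path.is_symlink() or not path.is_file():
                continue
            if path.stat().st_size == 0:
                yield path
```

For example, `for p in zero_byte_files(Path("/path/to/backup")): print(p)` prints each candidate, where the path is a placeholder for wherever the backup copy lives.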

Would just buying a bunch of cheapo >=1TB USB hard disks and treating them like tapes, but with the ability to rsync to/from them, do what you need?

  • With regards to this, you could also look into Virtual Tape Library technology.
    – Dan
    Commented Jan 5, 2012 at 13:00
  • Not ideal really. Although I say 1TB, that often reaches 1.2TB until some archiving is done. HDDs are very expensive at the moment and buying ten 1.5TB external HDDs isn't cheap. Plus backing up to HDDs over USB2 is going to be slower than our tape setup. This solution doesn't tick many boxes. I know there are better solutions out there.
    – Scott
    Commented Jan 5, 2012 at 13:05
  • Well, you're going to need the space somewhere no matter what and it doesn't get much cheaper than consumer hard drives per GB
    – Dan
    Commented Jan 5, 2012 at 13:07
  • @Dan maybe so, but to me it's a solution that comes at an additional cost. A cost that's not needed when we have 2TB of offsite backup already paid for.
    – Scott
    Commented Jan 5, 2012 at 13:13
  • Ya canna' change the laws of physics... you either buy more hardware to accommodate the storage and speed requirements, or make do with kludges. If the data is important, you spend the extra money and amortize the cost over the expected lifespan of the equipment. Commented Jan 5, 2012 at 13:27

Another option is Bacula. It's free, works on both CentOS (found in yum) and OS X (build from source), and it's a full client-server backup suite. It does not, however, use SSH for transfers, though I believe encryption methods are available. You can control how long items are kept, when backups run, what is backed up, etc. There is a GUI for it if you install Webmin on the CentOS server and add the Bacula plugins. The GUI lets you restore as many files as you need and gives you a directory tree view to select your files.

It's kind of a pain to set up, though. I was really hoping that I could "learn as I go" when I set it up, but that wasn't the case. I actually had to RTFM (I know, right?)


Assuming you would like to keep a full backup of the data on each backup device, you may want to consider LBackup, which features an extendable pre/post scripting subsystem that allows you to handle backups to disk images and even removable media such as external hard drives / memory sticks.

This page from the LBackup developer section provides a means to detect an attached disk and then mount that disk. By modifying the fstab you may prevent devices from mounting automatically.

Essentially, this setup will allow you to have a number of removable drives which may be rotated and which will automatically be unmounted from the system once the backup finishes. You could even add a post action script to notify you (via email, txt, call, beep, speak etc.) that the backup has completed successfully and that the disk is ready to be taken off site.

Finally, if you are using 3.5" hard disks you could reduce the cost of removable drive enclosures by using something like the RTX110-3Q and protective drive cases from WiebeTech. With that kind of system in place you could just rotate the bare drives which could save you some cash on enclosures (you mentioned you wanted to keep the price down).

Just some thoughts which may be helpful. I initially wrote the script (listed above) for detecting drives connected via USB. This was because I arranged a backup of a friend's system which had no internet connection. The backup was performed to a set of rotated USB flash drives. The data on the USB sticks was encrypted using software encryption in case they were lost in transit.

Using the scripting subsystem you could add additional checks such as file system integrity checks prior to backup, media integrity checks and even checksums for the snapshots.
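
As an illustration of the snapshot checksum idea, here is a minimal Python sketch that builds a SHA-256 manifest for a snapshot directory and reports which files differ between two manifests. It is a standalone example and not part of LBackup or its scripting API: the function names are hypothetical, and how it would be called from a post-action script is left open.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large graphics files are not read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(snapshot_root: Path) -> Dict[str, str]:
    """Map each file's path (relative to the snapshot root) to its SHA-256 digest."""
    return {
        p.relative_to(snapshot_root).as_posix(): sha256_of(p)
        for p in sorted(snapshot_root.rglob("*"))
        if p.is_file()
    }


def changed_files(before: Dict[str, str], after: Dict[str, str]) -> List[str]:
    """List paths that were added, removed, or whose contents differ."""
    return sorted(k for k in before.keys() | after.keys()
                  if before.get(k) != after.get(k))
```

A manifest saved right after the backup finishes could later be compared against a fresh one built from the same drive, so that silent corruption on rotated media shows up before a restore is needed.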

Finally, with regard to encryption, WiebeTech offers drive enclosures which support hardware encryption.

Disclaimer : I am involved with the development of the free as in free LBackup project.

  • If you haven't already, you might want to check out the FAQ, particularly the part about being here for the wrong reasons.
    – user11604
    Commented Sep 6, 2012 at 23:17
