3

I wrote an rsync incremental backup script for my server that copies a MySQL database backup and a specified folder to a remote server. The code is on GitHub.

Code excerpt from lines 53-57:

############### Create most current hard link

echo "Creating most current hard link on backup server $most_recent_backup_link"
ssh $remote_backup_server rm -rf ${most_recent_backup_link}
ssh $remote_backup_server cp -alv ${remote_backup_folder}/backup-${backup_folder_name}/ ${most_recent_backup_link}

I'm having a problem with creating the most current hard links on the backup server (lines 53-57 in the program). Everything works, and rsync only copies about 1-2MB of data. But the hard link copy process uses about 30MB of disk space. I get a huge laundry list of files that haven't changed, and the only ones that have changed are very small. Normally this wouldn't be a problem, but when you back up every hour, each backup should be as small as possible.

For example, in the last backup I did, rsync transferred 1.3MB, but the backup directory grew by 35MB.

Why are the hard links taking up so much hard drive space?

4
  • How many directories? Each directory takes up a certain amount of space.
    – user3463
    Commented Jan 17, 2012 at 2:33
  • I ran find -mindepth 1 -type d | wc -l and found my backup has 1706 total directories.
    – mr_schlomo
    Commented Jan 17, 2012 at 11:48
  • That could take up around 6MB at 4KB per directory, but certainly not 34MB. I'm not sure where the balance is coming from.
    – user3463
    Commented Jan 18, 2012 at 0:10
  • 1
    Maybe the directories contain lots of files, so more than 4kb is needed for the directory data?
    – Wyzard
    Commented Mar 29, 2012 at 5:40
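
To put numbers on the directory-overhead idea from the comments above, here is a minimal sketch. It assumes GNU find (for -printf); the function names count_dirs and dir_bytes are made up for illustration. Directories cannot be hard-linked, so every cp -al snapshot recreates all of them, and a directory holding many entries can be larger than a single 4KB block.

```shell
#!/bin/sh
# Hypothetical helpers; assumes GNU find for -printf.

# Count the directories below "$1" (same command the OP ran).
count_dirs() {
    find "$1" -mindepth 1 -type d | wc -l
}

# Sum the sizes, in bytes, of the directory entries themselves under "$1".
# This is the space a hard-link copy cannot share, because every
# directory has to be created again for each snapshot.
dir_bytes() {
    find "$1" -type d -printf '%s\n' | awk '{ s += $1 } END { print s + 0 }'
}
```

Running dir_bytes on one snapshot gives a rough lower bound on what each hourly cp -al costs even when no file has changed. The exact figure depends on the filesystem.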

2 Answers

1

Looking at your code (on GitHub), it looks like you are creating one .sql.gz file per backup. Even though there are only 1 or 2MB of changes, the dump is a brand-new file as far as rsync is concerned, so it will de-link the file and create a new one, since the two are now different. The full size of the dump is therefore stored again with every backup.

You will probably want to back up the MySQL directories directly (which will involve stopping MySQL while you do it) to achieve the space savings you want. If you go down that route, you will probably want to look into running a slave server to do the backup from; that way your database stays up all the time, and only the slave server gets stopped while the backup is performed.
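
One way to check where the growth comes from is to ask du for two snapshots in a single invocation. This is a sketch that assumes GNU du, which counts a hard-linked file only once per run, so the figure printed for the second directory is only the space it does not share with the first. The function name unique_kb is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper; assumes GNU du, which counts each hard-linked
# file only once within a single invocation.

# Print the kilobytes snapshot "$2" uses beyond what it shares with "$1".
unique_kb() {
    du -sk -- "$1" "$2" | awk 'NR == 2 { print $1 }'
}
```

Called with the previous and the newest backup folder as arguments, it should report roughly the size of the new .sql.gz plus the recreated directories if the explanation above is right.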

-2

You should look into storeBackup (storeBackup.org). It makes de-duplicated backups using hardlinks and it is very powerful.

It has more features than rsync for making hardlinked backups. For hourly backups you could consider the storeBackup option "lateLinks", which postpones creating the hardlinks. You could then either do one daily backup with all the hardlinks, or link up all the postponed backups at a later time if you choose to keep each hourly backup.

storeBackup also has a feature that lets you decide which backups to keep. For example, you could tell it to keep all hourly backups just for the last 24 hours, to keep a daily backup for the last 30 days, and to keep the first backup of the month beyond that. This way you will not waste so much space.

2
  • This may (or may not) solve the OP's problem, but it doesn't answer the question. Commented Mar 29, 2012 at 15:05
  • That's literally true. His question was "Why are the hard links taking up so much hard drive space?" and I told him how he could do the types of backups he wants to do and have the hard links take up less space. So I think I provided a potential solution, if not an answer to the strictly literal question.
    – MountainX
    Commented Mar 29, 2012 at 22:06
