I'm trying to figure out a way to use rsync (one or more times) and possibly other commands (such as cp -lr) to accomplish the following:

  1. Synchronize remote folder A to local folder B
  2. I already have a local folder C which is a previous synchronization of A
  3. I want files that are unchanged between C and A to be created in B as a hard-link
  4. I want new files in A to be transferred to B
  5. I want files that have been deleted in A either not to be hard-linked in B at all, or to be hard-linked and later deleted.
  6. I want files that have been modified in A (by appending data) to be copied locally from C to B, with only the appended bytes transferred and appended to the new copy.

A few constraints that I know to be true and that may help in finding a solution:

  • There are 2 kinds of files in A:
    1. Immutable ones, which are either created new, or deleted.
    2. Mutable ones, which are always modified by appending data, and can also be deleted.
  • These two kinds of files can be easily distinguished since each group has a fixed prefix so any commands can target either group or both.

My current solution is to use

rsync -av --link-dest C remote:A B

But this has the drawback that appended files are transferred in full, which increases the transfer volume more than tenfold.

Any improvements over this solution are welcome, and even better if all transfers are done with rsync.

NOTE: it is OK to use several rounds of rsync to achieve it, lack of atomicity in that sense is not an issue as long as C is not altered.

1 Answer

Well, I didn't think I was going to be able to accomplish this until I recently discovered a nifty trick you can do with rsync, and since nobody has answered in a while I'll present my solution.

The trick is when you use the following arguments:

rsync --suffix "" --backup-dir "." ...

This causes rsync to back up files before modifying them, but the backups end up in place, so you are actually making a copy of each file before it is modified. This lets you change files that were hard-linked without changing the originals.
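
To see why the copy matters, here is a small local demonstration that does not involve rsync at all. The file name mutable.log and the helper private_copy are made up for illustration. The helper only mimics the effect described above: it replaces a hard-linked name with an independent copy, so a later append no longer reaches the original.

```shell
#!/bin/sh
# Sketch only: private_copy is a hypothetical helper, not part of rsync.
# It replaces a (possibly hard-linked) file with an independent copy,
# so later writes no longer reach the other links.
private_copy() {
    cp -p -- "$1" "$1.tmp.$$" && mv -f -- "$1.tmp.$$" "$1"
}

demo() {
    dir=$(mktemp -d) || return 1
    mkdir "$dir/C" "$dir/B"
    printf 'old data\n' > "$dir/C/mutable.log"
    ln "$dir/C/mutable.log" "$dir/B/mutable.log"   # same inode as in C

    private_copy "$dir/B/mutable.log"              # B now has its own inode
    printf 'appended\n' >> "$dir/B/mutable.log"

    echo "C: $(wc -l < "$dir/C/mutable.log") line(s)"
    echo "B: $(wc -l < "$dir/B/mutable.log") line(s)"
    rm -rf "$dir"
}

demo
```

After the run, the file in C should still hold one line while the one in B holds two. Without the private_copy call, the append would show up in both.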

Then, the sequence to accomplish the desired behavior could be the following:

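# Step 1: locally hard-link the mutable files from C into B.
# Note: a relative --link-dest path is resolved against the destination
# directory, so C stands for an absolute path here.
rsync -ahv --link-dest C --include-from MUTABLE_FILES.filter C/ B

# Step 2: copy locally + append remotely changed files
# (also delete mutable files that disappeared at remote location A).
# The source is A/ rather than A/*: --delete only removes files when rsync
# is given the directory itself, not a shell-expanded list of its contents.
rsync -ahbv --suffix "" --backup-dir "." --append-verify \
      --include-from MUTABLE_FILES.filter --delete A/ B

# Step 3: hard-link locally + transfer immutable files
rsync -ahv --link-dest C --include-from IMMUTABLE_FILES.filter A/ B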

This could probably be solved with the first two steps alone, without filters. In my particular use case, however, I need the mutable files transferred before the immutable ones to guarantee coherence in the final destination, and rsync's default alphabetical ordering does not guarantee that. The reason is that a mutable file may get deleted and replaced by an immutable file. If the immutable file was skipped because it did not exist yet when rsync passed it, and the mutable file then disappears before rsync gets to it, I'm left with neither and I lose data.
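
The filter files themselves are not shown above. As a rough sketch, assuming the mutable files start with a placeholder prefix mut_ and the immutable ones with imm_ (substitute your own prefixes), they could be generated like this. The helper write_filter is hypothetical. Each file keeps directories so rsync can recurse, keeps names with the given prefix, and excludes everything else.

```shell
#!/bin/sh
# Sketch only: mut_ and imm_ are placeholder prefixes.
# write_filter PREFIX FILE writes an rsync include-from file that keeps
# directories and files starting with PREFIX, and excludes the rest.
write_filter() {
    printf '+ */\n+ %s*\n- *\n' "$1" > "$2"
}

write_filter mut_ MUTABLE_FILES.filter
write_filter imm_ IMMUTABLE_FILES.filter
```

Because the rules are evaluated in order and the first match wins, the trailing "- *" only applies to names that did not match the prefix rule.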

  • Sadly, it looks like rsync no longer allows this trick. It fails with "rsync: --suffix cannot be empty when --backup-dir is the same as the dest dir" (rsync version 3.1.1, protocol version 31).
    – mperrin
    Commented Oct 26, 2017 at 8:35
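
One possible workaround on rsync versions that reject the empty suffix is to break the hard links yourself before the append step. The sketch below has not been tried against a real remote. The helper unlink_mutable and the mut_ prefix are assumptions for illustration. It replaces every hard-linked mutable file under a directory with a private copy, after which the append step could presumably run without -b, --suffix and --backup-dir.

```shell
#!/bin/sh
# Sketch only: unlink_mutable and the mut_ prefix are assumptions.
# For every regular file under DIR whose name starts with PREFIX and that
# has more than one hard link, swap in an independent copy.
unlink_mutable() {
    find "$1" -type f -name "$2*" -links +1 -exec sh -c '
        for f do
            cp -p -- "$f" "$f.tmp.$$" && mv -f -- "$f.tmp.$$" "$f"
        done' sh {} +
}

# Usage (A, B and the filter file are the placeholders from the answer):
#   unlink_mutable B mut_
#   rsync -ahv --append-verify --include-from MUTABLE_FILES.filter --delete A/ B
```

This copies every mutable file locally, including ones that did not grow, so it trades some local disk space and I/O for keeping the network transfer limited to the appended bytes.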