
I have two directories, one being a copy of the other. In my case, each directory is on its own external hard drive, but what I want to do should apply to any two should-be-identical directories. Now I would like to synchronize those two directories, with the following features:

  1. Ideally it would be a two-way synchronization, not just one directory being the master and the other the slave. Meaning, it should be possible to say "take this subdirectory from A, but that subdirectory from B". (Does that make sense?)
  2. Before I start the synchronisation, I would like to see what the changes are going to be in either directory.
  3. Let's say I have a directory A. For backup purposes I make a copy of it somewhere else, as directory B (e.g. on another hard drive). What happens a lot is that I have a messy subfolder somewhere, e.g. called "archive", where I dumped a lot of files. The backup of this subfolder in directory B is messy too, of course. One happy day I clean up the files in this "archive" in A, often by putting the files in the right place in the directory tree of A (somewhere other than the "archive" subfolder). Later I would like to synchronize the whole directory tree A with its backup, B. What would happen with a tool like rsync is that the subfolder "archive" in B is deleted and these files are copied again from their correct places in A to their correct places in B. Wouldn't it be sensible if the files were instead just moved within B from "archive" to their correct places, as I have done manually in A before? It would be great to see those moves before the synchronisation.

I am using Linux Kubuntu, both directories are on ext4 partitions.

The question "Synchronize two directories on linux pc" is similar, but not quite the same.

My third requirement is the most important one and probably the most difficult to satisfy. If you can find a solution for that one, I'd be very happy :-).

Please consider that my directory tree is rather large, both in size (~ 4 TB) and in number of files (somewhere between 100 million and 1 billion files). So if I used something like git... that might not work, I guess.

2 Answers


You could try FreeFileSync to synchronize the two directories. It keeps track of your directory structure in a small file, sync.ffs_db, in both the source and the destination directory. And yes, it can detect moved and renamed files and directories and move them within the source and the destination directories, which saves the bandwidth of a full re-synchronization.
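To make the idea of move detection concrete, here is a minimal Python sketch of how such moves could be found in principle. It is not FreeFileSync's actual algorithm, which I have not inspected; the function names (`file_digest`, `index_tree`, `plan_moves`) are made up for illustration. It assumes A is the cleaned-up source and B the backup, matches files by size and SHA-256 content hash, and only reports the proposed moves without touching anything.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def index_tree(root):
    """Map relative path -> (size, digest) for every file under root."""
    index = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root)
            index[rel] = (os.path.getsize(full), file_digest(full))
    return index


def plan_moves(source_root, backup_root):
    """Propose (old_rel, new_rel) moves inside backup_root; nothing is changed."""
    src = index_tree(source_root)
    dst = index_tree(backup_root)

    # Backup files whose path no longer exists in the source, grouped by content.
    orphans = defaultdict(list)
    for rel, key in dst.items():
        if rel not in src:
            orphans[key].append(rel)

    moves = []
    for rel, key in sorted(src.items()):
        # A new path in the source with the same content as an orphan
        # in the backup is treated as a move rather than delete + copy.
        if rel not in dst and orphans[key]:
            moves.append((orphans[key].pop(), rel))
    return moves
```

Calling `plan_moves("/path/to/A", "/path/to/B")` (placeholder paths) returns a list you can print and review before doing anything. Hashing every file of a multi-terabyte tree is slow, so a real tool would presumably keep an index between runs, which is roughly the role a database file like sync.ffs_db plays.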


I suggest using Unison. Unison is meant to synchronize folder structures in both directions, including file deletions. I am not aware of a way to force Unison to only show what would be transferred. Other than that, it works very well for your use case.
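If you only want a preview of the differences, independent of any sync tool, a small script can list them. The following is a rough sketch using Python's standard `filecmp` module; it has nothing to do with how Unison works internally, and the function name `preview` is just a placeholder. Note that `filecmp.dircmp` compares files shallowly by default (type, size and modification time first), and walking hundreds of millions of files this way will take a long time.

```python
import filecmp
import os


def preview(dir_a, dir_b, rel=""):
    """Return sorted (status, relative_path) pairs describing how dir_a and dir_b differ."""
    cmp = filecmp.dircmp(dir_a, dir_b)
    changes = []
    changes += [("only in A", os.path.join(rel, n)) for n in cmp.left_only]
    changes += [("only in B", os.path.join(rel, n)) for n in cmp.right_only]
    changes += [("differs", os.path.join(rel, n)) for n in cmp.diff_files]
    # Recurse into subdirectories that exist on both sides.
    for name in cmp.common_dirs:
        changes += preview(
            os.path.join(dir_a, name),
            os.path.join(dir_b, name),
            os.path.join(rel, name),
        )
    return sorted(changes)
```

Printing the result of `preview("/path/to/A", "/path/to/B")` (placeholder paths) gives one line per entry that exists on only one side or differs between the two, without modifying either directory.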

Edit: According to https://en.wikipedia.org/wiki/Comparison_of_file_synchronization_software, Unison is able to recognize renames or moves.

  • Unison seems to satisfy my first requirement, and you wrote that you are not sure about the second. Two sets of questions: a) How does Unison do its job? Is it building some sort of complicated index? Is it messing with my original files? Consider that my directory trees are very large, both in bytes and in number of files. b) What about my third requirement? It was my most important one.
    – Make42
    Commented Mar 12, 2016 at 19:08
