5

I often need to synchronize two large directories (with lots of subdirectories and files) between a Windows XP machine and a Unix server. (I currently do it with the nice WinSCP, but have tried a bunch of others.)

My problem is that every time I synchronize, the software checks every file to see if it has been updated. That takes about 1 minute.

I dream of software that would keep track, on both systems, of which directories were updated, and wouldn't visit a directory unless needed.

Since I usually change just a few files, that should cut my sync time from 1 minute to at most 1 second.

Is there any software that does that, free or not?

One solution would be to synchronize both systems to some remote service like Dropbox. There are a number of reasons why I do not want to do that: it slows everything down, it costs money, and I do not need my files in any other place.

Thanks.

3
  • I'd like to hear some solutions for this as well. Right now I've got an rsync setup via cygwin, but over the years the folders have come to contain many millions of tiny JPGs and it takes around 24 hours to complete a sync -- even when only one new JPG has been added -- because of all the time spent verifying the existing JPGs haven't changed.
    – Uninspired
    Commented Feb 19, 2011 at 18:39
  • So your issue is not that it checks every file, but actually how long it takes?
    – Daniel Beck
    Commented Feb 19, 2011 at 18:45
  • Why doesn't it store hashes of the files and then check whether the hashes have changed? That should be faster.
    – Jonathan.
    Commented Feb 19, 2011 at 18:53

1 Answer

0

Is this a one-directional transfer (i.e. backing up new and changed files from the Windows XP side to the Unix server, with nothing making changes from the Unix end)? That might make it much easier to find a solution.

In order to avoid scanning all the existing unchanged files, you'll need something that checks a list of changes. On XP, there's the NTFS Change Journal. Unix/Linux systems have inotify and also journaling filesystems. But finding a single piece of software that does BOTH could be difficult. That's why I'm hoping this is a uni-directional incremental mirror and not a true "sync".

Oh... I should mention that another method is a filesystem (filter) driver. On Linux the FUSE framework makes this fairly simple, but this approach is used less often because it's so much more complicated than processing the journal.
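To make the "list of changes" idea concrete, here is a minimal Python sketch. It is not an existing tool: `ChangeLog` and `plan_sync` are hypothetical names, and the log is just an in-memory stand-in for whatever would really feed it (the NTFS Change Journal on the XP side, an inotify watcher on the Unix side). The point is only that the sync step looks at the recorded paths and never walks the whole tree.

```python
import posixpath


class ChangeLog:
    """Hypothetical stand-in for a per-machine change feed
    (e.g. filled by a journal reader or an inotify watcher)."""

    def __init__(self):
        self._touched = set()

    def record(self, path):
        # Normalize so that "docs/./a.txt" and "docs/a.txt" count once.
        self._touched.add(posixpath.normpath(path))

    def drain(self):
        # Return everything recorded since the last sync and reset the log.
        touched, self._touched = self._touched, set()
        return touched


def plan_sync(local_touched, remote_touched):
    """Decide what to transfer using only the recorded changes."""
    # A path touched on both sides cannot be resolved automatically.
    conflicts = local_touched & remote_touched
    return {
        "push": sorted(local_touched - conflicts),
        "pull": sorted(remote_touched - conflicts),
        "conflicts": sorted(conflicts),
    }


if __name__ == "__main__":
    local, remote = ChangeLog(), ChangeLog()
    local.record("docs/a.txt")
    local.record("shared/notes.txt")
    remote.record("img/b.jpg")
    remote.record("shared/notes.txt")
    print(plan_sync(local.drain(), remote.drain()))
```

The work done at sync time is proportional to the number of changed paths, not to the size of the directories. The hard part, which this sketch leaves out entirely, is filling the log reliably on each platform.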

6
  • It is a true sync. I edit the files both from the unix side (to which I actually connect via a windows folder through a local network) and remotely from windows. Thanks for your input though. Conceptually this is a very simple task of course, and I am always amazed at how difficult it is to implement such things ;-)
    – Manu
    Commented Feb 19, 2011 at 20:41
  • @Emanuele: Like I said, the information is there on both Windows and Unix, it's a matter of finding software which is optimized on both platforms. But now you have some keywords to use when looking for programs/reading feature lists.
    – Ben Voigt
    Commented Feb 19, 2011 at 20:44
  • @Ben: Since I don't manage the unix server, I wonder: Is the information supposed to be on a standard server (as opposed to requiring installing/modifying the filesystem), and is it going to be accessible by me? If so, do you think it would be complicated to write such an app? I'd guess many people would benefit from it. BTW: I appreciate your pointers, if it turns out there isn't anything else, I'll accept your answer.
    – Manu
    Commented Feb 20, 2011 at 0:19
  • @Emanuele: Some of the most common filesystems do journaling by default, but several do not, so you'd have to find out from your provider what the filesystem is. And reading the filesystem journal of a physical partition is probably not possible from an unprivileged account. OTOH inotify should work fine from userspace. Just make sure you always have the observer running in the background: unlike the journal, inotify doesn't tell you what happened when you weren't looking. Alternatively, a filesystem mounted on loopback (instead of a partition) would give you full read access to the journal data.
    – Ben Voigt
    Commented Feb 20, 2011 at 0:31
  • I would love it if you could provide more details about your approach: is there a good place to read about these journaling files, where are they stored, how can I parse them? If it wasn't a true sync, do you know an app? Also, what language would you use to code up such an app?
    – Manu
    Commented Feb 24, 2011 at 14:13
