I'd like to compare folders on two different computers.

I'm using BitTorrent Sync to synchronize what is currently about 260 GB of data between two computers: a Windows 7 desktop PC and a Windows 8.1 laptop. All of this data resides within a directory hierarchy with a root folder named "stuff".

I have noticed that there is a significant size and file count difference between what the desktop PC reports and what the laptop reports.

Now I want to use a tool to compare the desktop PC's "stuff" directory with the laptop's "stuff" directory in order to pinpoint missing and changed files and folders. The computers cannot access each other's filesystem via a network, but transferring a scan result of one computer's folder to the other computer via USB flash drive is fine.

2 Answers

3

Even if you cannot compare the files directly, you can compare their hashes.

One tool for this is the free and open-source md5deep:

md5deep is a set of programs to compute MD5, SHA-1, SHA-256, Tiger, or Whirlpool message digests on an arbitrary number of files.

md5deep is able to recursively examine an entire directory tree. That is, compute the MD5 for every file in a directory and for every file in every subdirectory.

md5deep can accept a list of known hashes and compare them to a set of input files. The program can display either those input files that match the list of known hashes or those that do not match.
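
The generate-then-compare idea can be sketched with plain md5sum and find in case md5deep is not at hand. This is only an illustration under assumptions: it uses GNU coreutils/findutils rather than md5deep (on Windows they would come from something like Cygwin or Git Bash), and the function names make_hashlist and check_hashlist are made up for this example.

```shell
# Sketch only: uses md5sum/find (GNU coreutils/findutils), not md5deep.
# make_hashlist and check_hashlist are hypothetical helper names.

# Write one "hash  ./relative/path" line per file under directory $1 to file $2.
make_hashlist() {
    ( cd "$1" && find . -type f -exec md5sum {} + | sort -k 2 ) > "$2"
}

# Re-hash the files under directory $1 against the list in file $2.
# Prints only the entries that differ or cannot be read (e.g. missing files).
check_hashlist() {
    ( cd "$1" && md5sum -c --quiet - 2>/dev/null ) < "$2"
}
```

For example, run make_hashlist stuff hashlist.txt on one computer, carry hashlist.txt over on the flash drive, and run check_hashlist stuff hashlist.txt on the other. Note that this direction only reports files that are changed or missing on the second computer; files that exist only on the second computer are not listed.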

There are many other similar programs. A quick Google search found:

HashMyFiles
Gizmo Hasher
checksum

Or see the long list in the Wikipedia article "Comparison of file verification software".

1
  • md5deep sounds promising. Thank you.
    – Abdull
    Commented Oct 16, 2014 at 7:55
3

@harrymc's hint for using md5deep/hashdeep works well for me. The following provides a way to use hashdeep64 to compare a directory hierarchy between two computers:

# computer A == computer on which a hashlist.txt for all files in someFileHierarchysTopDirectoryOnComputerA is generated
# computer B == computer on which computer A's generated hashlist.txt is used to compare files. Computer B generates a hashcompareresult.txt

# On computer A, create a hashlist.txt for some file hierarchy located in directory someFileHierarchysTopDirectoryOnComputerA. hashlist.txt will be placed in someFileHierarchysTopDirectoryOnComputerA's parent directory.
cd someFileHierarchysTopDirectoryOnComputerA
hashdeep64 -c md5 -r -l -e -vvv * | tee ../hashlist.txt
# this probably will take some time to finish.

# Now copy the generated hashlist.txt onto computer B's "someFileHierarchysTopDirectoryOnComputerB/.." directory. Then on computer B,
cd someFileHierarchysTopDirectoryOnComputerB
hashdeep64 -c md5 -r -l -k ../hashlist.txt -a -e -vvv * | tee ../hashcompareresult.txt
# hashdeep's -w, -W, -x, and -X modes don't seem to report errors on missing and additional files. Therefore using -a mode.
# Above command will have generated a file hashcompareresult.txt in someFileHierarchysTopDirectoryOnComputerB's parent directory.

# Now filter the created hashcompareresult.txt for mismatches:
grep -E ": No match|: Known file not used" ../hashcompareresult.txt
# The resulting output shows files that
# * exist only on computer A, or
# * exist only on computer B, or
# * exist on both computers at the same location but have different MD5 hashes.
# Depending on the use case, above command probably will report some false positive files and directories, e.g. desktop.ini, Thumbs.db, .DS_Store, __MACOSX, .sync, and .SyncArchive .
# It may be adequate to filter out these file system entries, e.g. with
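# grep -E ": No match|: Known file not used" ../hashcompareresult.txt | grep -v -E "desktop\.ini|Thumbs\.db|\.DS_Store|__MACOSX|\.sync|\.SyncArchive"
# (The dots are escaped so that e.g. "\.sync" matches only a literal ".sync" and not any character followed by "sync".)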
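
As a possible alternative to audit mode, a hash list could be generated on both computers with the first hashdeep command above, and the two lists compared offline. The following is a rough sketch, not something hashdeep itself provides: compare_hashlists is a hypothetical helper name, and it assumes that the data lines of a list made with "-c md5 -l" look like "size,md5,relative/path" and that header lines start with % or #.

```shell
# Sketch only. compare_hashlists is a hypothetical helper name.
# Assumes both inputs are hashdeep lists made with "-c md5 -l", i.e. data lines
# of the form "size,md5,relative/path" plus header lines starting with % or #.
# $1 = list from computer A, $2 = list from computer B.
compare_hashlists() {
    awk '
        { sub(/\r$/, "") }                  # tolerate Windows line endings
        /^[%#]/ || /^$/ { next }            # skip header and blank lines
        {
            # Split on the first two commas only, so commas in file names survive.
            c1 = index($0, ",")
            rest = substr($0, c1 + 1)
            c2 = index(rest, ",")
            hash = substr(rest, 1, c2 - 1)
            path = substr(rest, c2 + 1)
        }
        FILENAME == ARGV[1] { a[path] = hash; next }   # remember hashes from A
        {
            if (!(path in a))          print "only on B: " path
            else if (a[path] != hash)  print "changed:   " path
            seen[path] = 1
        }
        END {
            for (p in a) if (!(p in seen)) print "only on A: " p
        }
    ' "$1" "$2" | sort
}
```

Usage would be something like compare_hashlists hashlist_A.txt hashlist_B.txt, where both file names are placeholders. The output groups paths into changed, only on A, and only on B, which maps directly onto the three cases listed above.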
