What flags do I need to give to the find command to enter the directory before running the -exec command?

I have a directory filled with directories filled with files

> root directory
 -> directory1
  -> file1
  -> file2...
 -> directory2
  -> file1
  -> file2
 -> directory...

and I want to place a checksum file in each directory which contains the sums of the directory contents:

> root directory
 -> directory1
  -> file1
  -> file2...
  -> checksums.md5
 -> directory2
  -> file1
  -> file2
  -> checksums.md5
 -> directory...

I've been fiddling around along the lines of

find . -type f -name '*' -exec md5sum {} >> checksums.md5 \;

but it places the checksum file in the root directory (the starting point), and that file contains the hashes of all the files. I have also tried using -execdir, to no avail.

What I'm trying to do is to MD5 the contents of each folder and place the checksum file inside it, then move on to the next folder and repeat.

The contents of the checksums.md5 file should also preferably be sorted.

3 Answers


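You need to perform the append per file, inside the command that find runs. Otherwise the shell applies the redirection once to the whole find command, in the directory where you started it.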

find ... -execdir bash -c 'md5sum "$1" >> checksum.md5' _ {} \;
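As a self-contained sketch of the per-file append, assuming GNU find, bash, and md5sum are available: the demo tree and the `root` variable are made up for illustration, and the output name `checksums.md5` follows the question. The file name is handed to the inner shell as `$1` rather than spliced into the script text, and the checksum file is excluded so that it is not hashed itself.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical demo tree so the sketch can be run anywhere.
root=$(mktemp -d)
mkdir -p "$root/dir one" "$root/dir2"
printf 'a\n' > "$root/dir one/file 1.txt"
printf 'b\n' > "$root/dir one/file2.txt"
printf 'c\n' > "$root/dir2/file1.txt"

cd "$root"

# -execdir runs the command from the directory that contains each match,
# so the relative path checksums.md5 lands next to the hashed file.
# The matched name arrives in the inner shell as "$1" ("_" fills $0).
# ! -name 'checksums.md5' keeps the checksum file from being hashed.
find . -type f ! -name 'checksums.md5' \
  -execdir bash -c 'md5sum "$1" >> checksums.md5' _ {} \;
```

Because the command appends, running it a second time would duplicate the entries; deleting the old checksums.md5 files first avoids that.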
  • I think this solution is broken when directories or file names have spaces in the name. Not to mention the unclosed quote included.
    – galeaby
    Commented Jul 16, 2018 at 2:21
  • I'm... still getting used to my keyboard >.> Commented Jul 16, 2018 at 2:22
  • Where could I add a sort command so as to have the checksum files in order (not that it matters, just a preference)? Thanks
    – galeaby
    Commented Jul 16, 2018 at 2:25
  • Either modify the scriptlet to sort the existing and new checksums, or postprocess with another find command. Commented Jul 16, 2018 at 2:28
  • Or you could just tell sort to use a different column for the key. Commented Jul 16, 2018 at 2:49

I couldn't find appropriate flags to add to your find command, so I used a loop instead. A better solution than this probably exists, possibly one that uses only find.

Traversing to each directory

In the command you wrote, the checksum file was created in the current directory (the root, in your case). To create a checksum file in each subdirectory, you have to change into that subdirectory, run the command, and then return to the starting directory. Not exactly this, but something like it:

cd directory1; md5sum * >> checksum.md5 ; cd ..;

Using find in a loop

So I took the command you were using and put it in a loop that iterates over all subdirectories of the current directory. Here's what I did:

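for i in */;
  do cd "$i" && find . -type f -name '*' ! -name 'checksum_files.md5' -exec md5sum {} >> checksum_files.md5 \; && cd ..;
  done;

What it does:

  • The loop iterates over every subdirectory of the current directory, with $i holding each one in turn. Globbing with */ instead of parsing the output of ls, and quoting "$i", keeps directory names that contain spaces intact.
  • cd "$i" changes the current directory to that subdirectory.
  • Then comes the find command you already wrote. Because the shell creates checksum_files.md5 before find starts, ! -name 'checksum_files.md5' keeps find from hashing the checksum file itself.
  • cd .. returns to the starting directory (the root in this case).
  • I used && between cd and find so that find only runs if the cd into the subdirectory succeeded.

One-line for copy-paste

for i in */; do cd "$i" && find . -type f -name '*' ! -name 'checksum_files.md5' -exec md5sum {} >> checksum_files.md5 \; && cd ..; done;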

I think there is scope for improvement in this solution, and any suggestions are welcome. Feel free to add more details.
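One possible improvement, sketched here under assumptions rather than offered as a definitive version: running each iteration in a subshell means no cd .. is needed, and piping through sort gives an ordered checksum file in the same pass. The demo tree and the `root` and `dir` variable names are made up for illustration; md5sum and a find that supports -exec ... + are assumed.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical demo tree so the sketch can be run anywhere.
root=$(mktemp -d)
mkdir -p "$root/dir one/nested" "$root/dir2"
printf 'a\n' > "$root/dir one/a.txt"
printf 'b\n' > "$root/dir one/nested/b.txt"
printf 'c\n' > "$root/dir2/c.txt"

cd "$root"

for dir in */; do
  # The subshell confines the cd, so the loop always continues
  # from the starting directory.
  (
    cd "$dir" || exit 1
    # Hash everything below this subdirectory except the checksum file,
    # sort by the file-name column, and write the result in one go.
    find . -type f ! -name 'checksum_files.md5' -exec md5sum {} + |
      sort -k 2 > checksum_files.md5
  )
done
```

As in the loop above, each top-level subdirectory gets one checksum file that also covers any nested directories. If the starting directory has no subdirectories, the pattern */ stays unexpanded and the cd fails.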


Building on Ignacio Vazquez-Abrams's answer, I made the following bash function. It crawls the subdirectories and MD5s their contents, placing a checksum file in each subdirectory, then post-processes the checksum files so that the checksums are ordered by filename:

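function md5dirs () {
  # Hash each file from within its own directory; the name is passed as $1,
  # and existing checksum files are skipped so they are not hashed themselves.
  find . -type f -name '*' ! -name 'checksums.md5' -execdir bash -c 'md5sum "$1" >> checksums.md5' _ {} \;
  # Sort each checksum file in place by the filename column.
  find . -type f -name 'checksums.md5' -execdir sort -k 2 -o checksums.md5 checksums.md5 \;
}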

Changing the first find's -name parameter to a wildcard pattern such as *.jpg makes the find command MD5 only the matching files in each directory. With the default *, it hashes every file in a directory.

Perhaps making the first find's -name parameter a passed value would be better for some people, but most people will be hashing the entire contents of the folder rather than just a subset of files.
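For those who do want a subset, here is a minimal sketch of the passed-value variant, assuming GNU find, bash, and md5sum. The function name `md5dirs_pattern`, the `pattern` variable, and the demo tree are hypothetical; the first argument is used as the -name pattern and falls back to * when omitted.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical variant of md5dirs: $1 is the -name pattern (default: *).
md5dirs_pattern() {
  local pattern=${1:-*}
  # Hash matching files from within their own directory, skipping
  # any checksum file so it is never hashed itself.
  find . -type f -name "$pattern" ! -name 'checksums.md5' \
    -execdir bash -c 'md5sum "$1" >> checksums.md5' _ {} \;
  # Sort each checksum file in place by the file-name column.
  find . -type f -name 'checksums.md5' \
    -execdir sort -k 2 -o checksums.md5 checksums.md5 \;
}

# Hypothetical demo tree so the sketch can be run anywhere.
root=$(mktemp -d)
mkdir -p "$root/photos" "$root/notes"
printf 'b\n' > "$root/photos/b.jpg"
printf 'a\n' > "$root/photos/a.jpg"
printf 'r\n' > "$root/photos/readme.txt"
printf 't\n' > "$root/notes/todo.txt"

cd "$root"
md5dirs_pattern '*.jpg'
```

Quoting the pattern at the call site, as in `md5dirs_pattern '*.jpg'`, keeps the shell from expanding it before find sees it. Directories without a matching file get no checksum file.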
