Is there a way to make Unix diff -r compare only differences in filenames, but not check if any single file actually differs?

I need to compare two large directories with a lot of files in them. I tried using:

diff -r Directory1 Directory2

but unfortunately the process is really slow because of the number of files and their huge size.

So I thought about making the process faster by comparing only the listings of the folders, not the actual content of the files. Is there a way to make diff recursively check only that every subdirectory of Directory1 and Directory2 matches in name and in the files it contains, without checking whether every single file in Directory1 actually matches the corresponding file in Directory2?

For example, let's say I have Directory1/SubDirectory1 and Directory2/SubDirectory1. I only want to check that Directory1/SubDirectory1 and Directory2/SubDirectory1 contain the same number of files with the same filenames (let's say file1, file2, ... fileN). I don't care about comparing file1, file2, ... fileN of Directory1/SubDirectory1 against file1, file2, ... fileN of Directory2/SubDirectory1 to see whether their content is actually the same. Is there a way of doing this?

Edit:

I tried using:

diff <(find path1) <(find path2)

but unfortunately find prints the full path for each file, so the output I get is:

< /Volume1/.../.../Directory1/SubDirectory1/file1
< /Volume1/.../.../Directory1/SubDirectory1/file2
  ...
> /Volume2/.../.../Directory2/SubDirectory1/file1
> /Volume2/.../.../Directory2/SubDirectory1/file2
  ...

Here every single line differs, because the full paths differ. Is there a way to make find print paths relative to the directory given as its argument? For example:

find -(some option I'm not aware of) /Volume1/.../.../Directory1

outputs:

/Directory1/SubDirectory1/file1
/Directory1/SubDirectory1/file2
...
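
For illustration, here is a minimal sketch of the kind of comparison I am after, not a definitive solution. It assumes that changing into each directory first and running find on "." is acceptable, so that both listings start with "./" instead of the full path. The function names list_relative and compare_names are made up for this example. A temporary file is used instead of <( ) so the sketch also runs in shells without process substitution; it assumes mktemp is available.

```shell
# list_relative DIR: print every path under DIR, relative to DIR, sorted.
list_relative() {
    # The subshell keeps the cd from changing the caller's directory.
    # LC_ALL=C makes the sort order identical for both listings.
    (cd "$1" && find . | LC_ALL=C sort)
}

# compare_names DIR1 DIR2: diff the two listings.
# Only names are compared; file contents are never read.
compare_names() {
    cn_tmp=$(mktemp) || return 1
    list_relative "$2" > "$cn_tmp"
    list_relative "$1" | diff - "$cn_tmp"
    cn_status=$?
    rm -f "$cn_tmp"
    return "$cn_status"
}
```

Calling compare_names Directory1 Directory2 would print nothing and return 0 when both trees contain the same names, and otherwise print the names that exist on only one side in the usual diff format.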