I am trying to compare two binary files in order to identify one of them.
The first file contains the data I am interested in, and I can use it to identify the second. The second file comes from a third party and may contain the same (or very similar) information as the first file.
The two files can be different sizes (e.g. the first file might be 500KB while the second is 4MB). Therefore I have been trying to score how much of the first file is in the second, so that I can say with some certainty that it is related to, or derived from, the same source (e.g. 99% of file1 exists inside file2).
I have tried using cmp -l file1.bin file2.bin | wc -l
but the problem with this is that the areas I am interested in are not aligned.
I have also tried using diff
however it will always say they are different. If I could find the total number of differing bytes, I could subtract this from the file size to see whether the remainder matches my file.
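To make the kind of score I am after more concrete, here is a rough Python sketch. It is only an illustration under my own assumptions, not something I have validated: the function names and the 64-byte block size are made up for this example. It cuts file1 into fixed-size blocks and checks whether each block occurs at any offset in file2, so alignment does not matter.

```python
def containment_score(needle: bytes, haystack: bytes, block_size: int = 64) -> float:
    """Return the fraction of needle's bytes that lie in blocks found in haystack.

    block_size is an assumed tuning knob: smaller blocks tolerate more
    scattered edits but are more likely to match by coincidence.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    if not needle:
        return 0.0
    matched = 0
    # Walk needle in non-overlapping blocks; the final block may be shorter.
    for start in range(0, len(needle), block_size):
        block = needle[start:start + block_size]
        # Substring search, so the block may sit at any offset in haystack.
        if block in haystack:
            matched += len(block)
    return matched / len(needle)


def score_files(path1: str, path2: str, block_size: int = 64) -> float:
    """Read both files fully into memory and score how much of path1 is in path2."""
    with open(path1, "rb") as f1, open(path2, "rb") as f2:
        return containment_score(f1.read(), f2.read(), block_size)
```

Calling score_files("file1.bin", "file2.bin") would give a value between 0.0 and 1.0. A single changed byte in file1 discards the whole block it falls in, and each block is searched for separately, so this is probably slow and coarse for anything much larger than the sizes I mentioned.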
Any help is much appreciated.