Consider two TIFF files that are probably identical except for a tag which, according to a Web search, is called “MSPropertySetStorage” / “OLE Property Set Storage”:
$ ls -l f1.tif f2.tif | cut -d ' ' -f 5,11
2211838 f1.tif
2211838 f2.tif
$ tiffcmp f1.tif f2.tif
TIFFReadDirectory: Warning, Unknown field with tag 37680 (0x9330) encountered.
TIFFReadDirectory: Warning, Unknown field with tag 37680 (0x9330) encountered.
Directory 1:
Directory 2:
Directory 3:
Directory 4:
Directory 5:
$ tiffcmp -l f1.tif f2.tif
TIFFReadDirectory: Warning, Unknown field with tag 37680 (0x9330) encountered.
TIFFReadDirectory: Warning, Unknown field with tag 37680 (0x9330) encountered.
Directory 1:
Directory 2:
Directory 3:
Directory 4:
Directory 5:
$ tiffcmp -t f1.tif f2.tif
TIFFReadDirectory: Warning, Unknown field with tag 37680 (0x9330) encountered.
TIFFReadDirectory: Warning, Unknown field with tag 37680 (0x9330) encountered.
Directory 1:
Directory 2:
Directory 3:
Directory 4:
Directory 5:
$ diff -a f1.tif f2.tif | wc -c
180436
$ tiff2pdf -o f1.pdf f1.tif
TIFFReadDirectory: Warning, Unknown field with tag 37680 (0x9330) encountered.
TIFFReadDirectory: Warning, Unknown field with tag 37680 (0x9330) encountered.
TIFFReadDirectory: Warning, Unknown field with tag 37680 (0x9330) encountered.
TIFFReadDirectory: Warning, Unknown field with tag 37680 (0x9330) encountered.
$ tiff2pdf -o f2.pdf f2.tif
TIFFReadDirectory: Warning, Unknown field with tag 37680 (0x9330) encountered.
TIFFReadDirectory: Warning, Unknown field with tag 37680 (0x9330) encountered.
TIFFReadDirectory: Warning, Unknown field with tag 37680 (0x9330) encountered.
TIFFReadDirectory: Warning, Unknown field with tag 37680 (0x9330) encountered.
$ diff -a f1.pdf f2.pdf
11,12c11,12
< /CreationDate (D:20240428152134)
< /ModDate (D:20240428152134)
---
> /CreationDate (D:20240428152145)
> /ModDate (D:20240428152145)
The output of exiftool reveals a bit more, but not much:
$ diff <(exiftool f1.tif) <(exiftool f2.tif)
2c2
< File Name : f1.tif
---
> File Name : f2.tif
5c5
< File Modification Date/Time : …
---
> File Modification Date/Time : …
$ exiftool f1.tif
ExifTool Version Number : 12.57
File Name : f1.tif
Directory : .
File Size : 2.2 MB
File Modification Date/Time : …
File Access Date/Time : 2024:04:29 09:46:03+02:00
File Inode Change Date/Time : 2024:04:29 09:42:41+02:00
File Permissions : -rw-r--r--
File Type : TIFF
File Type Extension : tif
MIME Type : image/tiff
Exif Byte Order : Little-endian (Intel, II)
Image Width : 1700
Image Height : 2338
Bits Per Sample : 8
Compression : LZW
Photometric Interpretation : RGB Palette
Samples Per Pixel : 1
Rows Per Strip : 7
X Resolution : 96
Y Resolution : 96
MS Property Set Storage : (Binary data 3072 bytes, use -b option to extract)
Predictor : None
Color Map : (Binary data 1536 bytes, use -b option to extract)
Subfile Type : Full-resolution image
Strip Offsets : (Binary data 335 bytes, use -b option to extract)
Strip Byte Counts : (Binary data 249 bytes, use -b option to extract)
Page Count : 6
Image Size : 1700x2338
Megapixels : 4.0
$ diff -a <(exiftool -b f1.tif) <(exiftool -b f2.tif) | wc -c
10776
The only meaningful part we see in the output of the command diff -a <(exiftool -b f1.tif) <(exiftool -b f2.tif) is the name of the user who created the files, followed by the grave accent ` (U+0060) and Microsoft Office Document Imaging 1.03.2349.01.
Looking at the six pages of f1.tif in the viewer evince side by side with the six pages of f2.tif, we notice no difference, but the files are (potentially post-processed) scans of typed and handwritten small text, so all bets are off. Further, evince might have ignored the tags as well.
How can we get the contents of the fields with the aforementioned tag 37680 (0x9330) and compare them between the files so that the comparison result makes sense to a human reader?
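One possible approach, sketched here rather than taken from any tool shown above, is to walk the TIFF directories directly and pull out the raw bytes stored under tag 37680 in each of them. The sketch assumes a classic (non-BigTIFF) file, which matches the little-endian "II" header reported by exiftool; the function names read_tag_payloads, hexdump and diff_tag_payloads are made up for illustration.

```python
import difflib
import struct

# Byte sizes of the classic TIFF field types (1 = BYTE ... 13 = IFD).
TYPE_SIZES = {1: 1, 2: 1, 3: 2, 4: 4, 5: 8, 6: 1, 7: 1,
              8: 2, 9: 4, 10: 8, 11: 4, 12: 8, 13: 4}


def read_tag_payloads(data, wanted_tag=37680):
    """Return the raw bytes of wanted_tag for each directory (None if absent)."""
    endian = {b"II": "<", b"MM": ">"}.get(data[:2])
    if endian is None:
        raise ValueError("not a TIFF file")
    magic, ifd_offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("only classic TIFF (magic 42) is handled here")
    payloads, seen = [], set()
    while ifd_offset and ifd_offset not in seen:
        seen.add(ifd_offset)  # guard against a looping directory chain
        (n_entries,) = struct.unpack_from(endian + "H", data, ifd_offset)
        payload = None
        for i in range(n_entries):
            pos = ifd_offset + 2 + 12 * i
            tag, ftype, count = struct.unpack_from(endian + "HHI", data, pos)
            if tag != wanted_tag:
                continue
            # Unknown field types are treated as 1 byte per item (assumption).
            size = TYPE_SIZES.get(ftype, 1) * count
            if size <= 4:
                start = pos + 8  # small values are stored inline in the entry
            else:
                (start,) = struct.unpack_from(endian + "I", data, pos + 8)
            payload = data[start:start + size]
        payloads.append(payload)
        next_pos = ifd_offset + 2 + 12 * n_entries
        (ifd_offset,) = struct.unpack_from(endian + "I", data, next_pos)
    return payloads


def hexdump(blob, width=16):
    """Format bytes as lines of offset, hex values and printable ASCII."""
    lines = []
    for off in range(0, len(blob), width):
        chunk = blob[off:off + width]
        hexpart = " ".join(f"{b:02x}" for b in chunk)
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{off:08x}  {hexpart:<{width * 3}} {text}")
    return lines


def diff_tag_payloads(data_a, data_b, tag=37680):
    """Return a per-directory unified diff of the hexdumps of one tag."""
    report = []
    pairs = zip(read_tag_payloads(data_a, tag), read_tag_payloads(data_b, tag))
    # zip stops at the shorter file if the directory counts differ.
    for index, (a, b) in enumerate(pairs, start=1):
        if a == b:
            continue
        report.append(f"Directory {index}:")
        report.extend(difflib.unified_diff(
            hexdump(a or b""), hexdump(b or b""), "a", "b", lineterm=""))
    return report
```

To try it on the two files, read each one into memory (for example with pathlib.Path("f1.tif").read_bytes()), pass both byte strings to diff_tag_payloads, and print the returned lines. Because the hexdump is line-oriented text, the diff stays readable even though the payload is binary; strings stored as single-byte text should show up in the right-hand column.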
How can we meaningfully tell or visualize what the 180436 bytes of output from diff -a f1.tif f2.tif and the 10776 bytes of output from diff -a <(exiftool -b f1.tif) <(exiftool -b f2.tif) are about? Both commands produce unintelligible garbage.
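Part of the reason the diff output is hard to read is that diff -a compares lines, so in binary data every 0x0A byte acts as a line break and a single changed byte drags its whole "line" into the output. Since both files have the same size (2211838 bytes), a position-by-position comparison is an alternative. The following is a minimal sketch of that idea; the helper name differing_runs is hypothetical.

```python
def differing_runs(a, b):
    """Return half-open (start, end) offset ranges where a and b differ.

    Bytes past the end of the shorter input are reported as one final range.
    """
    runs = []
    common = min(len(a), len(b))
    start = None
    for i in range(common):
        if a[i] != b[i]:
            if start is None:
                start = i  # a new run of differing bytes begins here
        elif start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, common))
    if len(a) != len(b):
        runs.append((common, max(len(a), len(b))))
    return runs


def summarize(runs):
    """Return one text line per run: offsets in hex and the run length."""
    return [f"0x{s:08x}-0x{e:08x}  {e - s} bytes" for s, e in runs]
```

Applied to the contents of f1.tif and f2.tif, the resulting ranges show how many bytes actually differ and where they sit in the file. If they cluster in a few places, those offsets can be checked against the offsets that the TIFF directories record for their fields to see which field the differences belong to.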
exiftool and compare it. It seems 37680 is a custom, application-defined tag that is not described in the standard. loc.gov/preservation/digital/formats/content/…
exiftool and their comparison.