I have a set of photos that contains duplicates, but however I sort them the duplicates end up scattered, because they differ in resolution and are named irregularly.
I tried gm compare, but I can't figure out which metric to use or which values would indicate a match.
Here's an example with two images that look exactly the same, except that the second one has 2x the resolution (better quality):
gm compare -metric MAE "7920068.jpg" "7920034.jpg"
gm compare -metric MSE "7920068.jpg" "7920034.jpg"
gm compare -metric PAE "7920068.jpg" "7920034.jpg"
gm compare -metric PSNR "7920068.jpg" "7920034.jpg"
gm compare -metric RMSE "7920068.jpg" "7920034.jpg"
Image Difference (MeanAbsoluteError):
Normalized Absolute
============ ==========
Red: 0.1751787015 11480.3
Green: 0.1168407563 7657.2
Blue: 0.0029600541 194.0
Total: 0.0983265040 6443.8
Image Difference (MeanSquaredError):
Normalized Absolute
============ ==========
Red: 0.0910979679 5970.1
Green: 0.0274231091 1797.2
Blue: 0.0000203617 1.3
Total: 0.0395138129 2589.5
Image Difference (PeakAbsoluteError):
Normalized Absolute
============ ==========
Red: 1.0000000000 65535.0
Green: 0.7803921569 51143.0
Blue: 0.0784313725 5140.0
Total: 1.0000000000 65535.0
Image Difference (PeakSignalToNoiseRatio):
PSNR
======
Red: 10.40
Green: 15.62
Blue: 46.91
Total: 14.03
Image Difference (RootMeanSquaredError):
Normalized Absolute
============ ==========
Red: 0.3018243991 19780.1
Green: 0.1655992426 10852.5
Blue: 0.0045123979 295.7
Total: 0.1987808163 13027.1
With GraphicsMagick's identify (gm identify -verbose) I found these values:
        |image a        |image a @2x    |image b
Red:
Minimum:|  0.00 (0.0000)|  0.00 (0.0000)|  0.00 (0.0000)
Maximum:|255.00 (1.0000)|255.00 (1.0000)|255.00 (1.0000)
Mean:   |175.81 (0.6894)|176.00 (0.6902)|117.79 (0.4619)
Std Dev:| 65.59 (0.2572)| 65.73 (0.2577)| 61.55 (0.2414)
Green:
Minimum:|  0.00 (0.0000)|  0.00 (0.0000)|  0.00 (0.0000)
Maximum:|255.00 (1.0000)|255.00 (1.0000)|255.00 (1.0000)
Mean:   |161.58 (0.6336)|162.47 (0.6371)| 99.07 (0.3885)
Std Dev:| 71.14 (0.2790)| 71.26 (0.2794)| 64.94 (0.2547)
Blue:
Minimum:|  0.00 (0.0000)|  0.00 (0.0000)|  0.00 (0.0000)
Maximum:|255.00 (1.0000)|255.00 (1.0000)|255.00 (1.0000)
Mean:   |153.59 (0.6023)|153.27 (0.6010)|104.50 (0.4098)
Std Dev:| 71.65 (0.2810)| 71.67 (0.2811)| 60.09 (0.2357)
It looks like I can use these values to compare images: the two "image a" files have very similar mean and std dev values to each other, while "image b" is clearly different. I just need a good threshold to indicate what might be a match.
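The threshold idea could be sketched in Python roughly like this. It is only a sketch under my own assumptions: the helper names and the 0.01 cutoff are made up for illustration (nothing GraphicsMagick provides), and the numbers are the normalized mean and std dev values from the table above. Similar statistics only suggest a possible duplicate, so anything flagged would still need a closer check.

```python
def max_stat_difference(stats_a, stats_b):
    """Largest absolute difference between two equal-length lists of
    normalized channel statistics (each value in the range 0..1)."""
    return max(abs(a - b) for a, b in zip(stats_a, stats_b))


def looks_like_match(stats_a, stats_b, threshold=0.01):
    """Flag a possible duplicate when every statistic differs by less than
    the threshold. The 0.01 default is an assumed starting point."""
    return max_stat_difference(stats_a, stats_b) < threshold


# Normalized (mean, std dev) for Red, Green, Blue, copied from the table.
image_a = [0.6894, 0.2572, 0.6336, 0.2790, 0.6023, 0.2810]
image_a_2x = [0.6902, 0.2577, 0.6371, 0.2794, 0.6010, 0.2811]
image_b = [0.4619, 0.2414, 0.3885, 0.2547, 0.4098, 0.2357]

print(looks_like_match(image_a, image_a_2x))  # the two "image a" files
print(looks_like_match(image_a, image_b))     # "image a" against "image b"
```

With these values the largest difference between the two "image a" files is about 0.0035 (green mean), while "image a" and "image b" differ by more than 0.2, so any cutoff between those would separate this particular set. Whether 0.01 holds up on a larger collection is something I would have to test.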
I'll use these images as an example (BOSS-1.jpg, BOSS-1-50.jpg and BOSS-8.jpg), and here's their output:
$ gm identify -verbose BOSS-1.jpg
Image: BOSS-1.jpg
Format: JPEG (Joint Photographic Experts Group JFIF format)
Geometry: 591x1049
Class: DirectClass
Type: true color
Depth: 8 bits-per-pixel component
Channel Depths:
Red: 8 bits
Green: 8 bits
Blue: 8 bits
Channel Statistics:
Red:
Minimum: 7.00 (0.0275)
Maximum: 255.00 (1.0000)
Mean: 89.97 (0.3528)
Standard Deviation: 79.68 (0.3125)
Green:
Minimum: 11.00 (0.0431)
Maximum: 255.00 (1.0000)
Mean: 108.55 (0.4257)
Standard Deviation: 70.34 (0.2758)
Blue:
Minimum: 8.00 (0.0314)
Maximum: 255.00 (1.0000)
Mean: 126.50 (0.4961)
Standard Deviation: 68.28 (0.2678)
Resolution: 72x72 pixels
Filesize: 129.6Ki
Interlace: No
Orientation: Unknown
Background Color: white
Border Color: #DFDFDF
Matte Color: #BDBDBD
Page geometry: 591x1049+0+0
Compose: Over
Dispose: Undefined
Iterations: 0
Compression: JPEG
JPEG-Quality: 93
JPEG-Colorspace: 2
JPEG-Colorspace-Name: RGB
JPEG-Sampling-factors: 2x2,1x1,1x1
Signature: 06a764225a290be783b0b3b90c72356f71b0032af8f58e88857c33d6e59b8ccc
Profile-EXIF: 74 bytes
Exif Offset: 26
Color Space: 1
Exif Image Width: 591
Exif Image Length: 1049
Tainted: False
Elapsed Time: 0m:0.011805s
Pixels Per Second: 50.1Mi
$ gm identify -verbose BOSS-1-50.jpg
Image: BOSS-1-50.jpg
Format: JPEG (Joint Photographic Experts Group JFIF format)
Geometry: 296x525
Class: DirectClass
Type: true color
Depth: 8 bits-per-pixel component
Channel Depths:
Red: 8 bits
Green: 8 bits
Blue: 8 bits
Channel Statistics:
Red:
Minimum: 7.00 (0.0275)
Maximum: 255.00 (1.0000)
Mean: 89.34 (0.3504)
Standard Deviation: 78.83 (0.3091)
Green:
Minimum: 12.00 (0.0471)
Maximum: 255.00 (1.0000)
Mean: 107.87 (0.4230)
Standard Deviation: 70.29 (0.2756)
Blue:
Minimum: 14.00 (0.0549)
Maximum: 255.00 (1.0000)
Mean: 125.77 (0.4932)
Standard Deviation: 68.19 (0.2674)
Resolution: 72x72 pixels
Filesize: 44.2Ki
Interlace: No
Orientation: Unknown
Background Color: white
Border Color: #DFDFDF
Matte Color: #BDBDBD
Page geometry: 296x525+0+0
Compose: Over
Dispose: Undefined
Iterations: 0
Compression: JPEG
JPEG-Quality: 93
JPEG-Colorspace: 2
JPEG-Colorspace-Name: RGB
JPEG-Sampling-factors: 2x2,1x1,1x1
Signature: 2c12437d162d8bf92ad49497e2644ca3a5edd9d3c8947d44445a5923565123cc
Profile-EXIF: 74 bytes
Exif Offset: 26
Color Space: 1
Exif Image Width: 296
Exif Image Length: 525
Tainted: False
Elapsed Time: 0m:0.002051s
Pixels Per Second: 72.3Mi
$ gm identify -verbose BOSS-8.jpg
Image: BOSS-8.jpg
Format: JPEG (Joint Photographic Experts Group JFIF format)
Geometry: 584x1050
Class: DirectClass
Type: true color
Depth: 8 bits-per-pixel component
Channel Depths:
Red: 8 bits
Green: 8 bits
Blue: 8 bits
Channel Statistics:
Red:
Minimum: 0.00 (0.0000)
Maximum: 255.00 (1.0000)
Mean: 91.51 (0.3589)
Standard Deviation: 85.21 (0.3341)
Green:
Minimum: 0.00 (0.0000)
Maximum: 255.00 (1.0000)
Mean: 110.18 (0.4321)
Standard Deviation: 83.58 (0.3278)
Blue:
Minimum: 0.00 (0.0000)
Maximum: 255.00 (1.0000)
Mean: 132.97 (0.5214)
Standard Deviation: 87.69 (0.3439)
Resolution: 72x72 pixels
Filesize: 180.5Ki
Interlace: No
Orientation: Unknown
Background Color: white
Border Color: #DFDFDF
Matte Color: #BDBDBD
Page geometry: 584x1050+0+0
Compose: Over
Dispose: Undefined
Iterations: 0
Compression: JPEG
JPEG-Quality: 93
JPEG-Colorspace: 2
JPEG-Colorspace-Name: RGB
JPEG-Sampling-factors: 2x2,1x1,1x1
Signature: 9d12ad4d93d1c8d219d41ef9755984bcb151a8de502c70279aea4b69202c99d1
Profile-EXIF: 74 bytes
Exif Offset: 26
Color Space: 1
Exif Image Width: 584
Exif Image Length: 1050
Tainted: False
Elapsed Time: 0m:0.016498s
Pixels Per Second: 35.4Mi
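To pull these numbers out automatically, the Channel Statistics part of the output could be parsed along these lines. This is a rough sketch: parse_channel_stats is a hypothetical helper of mine, the sample text is an excerpt of the BOSS-1.jpg output above, and it assumes the RGB layout shown here (a grayscale image would list different channels).

```python
import re

# A channel header is a line holding only the channel name and a colon,
# which skips lines such as "Red: 8 bits" under "Channel Depths:".
CHANNEL = re.compile(r"^(Red|Green|Blue):$")
# Captures the normalized value in parentheses, e.g. "Mean: 89.97 (0.3528)".
STAT = re.compile(r"^(Mean|Standard Deviation):\s+[\d.]+\s+\(([\d.]+)\)$")


def parse_channel_stats(verbose_text):
    """Return {channel: {"Mean": x, "Standard Deviation": y}} holding the
    normalized values found in `gm identify -verbose` style text."""
    stats = {}
    channel = None
    for raw_line in verbose_text.splitlines():
        line = raw_line.strip()
        header = CHANNEL.match(line)
        if header:
            channel = header.group(1)
            stats[channel] = {}
            continue
        stat = STAT.match(line)
        if stat and channel is not None:
            stats[channel][stat.group(1)] = float(stat.group(2))
    return stats


# Excerpt of the BOSS-1.jpg output shown above.
SAMPLE = """\
Channel Depths:
Red: 8 bits
Green: 8 bits
Blue: 8 bits
Channel Statistics:
Red:
Minimum: 7.00 (0.0275)
Maximum: 255.00 (1.0000)
Mean: 89.97 (0.3528)
Standard Deviation: 79.68 (0.3125)
Green:
Minimum: 11.00 (0.0431)
Maximum: 255.00 (1.0000)
Mean: 108.55 (0.4257)
Standard Deviation: 70.34 (0.2758)
Blue:
Minimum: 8.00 (0.0314)
Maximum: 255.00 (1.0000)
Mean: 126.50 (0.4961)
Standard Deviation: 68.28 (0.2678)
Resolution: 72x72 pixels
"""

stats = parse_channel_stats(SAMPLE)
print(stats["Red"])
```

For real files the text could presumably come from something like subprocess.run(["gm", "identify", "-verbose", path], capture_output=True, text=True).stdout, assuming gm is on the PATH.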