[pdq] pdq_hasher error for B/W png #1259

Open
thedanielsun opened this issue Feb 6, 2023 · 1 comment
Labels
pdq Items related to the pdq libraries or reference implementations

Comments

Contributor

thedanielsun commented Feb 6, 2023

threatexchange hash photo https://i.redd.it/4shux9eu3mga1.png

  File "/Users/daniel.sun/.pyenv/versions/3.8.12/lib/python3.8/site-packages/threatexchange/signal_type/signal_base.py", line 195, in hash_from_file
    return cls.hash_from_bytes(file.read_bytes())
  File "/Users/daniel.sun/.pyenv/versions/3.8.12/lib/python3.8/site-packages/threatexchange/signal_type/pdq/signal.py", line 77, in hash_from_bytes
    pdq_hash, quality = pdq_from_bytes(bytes_)
  File "/Users/daniel.sun/.pyenv/versions/3.8.12/lib/python3.8/site-packages/threatexchange/signal_type/pdq/pdq_hasher.py", line 33, in pdq_from_bytes
    return _pdq_from_numpy_array(np_array)
  File "/Users/daniel.sun/.pyenv/versions/3.8.12/lib/python3.8/site-packages/threatexchange/signal_type/pdq/pdq_hasher.py", line 37, in _pdq_from_numpy_array
    hash_vector, quality = pdqhash.compute(array)
  File "pdqhash/bindings.pyx", line 67, in pdqhash.bindings.compute
IndexError: index 2 is out of bounds for axis 2 with size 2 

Possible root cause:

I'm just looking at the shape of the ndarray that results for the image:
threatexchange hash photo https://i.redd.it/4shux9eu3mga1.png

_check_dimension_and_expand_if_needed size
(2048, 2004, 2)
_pdq_from_numpy_array
(2048, 2004, 2)
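
A 2-channel array like the one above can be reproduced without the original file. The sketch below assumes the image is grayscale with an alpha channel (Pillow mode "LA"); the report does not state the actual mode, so that part is a guess. It builds a tiny "LA" image in memory and shows that converting it to a NumPy array gives a last axis of size 2, while converting to "RGB" first gives size 3.

```python
import numpy as np
from PIL import Image

# Assumption: the failing PNG is grayscale + alpha (Pillow mode "LA").
# The issue only shows the resulting array shape, not the image mode.
img = Image.new("LA", (4, 3), color=(128, 255))  # size is (width, height)

arr = np.asarray(img)
print(img.mode, arr.ndim, arr.shape)  # LA 3 (3, 4, 2)

# Converting to RGB before building the array yields three channels.
rgb = np.asarray(img.convert("RGB"))
print(rgb.shape)  # (3, 4, 3)
```

Note that arr.ndim is already 3 for the "LA" image, so a check that only looks at the number of dimensions would not tell it apart from an RGB image.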

Pillow version: 9.3.0
https://github.com/facebook/ThreatExchange/blob/main/python-threatexchange/threatexchange/signal_type/pdq/pdq_hasher.py#L50-L56

Maybe this is not working as intended?
ndim is 3 here, so the array looks like a color image, but the last axis has only 2 channels. Maybe the B/W conversion is not handling the dimensions properly.
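
One possible direction for a fix is to normalize the channel count before hashing, rather than checking ndim alone. The helper below, to_three_channels, is hypothetical and is not the library's _check_dimension_and_expand_if_needed. It assumes that the hasher reads three channels along the last axis (which the IndexError suggests) and that the last channel of a 2- or 4-channel array is alpha.

```python
import numpy as np


def to_three_channels(array: np.ndarray) -> np.ndarray:
    """Hypothetical helper: coerce an image array to shape (H, W, 3)."""
    if array.ndim == 2:
        # Plain grayscale: add a channel axis so the checks below apply.
        array = array[:, :, np.newaxis]
    if array.ndim != 3:
        raise ValueError(f"unsupported image shape {array.shape}")

    channels = array.shape[2]
    if channels in (2, 4):
        # Assumption: the last channel is alpha, so it is simply dropped.
        array = array[:, :, :-1]
        channels -= 1
    if channels == 1:
        # Repeat the single luminance channel into R, G and B.
        array = np.repeat(array, 3, axis=2)
    elif channels != 3:
        raise ValueError(f"unsupported channel count {channels}")
    return array


# Same shape as the array reported in this issue.
la = np.zeros((2048, 2004, 2), dtype=np.uint8)
print(to_three_channels(la).shape)  # (2048, 2004, 3)
```

Dropping alpha outright is the simplest choice for a sketch; compositing onto a background color would be an alternative if transparent regions matter for the hash.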

Contributor

Dcallies commented Feb 7, 2023

Thanks for the report Daniel! The logging output and links really help as well.

@Dcallies Dcallies added the pdq label Mar 14, 2023
@Dcallies Dcallies changed the title from "pdq_hasher error for B/W png" to "[pdq] pdq_hasher error for B/W png" Dec 8, 2023
2 participants