[py-tx] CLI error opaque for PDQ match with low hash quality #1256

Open
thedanielsun opened this issue Feb 3, 2023 · 1 comment
Labels
bug code quality Issues that make the code better but do not noticeably change the surface or core algorithms. python-threatexchange Items related to the threatexchange python tool / library

Comments

@thedanielsun
Contributor

Repro using this image (I think it's entirely black):
https://styles.redditmedia.com/t5_17138f/styles/profileBanner_p7ne95txaxfa1.png

threatexchange hash photo https://styles.redditmedia.com/t5_17138f/styles/profileBanner_p7ne95txaxfa1.png
prints nothing because the hash is under the quality threshold. I kind of understand the blank output, but I wasn't sure whether something was broken.

threatexchange match photo https://styles.redditmedia.com/t5_17138f/styles/profileBanner_p7ne95txaxfa1.png
Traceback (most recent call last):
  File "/Users/daniel.sun/.pyenv/versions/3.8.12/bin/threatexchange", line 8, in <module>
    sys.exit(main())
  File "/Users/daniel.sun/.pyenv/versions/3.8.12/lib/python3.8/site-packages/threatexchange/cli/main.py", line 319, in main
    inner_main()
  File "/Users/daniel.sun/.pyenv/versions/3.8.12/lib/python3.8/site-packages/threatexchange/cli/main.py", line 312, in inner_main
    execute_command(settings, namespace)
  File "/Users/daniel.sun/.pyenv/versions/3.8.12/lib/python3.8/site-packages/threatexchange/cli/main.py", line 157, in execute_command
    command.execute(settings)
  File "/Users/daniel.sun/.pyenv/versions/3.8.12/lib/python3.8/site-packages/threatexchange/cli/match_cmd.py", line 202, in execute
    results = _match_file(path, s_type, index)
  File "/Users/daniel.sun/.pyenv/versions/3.8.12/lib/python3.8/site-packages/threatexchange/cli/match_cmd.py", line 226, in _match_file
    return index.query(s_type.hash_from_file(path))
  File "/Users/daniel.sun/.pyenv/versions/3.8.12/lib/python3.8/site-packages/threatexchange/signal_type/pdq/pdq_index.py", line 53, in query
    results = self.index.search_with_distance_in_result(
  File "/Users/daniel.sun/.pyenv/versions/3.8.12/lib/python3.8/site-packages/threatexchange/signal_type/pdq/pdq_faiss_matcher.py", line 268, in search_with_distance_in_result
    return super().search_with_distance_in_result(queries, threshhold)
  File "/Users/daniel.sun/.pyenv/versions/3.8.12/lib/python3.8/site-packages/threatexchange/signal_type/pdq/pdq_faiss_matcher.py", line 127, in search_with_distance_in_result
    limits, similarities, I = self.faiss_index.range_search(qs, threshhold + 1)
  File "/Users/daniel.sun/.pyenv/versions/3.8.12/lib/python3.8/site-packages/faiss/__init__.py", line 603, in replacement_range_search
    assert d * 8 == self.d

threatexchange match throws an opaque FAISS assertion error (the traceback ends at assert d * 8 == self.d), when I think we should probably just check that a hash exists before querying the index (if "" is the notation for a low-quality hash).
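
A minimal sketch of the kind of guard suggested above, assuming the low-quality case really does surface as an empty string. It is not the threatexchange implementation: `is_queryable_pdq_hash`, `safe_query`, and `FakeIndex` are hypothetical names for illustration, and the only thing assumed about the real index is that it exposes a `query()` method, as the traceback shows. The length check relies on a PDQ hash being 256 bits, i.e. 64 hex characters.

```python
import string
from typing import List

# A PDQ hash is 256 bits, which is 64 hexadecimal characters.
PDQ_HEX_LENGTH = 64


def is_queryable_pdq_hash(hash_hex: str) -> bool:
    """Return True only for a well-formed 64-character hex string.

    An empty string (assumed here to be how a below-threshold hash
    is reported) fails the length check and is rejected.
    """
    if len(hash_hex) != PDQ_HEX_LENGTH:
        return False
    return all(ch in string.hexdigits for ch in hash_hex)


def safe_query(index, hash_hex: str) -> List[object]:
    """Query the index only when the hash is usable.

    Hypothetical wrapper: returns no matches instead of handing a
    malformed query to the underlying index.
    """
    if not is_queryable_pdq_hash(hash_hex):
        return []
    return list(index.query(hash_hex))


class FakeIndex:
    """Stand-in for the real index so the sketch runs without FAISS."""

    def __init__(self) -> None:
        self.calls: List[str] = []

    def query(self, hash_hex: str) -> List[str]:
        self.calls.append(hash_hex)
        return ["match"]


index = FakeIndex()
empty_result = safe_query(index, "")        # never reaches the index
valid_result = safe_query(index, "f" * 64)  # forwarded to the index
```

Whether the CLI should silently report no match or print a "hash below quality threshold" message is a separate design choice; the point is only that the check would happen before the query reaches FAISS.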

@Dcallies
Contributor

Dcallies commented Feb 3, 2023

Weird, I thought I added an explicit filter so it would just fail to match, but clearly something is borked. Thanks for the flag!

@Dcallies Dcallies added bug python-threatexchange Items related to the threatexchange python tool / library code quality Issues that make the code better but do not noticeably change the surface or core algorithms. labels Feb 3, 2023
@Dcallies Dcallies changed the title CLI error opaque for PDQ match with low hash quality Dec 8, 2023