39

With Debian or Ubuntu packages, there is some quality control. Is pip similar, or is it a complete free-for-all? Can anyone upload any code they want under any name they want?

There seem to be some junk packages. For example, https://pypi.python.org/pypi/opencv/0.0.1 has the same name as a very popular computer vision framework.


2 Answers

43

No, there are no third-party checks on the code that is uploaded to PyPI (the Python Package Index, which is where pip downloads packages from unless explicitly instructed otherwise). The only restriction is that once a package name exists, only its maintainer(s) can upload packages under that name (i.e. you can't submit a malicious upgrade to someone else's package using the same name). It is up to the maintainer to ensure that whatever they make available on PyPI doesn't contain malware (unless they intend it to be malware), and it is up to each individual developer to be aware of what they are downloading with pip.

This has been exploited in a research project investigating "typosquatting". The researcher uploaded some "simulation malware" (mostly harmless) to PyPI under names that were misspelled versions of popular package names, in order to collect data on how often these misspelled packages were installed. If a black-hat hacker had done the same thing, they could have used much more malicious code.
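To make the typosquatting idea concrete, here is a minimal sketch of a check you could run on a package name before installing it. It is not something pip or PyPI does for you: the `KNOWN_PACKAGES` list, the `possible_typosquat` helper, and the 0.8 similarity cutoff are all hypothetical choices for illustration, and the check only uses `difflib` from the standard library.

```python
import difflib

# Hypothetical allowlist for illustration; in practice this would be
# the set of packages you actually intend to depend on.
KNOWN_PACKAGES = ["requests", "numpy", "pandas", "django", "urllib3"]


def possible_typosquat(name, known=KNOWN_PACKAGES, cutoff=0.8):
    """Return the known names that `name` closely resembles.

    An empty list means `name` is either exactly a known package or
    is not similar to any known package at the given cutoff.
    """
    normalized = name.strip().lower().replace("_", "-")
    if normalized in known:
        return []  # exact match: this is the package we meant
    # get_close_matches ranks candidates by SequenceMatcher similarity
    return difflib.get_close_matches(normalized, known, n=3, cutoff=cutoff)


if __name__ == "__main__":
    for candidate in ["requests", "reqeusts", "some-new-package"]:
        matches = possible_typosquat(candidate)
        if matches:
            print(f"{candidate!r} looks like a misspelling of {matches}")
        else:
            print(f"{candidate!r}: no near-miss against the known list")
```

A check like this only catches near-miss spellings of names you already know about. It says nothing about whether the code inside a correctly spelled package is trustworthy.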

See also this Security Stack Exchange question on the same topic.

4
  • To summarize: yes, a few instances of malware in the library have been detected. More to the point, there is no protection against malware other than the user's own diligence. IMHO, this means there is no way a user can be absolutely assured of the security of the software they are using. Murphy says that if it can be done, it has already been done a thousand times. Unfortunately, one of the greatest advantages of Python also turned out to be its Achilles heel.
    Commented Dec 16, 2021 at 22:42
  • Does the Python community not address this problem in any way? So every single thing you use pip for you're supposed to just scan every single line of code?
    – Andrew
    Commented May 12, 2023 at 18:59
  • 1
    @Andrew (1/2) The PyPI administrators will sometimes take down malicious packages when they find out about them, especially if the malicious package seems to be taking advantage of a name similarity. Other parts of the Python community may provide more stringent forms of verification, such as the Anaconda channel mentioned in MWB's comment, and there are also third-party security products that (claim to) scan packages and flag ones which seem suspicious. I'm not personally familiar with those measures, though.
    – David Z
    Commented May 15, 2023 at 1:41
  • 1
    (2/2) In general, it's considered the developer's responsibility to determine, to their satisfaction, that the code they're using is doing what they want it to do. If, for you, being satisfied that the code is doing what you want it to do means examining every single line of code, then yes, you're supposed to scan every single line of code. But if, like most developers, you're willing to put some trust in the community, then you can rely on that to avoid having to audit all the code if a package is very popular or has been reviewed and verified in some manner.
    – David Z
    Commented May 15, 2023 at 1:45
5

To add to the existing answer, 5 years later:

A piece of software that was downloaded 30,000 times from PyPI was in fact malware: It stole credit card numbers and login credentials and injected malicious code on infected machines.

3
  • Does the Python community not address this problem in any way? So every single thing you use pip for you're supposed to just scan every single line of code?
    – Andrew
    Commented May 12, 2023 at 18:59
  • 1
    @Andrew Not to my knowledge. I try to use the default "channel" of Anaconda, instead, personally. I think it's curated, somewhat.
    – MWB
    Commented May 13, 2023 at 6:22
  • See David Z's comments above.
    – Andrew
    Commented May 15, 2023 at 5:14
