40

There is a JSON API for PyPI which allows getting data for packages:

http://pypi.python.org/pypi/<package_name>/json
http://pypi.python.org/pypi/<package_name>/<version>/json

However, is it possible to get a list of all PyPI packages (or, for example, recent ones) with a GET call?

4
  • Is Index of Packages the webpage you are looking for?
    – vaibhaw
    Commented Apr 9, 2014 at 9:15
  • @vaibhaw No, it's not json. It has the data I need, but has some overhead for getting and parsing it. Commented Apr 9, 2014 at 11:19
  • True, it's not json. I thought you were looking for a list of all packages.
    – vaibhaw
    Commented Apr 9, 2014 at 11:38
  • Any way to search PyPI by a package prefix or fragment (e.g. lxm -> lxml, lxml-wrapper, ...) via the simple / JSON APIs? The XML-RPC API offers a search, but apparently it is being deprecated :( Commented Mar 31, 2019 at 1:46

7 Answers

31

The easiest way to do this is to use the simple index at PyPI which lists all packages without overhead. You can then request the JSON of each package individually by performing a GET request to the URLs mentioned in your question.
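A minimal sketch of that two-step approach, using only the standard library. The helper names (`parse_simple_index`, `json_url`, `fetch`, `package_info`) are made up for illustration and are not part of any PyPI client. It assumes the simple index is an HTML page in which the text of each link is a package name.

```python
import json
import urllib.request
from html.parser import HTMLParser

SIMPLE_INDEX = "https://pypi.org/simple/"


class _AnchorText(HTMLParser):
    """Collect the text content of every <a> element."""

    def __init__(self):
        super().__init__()
        self.names = []
        self._in_anchor = False

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._in_anchor = True

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_anchor = False

    def handle_data(self, data):
        if self._in_anchor and data.strip():
            self.names.append(data.strip())


def parse_simple_index(html_text):
    """Return the package names found in the simple index HTML."""
    parser = _AnchorText()
    parser.feed(html_text)
    parser.close()
    return parser.names


def json_url(package_name):
    """Build the per-package JSON URL mentioned in the question."""
    return f"https://pypi.org/pypi/{package_name}/json"


def fetch(url):
    """Perform a GET request and return the body as text."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8")


def package_info(package_name):
    """Download one package's JSON document and return its 'info' section."""
    return json.loads(fetch(json_url(package_name)))["info"]
```

Calling `parse_simple_index(fetch(SIMPLE_INDEX))` would give the full list of names, and `package_info(name)` the metadata for one of them. The index is large, so requesting the JSON of every package in a loop is slow and should be throttled.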

1
18

I know that you asked for a way to do this from the JSON API, but you can use the XML-RPC API to get this info very easily, without having to parse HTML.

try:
    import xmlrpclib  # Python 2
except ImportError:
    import xmlrpc.client as xmlrpclib  # Python 3

client = xmlrpclib.ServerProxy('https://pypi.python.org/pypi')
# get a list of package names
packages = client.list_packages()
6
  • 10
    Since 2017-04, the top of that page says: "The XMLRPC interface for PyPI is considered legacy and should not be used."
    – Anthon
    Commented Jun 11, 2017 at 8:14
  • This worked for me - python version 3.6.6 - Date 1/17/2019.
    – R4444
    Commented Jan 17, 2019 at 6:24
  • For package releases you can use - client.package_releases
    – R4444
    Commented Jan 17, 2019 at 6:26
  • 1
    This still seems to be working as of 5/11/2021.
    – Gourneau
    Commented May 12, 2021 at 4:11
  • 1
    slow. 4x slower than https://pypi.org/simple/
    – milahu
    Commented Mar 9, 2022 at 13:49
9

As of PEP 691, you can now grab this through the Simple API if you request a JSON response.

curl --header 'Accept: application/vnd.pypi.simple.v1+json' https://pypi.org/simple/ | jq
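The same request can be sketched in Python with the standard library. The function names here (`fetch_index_json`, `project_names`) are illustrative. The parsing relies on the PEP 691 response layout, where the index document carries a `projects` list whose entries each have a `name` key.

```python
import json
import urllib.request

# Content type defined by PEP 691 for the JSON form of the Simple API.
ACCEPT_JSON = "application/vnd.pypi.simple.v1+json"


def fetch_index_json(url="https://pypi.org/simple/"):
    """GET the simple index, asking for JSON instead of HTML."""
    request = urllib.request.Request(url, headers={"Accept": ACCEPT_JSON})
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


def project_names(payload):
    """Extract the project names from a PEP 691 index document."""
    return [project["name"] for project in json.loads(payload)["projects"]]
```

With network access, `project_names(fetch_index_json())` would return the full list of project names without any HTML parsing.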
4

I tried this answer, but it does not work on Python 3.6.

I found a solution that parses the HTML with the lxml package, which you first have to install with pip:

pip install lxml

Then try the following snippet:

from lxml import html
import requests

# Download the simple index, an HTML page with one link per package.
response = requests.get("https://pypi.org/simple/")

tree = html.fromstring(response.content)

# The text of each <a> element is a package name.
package_list = tree.xpath('//a/text()')
3
  • 1
    I would rather use defusedxml for externally-fetched XML files Commented Aug 4, 2021 at 12:18
  • @AlexanderShishenko How would you do that?
    – not2qubit
    Commented Dec 27, 2021 at 19:05
  • I would rather use for match in re.finditer(r'"/simple/([^/]+)/"', html) to parse this simple HTML
    – milahu
    Commented Mar 9, 2022 at 14:13
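The regex suggestion from the last comment can be sketched as follows. The pattern is the one quoted in that comment; the function name `names_from_html` is made up for illustration. It assumes every link in the index has a double-quoted `href` of the form `/simple/<name>/`.

```python
import re

# Pattern from the comment: capture <name> in href="/simple/<name>/".
PROJECT_HREF = re.compile(r'"/simple/([^/]+)/"')


def names_from_html(html_text):
    """Return the package names taken from the link targets."""
    return [match.group(1) for match in PROJECT_HREF.finditer(html_text)]
```

This reads the names from the link targets, not the link text, and needs no third-party parser. It breaks if the page ever changes its quoting or URL layout.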
4

NOTE: To make tasks like this simple, I've written my own Python module. It can be installed using pip:

pip install jk_pypiorgapi

The module is very simple to use. After instantiating an object representing the API interface you can make use of it:

import jk_pypiorgapi

api = jk_pypiorgapi.PyPiOrgAPI()
n = len(api.listAllPackages())
print("Number of packages on pypi.org:", n)

This module also provides capabilities for downloading information about specific packages as provided by pypi.org:

import jk_pypiorgapi
import jk_json

api = jk_pypiorgapi.PyPiOrgAPI()
jData = api.getPackageInfoJSON("jk_pypiorgapi")
jk_json.prettyPrint(jData)

This feature might be helpful as well.

7
  • 1
    Thanks for this! I had to install some undeclared dependencies to get it working: pypine, jk_cmdoutputparsinghelper, invoke, jk_version. I took a look at your pretty printer as well. Very nice!
    – Roger Dahl
    Commented Apr 14, 2021 at 2:10
    Thanks for the comment, I will provide an update soon. BTW: PyPine is a new project I'm working on right now: a build and data processing framework in Python that will be open source soon. But jk_pypiorgapi should not have a dependency on that. I'll look into it soon. If you encounter any issues, please file a bug report on GitHub. Thanks!
    – Regis May
    Commented Apr 14, 2021 at 10:12
  • @RogerDahl Fixed. You might want to update. However, I'm a bit confused: There should be no requirement for jk_cmdoutputparsinghelper and jk_version as both modules are not used by jk_pypiorgapi. If you want to help please check this again after installing the update and if this dependency still exists please file a bug report on the GitHub repo page. Thank you!
    – Regis May
    Commented Apr 14, 2021 at 10:28
    Just do a fresh venv, install jk_pypiorgapi, and try the snippets. You should get the same missing deps that I did. I was actually trying to find a way to query PyPI for dependency information, and later found out that the information is not exposed by the current PyPI API.
    – Roger Dahl
    Commented Apr 14, 2021 at 19:28
    @RogerDahl: It's not as easy as you think. Since I build system tools as well, I have many self-written packages installed at the system level, so I would need to wipe half of the system to test this. Therefore I just looked at all the packages directly: there are not that many files in this case, so this is quite easy, and I eliminated the dependency on pypine. The other dependencies should not be used anyway, neither by jk_pypiorgapi nor by its dependencies. However, I will set up a dedicated testing machine soon to eliminate any future inconveniences regarding dependencies.
    – Regis May
    Commented Apr 14, 2021 at 20:01
2

This is now possible entirely within requests. The requested content type (the MIME type for JSON) just needs to go into the dictionary of headers. `requests` can even decode the JSON into another dict for you:

import requests

package_name = 'requests'  # example value; use any package name

r = requests.get(f'https://pypi.org/pypi/{package_name}/json', headers={'Accept': 'application/json'})

info = r.json()['info']
print(f"requested package name = {package_name}, stored name: {info['name']}, author: {info['author']}, version: {info['version']}, license: {info['license']}")
-3

Here's a Bash one-liner:

curl -sG -H 'Host: pypi.org' -H 'Accept: application/json' https://pypi.org/pypi/numpy/json | awk -F "description\":\"" '{ print $2 }' | cut -d ',' -f 1

# NumPy is a general-purpose array-processing package designed to...
0
