
I am trying to use Python's urllib.request library to download .pdb (Protein Data Bank) files containing the full predicted molecular structure of a given protein from the AlphaFold website. In this example, I am trying to download the protein with UniProt ID Q9BY15. The entry page https://alphafold.ebi.ac.uk/entry/Q9BY15 contains a download link to the protein's PDB file.

The manually downloaded file is named AF-Q9BY15-F1-model_v2.pdb.

Here is the code I am using, in its simplest form:

import os
import urllib
import urllib.request

url = 'https://alphafold.ebi.ac.uk/entry/'
prot = 'Q9BY15'
alphaname = 'AF-' + prot + '-F1-model_v2.pdb'
urllib.request.urlretrieve(url + prot, alphaname)

Running this code does produce a file with the expected name, but it is far smaller than the real file and is effectively empty when viewed in protein identification programs. How would I rewrite this code to download the actual file?

1 Answer

I'm not sure whether this will solve your problem, but the correct URL for downloading the PDB file of Q9BY15 is https://alphafold.ebi.ac.uk/files/AF-Q9BY15-F1-model_v2.pdb
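
The /entry/ address is the web page for the protein rather than the structure file, so the small file you got is most likely that page's HTML saved under a .pdb name. A minimal sketch for checking what was actually saved; the helper name describe_download and the list of PDB record types are my own assumptions, not part of urllib or AlphaFold:

```python
from pathlib import Path

# Record types that commonly open a PDB-format text file (an assumed,
# non-exhaustive list used only for this quick check).
PDB_RECORDS = ("HEADER", "TITLE", "REMARK", "ATOM", "HETATM", "MODEL", "CRYST1")


def describe_download(path):
    """Return 'html', 'pdb', or 'unknown' based on the start of the file."""
    # Only the first few hundred bytes are needed to tell the formats apart.
    head = Path(path).read_bytes()[:512].decode("utf-8", errors="replace").lstrip()
    lowered = head.lower()
    if lowered.startswith("<!doctype html") or lowered.startswith("<html"):
        return "html"
    if head.startswith(PDB_RECORDS):
        return "pdb"
    return "unknown"
```

Calling describe_download('AF-Q9BY15-F1-model_v2.pdb') on the file your code produced should tell you whether it is a web page rather than structure data.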

Try replacing /entry/ in the link with /files/, and append the full file name (alphaname in your code) rather than the bare ID.
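
Put together, the change could look something like the sketch below. The function names and the version parameter are hypothetical; the v2 suffix is taken from the file name in the question, and other AlphaFold releases may use a different version number.

```python
import urllib.request

# Base address for direct file downloads, as opposed to the /entry/ web page.
BASE_URL = "https://alphafold.ebi.ac.uk/files/"


def alphafold_filename(uniprot_id, version=2):
    """Build the file name, e.g. AF-Q9BY15-F1-model_v2.pdb."""
    # version=2 mirrors the question; treat it as an assumption to adjust.
    return f"AF-{uniprot_id}-F1-model_v{version}.pdb"


def alphafold_url(uniprot_id, version=2):
    """Build the direct download URL for the given UniProt ID."""
    return BASE_URL + alphafold_filename(uniprot_id, version)


def download_alphafold_pdb(uniprot_id, version=2):
    """Download the PDB file into the current directory and return its name."""
    filename = alphafold_filename(uniprot_id, version)
    urllib.request.urlretrieve(alphafold_url(uniprot_id, version), filename)
    return filename
```

With this, download_alphafold_pdb('Q9BY15') requests https://alphafold.ebi.ac.uk/files/AF-Q9BY15-F1-model_v2.pdb and saves it under the same name as the manual download.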
