python urllib2 download size

Question

I want to download a file with urllib2 and display a progress bar while it downloads. How can I get the number of bytes downloaded so far?

My current code is:

ul = urllib2.urlopen('www.file.com/blafoo.iso')
data = ul.get_data()

or

open('file.iso', 'w').write(ul.read())

The data is only written to the file once the whole download has been received from the website. How can I access the size of the data downloaded so far?

Thanks for your help

Have you tried urllib.urlretrieve?
– Inbar Rose
Commented Aug 6, 2012 at 14:47
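
For reference, urlretrieve accepts a reporthook callable that is invoked with the number of blocks transferred so far, the block size, and the total size (non-positive, typically -1, when the server does not report one). A minimal sketch of such a hook follows; `fraction_done` and `report` are hypothetical names chosen for illustration, not part of urllib.

```python
import sys


def fraction_done(block_num, block_size, total_size):
    """Return progress in [0.0, 1.0], or None when the total size is unknown."""
    if total_size <= 0:
        return None
    # The last block is usually partial, so clamp the estimate to the total.
    downloaded = min(block_num * block_size, total_size)
    return downloaded / float(total_size)


def report(block_num, block_size, total_size):
    """Hypothetical reporthook: rewrite one console line with the progress."""
    fraction = fraction_done(block_num, block_size, total_size)
    if fraction is None:
        # No total available, so only an approximate byte count can be shown.
        sys.stdout.write("\r%d bytes" % (block_num * block_size))
    else:
        sys.stdout.write("\r%3d%%" % int(fraction * 100))
    sys.stdout.flush()
```

It could then be passed as `urllib.urlretrieve(url, 'file.iso', reporthook=report)`; in Python 3 the same function lives at `urllib.request.urlretrieve`.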

jterrace · Accepted Answer · 2012-08-06 16:41:29Z

7

Here's an example of a text progress bar using the awesome requests library and the progressbar library:

import requests
import progressbar

ISO = "http://www.ubuntu.com/start-download?distro=desktop&bits=32&release=lts"
CHUNK_SIZE = 1024 * 1024  # 1 MiB

# With current versions of requests, stream=True defers the body download
# so that chunks can be read (and counted) as they arrive.
r = requests.get(ISO, stream=True)
total_size = int(r.headers['content-length'])
pbar = progressbar.ProgressBar(maxval=total_size).start()

file_contents = b""
for chunk in r.iter_content(chunk_size=CHUNK_SIZE):
    file_contents += chunk
    pbar.update(len(file_contents))
pbar.finish()

This is what I see in the console while running:

$ python requests_progress.py
 90% |############################   |

Edit: some notes:

Not all servers provide a content-length header; in that case, you can't show a percentage.
You might not want to read the whole file into memory if it's big. You can write the chunks to a file, or somewhere else.
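
The two notes above can be combined into one helper. This is a minimal sketch rather than part of the original answer: `save_chunks` and `on_progress` are hypothetical names, `chunks` is assumed to be any iterable of byte strings (for example `r.iter_content(chunk_size=CHUNK_SIZE)` on a response requested with `stream=True`), and `total_size` is assumed to be `None` when the server sent no content-length header.

```python
def save_chunks(chunks, out_file, total_size=None, on_progress=None):
    """Write each chunk to out_file as it arrives and return the byte count.

    on_progress, if given, is called as on_progress(bytes_written, total_size);
    total_size stays None when the download size is unknown.
    """
    bytes_written = 0
    for chunk in chunks:
        if not chunk:
            continue  # nothing to write for an empty chunk
        out_file.write(chunk)
        bytes_written += len(chunk)
        if on_progress is not None:
            on_progress(bytes_written, total_size)
    return bytes_written
```

Only one chunk is held in memory at a time, and the callback can decide whether to show a percentage or just a running byte count depending on whether `total_size` is `None`.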

edited Aug 6, 2012 at 16:41

answered Aug 6, 2012 at 15:46

jterrace

I like the progress bar library! Reading a whole ISO image into memory isn't a good idea though. Also some additional handling is needed for when the Content-length header is missing (the server isn't required to send it).
– user634175
Commented Aug 6, 2012 at 16:02
added notes about content-length and file in memory
– jterrace
Commented Aug 6, 2012 at 16:41
Got it working with a progress bar. The content-length key is case sensitive :) Now it's quite awesome! Thanks again.
– HappyHacking
Commented Aug 6, 2012 at 18:13

RanRag · Accepted Answer · 2012-08-06 15:31:36Z

4

You can call the info() method on the response returned by urllib2.urlopen, which returns the meta-information (headers) of the page, and then use getheaders to access Content-Length.

For example, let's calculate the download size of the Ubuntu 12.04 ISO:

>>> info = urllib2.urlopen('http://mirror01.th.ifl.net/releases//precise/ubuntu-12.04-desktop-i386.iso')
>>> size = int(info.info().getheaders("Content-Length")[0])
>>> size/1024/1024
701
>>>
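
Because a server is not required to send Content-Length, `getheaders` can return an empty list, and indexing it with `[0]` then raises an IndexError. A more defensive variant might look like the sketch below; `parse_content_length` and `to_mib` are hypothetical helper names, and the input is assumed to be the list returned by `getheaders`.

```python
def parse_content_length(values):
    """Return the size in bytes from a list of header values, or None.

    `values` is assumed to be a list of strings, such as the result of
    info().getheaders("Content-Length"); it is empty when the header is absent.
    """
    if not values:
        return None
    try:
        size = int(values[0].strip())
    except ValueError:
        return None  # malformed header value
    return size if size >= 0 else None


def to_mib(size_in_bytes):
    """Convert bytes to whole mebibytes, like size/1024/1024 on Python 2 ints."""
    return size_in_bytes // (1024 * 1024)
```

A caller can then treat `None` as "size unknown" and fall back to showing a plain byte count instead of a percentage.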

edited Aug 6, 2012 at 15:31

answered Aug 6, 2012 at 15:25

RanRag

user634175 · Accepted Answer · 2012-08-06 15:16:36Z

1

import urllib2
from contextlib import closing

CHUNK_SIZE = 8192

# Note binary mode, otherwise you'll corrupt the file
with open('file.iso', 'wb') as output:
    # urllib2 responses are not context managers in Python 2, hence closing();
    # the URL also needs an explicit scheme
    with closing(urllib2.urlopen('http://www.file.com/blafoo.iso')) as ul:
        bytes_read = 0
        while True:
            data = ul.read(CHUNK_SIZE)
            if not data:  # An empty string means EOF
                break
            bytes_read += len(data)  # Update progress bar with this value
            output.write(data)
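
The `bytes_read` counter in the loop above is the value a progress display needs. As one possible way to use it, the sketch below turns the counter into a one-line text bar; `render_bar` is a hypothetical helper, and `total` is assumed to come from the Content-Length header (or to be `None` when the header is missing).

```python
def render_bar(bytes_read, total=None, width=30):
    """Return a text progress bar, or a plain byte count if total is unknown."""
    if not total:
        return "%d bytes" % bytes_read
    # Clamp in case more bytes arrive than the header announced.
    fraction = min(float(bytes_read) / total, 1.0)
    filled = int(fraction * width)
    bar = "#" * filled + " " * (width - filled)
    return "%3d%% |%s|" % (int(fraction * 100), bar)
```

Inside the loop, writing `'\r' + render_bar(bytes_read, total)` to `sys.stdout` and flushing would redraw the bar in place on each chunk.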

answered Aug 6, 2012 at 15:16

user634175
