2

I want to get the size of a file at an http://.. URL before I download it. I don't know how to make an HTTP request for that.

Thanks!

4 Answers

13
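import urllib2

# urlopen sends a GET, but the body is only read when f.read() is called
f = urllib2.urlopen("http://your-url")
size = f.headers["Content-Length"]  # a string; raises KeyError if the header is missing
print size
f.close()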
2
  • 5
    HTTP HEAD is a better option, since you don't need to download the payload. Commented Aug 30, 2010 at 21:20
  • But when I use the requests module, the size I get differs from the one urlopen reports:
    >>> requests.head(url).headers.get('content-length', None)
    '8176'
    >>> urllib.urlopen(url).info()['content-length']
    '38227'
    >>> len(requests.get(url).content)
    38274
    Commented Jul 5, 2014 at 9:46
12

The HTTP HEAD method was invented for scenarios like this (wanting to know data about a response without fetching the response itself). Provided the server returns a Content-Length header (and supports HEAD), then you can find out the size of the file (in octets) by looking at the Content-Length returned.
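
As a rough sketch of the HEAD approach: this is written for Python 3, where urllib2 became urllib.request. The helper names parse_content_length and head_size are made up for illustration, and the 10-second timeout is an arbitrary choice.

```python
import urllib.request


def parse_content_length(headers):
    """Return Content-Length as an int, or None if it is missing or malformed."""
    value = headers.get("Content-Length")
    if value is None:
        return None
    try:
        return int(value)
    except ValueError:
        return None


def head_size(url, timeout=10):
    """Send a HEAD request and return the advertised size in octets, or None."""
    # method="HEAD" asks the server for the headers only, without the body
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_content_length(response.headers)
```

Calling head_size("http://your-url") should give an int when the server supports HEAD and sends Content-Length, and None when the header is absent. A server that rejects HEAD with an error status makes urlopen raise urllib.error.HTTPError, so you may want to catch that and fall back to a GET.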

5

Here is the complete answer:

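import urllib2

f = urllib2.urlopen("http://your-url")
if "Content-Length" in f.headers:
    # The server advertised the size, so the body need not be read
    size = int(f.headers["Content-Length"])
else:
    # No Content-Length header: download the body and measure it
    size = len(f.read())
print size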
1
  • Not so hard to write, but that's the neat answer. +1. Welcome to Stack Overflow!
    – eyquem
    Commented Apr 27, 2011 at 11:23
4

Not all pages have a content-length header. In that case, the only option is to read the whole page:

len(urllib2.urlopen('http://www.google.com').read())
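
If you do have to read the body, you can count the bytes chunk by chunk instead of holding the whole page in memory. A minimal sketch, assuming Python 3 (urllib.request in place of urllib2); count_bytes, downloaded_size and the 64 KiB chunk size are my own illustrative choices, not anything the library prescribes.

```python
import urllib.request


def count_bytes(stream, chunk_size=64 * 1024):
    """Read a binary stream to the end in chunks and return the total byte count."""
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # an empty bytes object means end of stream
            break
        total += len(chunk)
    return total


def downloaded_size(url, timeout=10):
    """Download the body of url and return its length in bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return count_bytes(response)
```

This still transfers the whole file, so it only tells you the size after the fact. It just avoids keeping the full payload in memory at once.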
1
  • Great! Most commercial websites have no content-length header!
    – harryz
    Commented Dec 28, 2013 at 14:09
