1
import urllib.request, io
url = 'http://www.image.com/image.jpg'

path = io.BytesIO(urllib.request.urlopen(url).read())

I'd like to check the file size of the URL image in the filestream path before saving. How can I do this?

Also, I don't want to rely on the Content-Length header; I'd like to fetch the image into a file stream, check the size, and then save it.

2
  • Possible duplicate: stackoverflow.com/questions/5909/… Commented Mar 30, 2015 at 7:43
  • Why not rely on the Content-Length header? You can check the size of a BytesIO object the same way you can with any open file object, by seeking to the end and calling fobj.tell(). But if you use the Content-Length header, you can avoid having to read the whole image into memory first. Commented Mar 30, 2015 at 8:32

3 Answers

2

You can get the size of the io.BytesIO() object the same way you can get it for any file object: by seeking to the end and asking for the file position:

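path = io.BytesIO(urllib.request.urlopen(url).read())
path.seek(0, io.SEEK_END)  # move to 0 bytes from the end
size = path.tell()         # the position at the end is the size in bytes
path.seek(0)               # rewind before reading or saving the data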

However, you could just as easily take the len() of the bytestring you read, before wrapping it in an in-memory file object:

data = urllib.request.urlopen(url).read()
size = len(data)
path = io.BytesIO(data)

Note that this means your image has already been loaded into memory, so you cannot use this to avoid loading too large an image. To know the size before downloading, the Content-Length header is the only option.

If the server uses a chunked transfer encoding to facilitate streaming (so no content length has been set up front), you can use a loop to limit how much data is read.
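
Such a loop might look roughly like the sketch below. The helper name read_limited, the max_bytes limit and the 8192-byte chunk size are my own choices for illustration, not anything urllib provides; it works on any object with a read() method, including the response returned by urllib.request.urlopen().

```python
import io


def read_limited(stream, max_bytes, chunk_size=8192):
    """Copy stream into a BytesIO, raising ValueError if it exceeds max_bytes.

    read_limited is a hypothetical helper, not part of the standard library.
    """
    buffer = io.BytesIO()
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # an empty bytes object signals the end of the stream
            break
        total += len(chunk)
        if total > max_bytes:
            raise ValueError("response exceeds %d bytes" % max_bytes)
        buffer.write(chunk)
    buffer.seek(0)  # rewind so the caller can read or save from the start
    return buffer, total


# Usage (needs network access, so it is left commented out):
# import urllib.request
# with urllib.request.urlopen(url) as response:
#     path, size = read_limited(response, 5 * 1024 * 1024)
```

At most one chunk beyond the limit is ever read, so memory use stays bounded even when the server sends no Content-Length at all.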

0
2

Try importing urllib.request

import urllib.request, io
url = 'http://www.elsecarrailway.co.uk/images/Events/TeddyBear-3.jpg'
path = urllib.request.urlopen(url)
meta = path.info()

>>> meta.get("Content-Length")
'269898'  # i.e. about 270 kB
1
  • But now your answer is functionally no different from llogiq's, and it goes directly against what the OP is asking for. Commented Mar 30, 2015 at 9:19
0

You could ask the server for the Content-Length information. Using urllib2 (which exists only in Python 2; on Python 3 the equivalent is urllib.request):

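import urllib2

req = urllib2.urlopen(url)
meta = req.info()
# getheader() returns None when the header is absent
length_text = meta.getheader("Content-Length")
try:
    length = int(length_text)
except (TypeError, ValueError):
    # length unknown or malformed; you may need to read the data to find out
    length = -1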
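
Since the question uses Python 3, here is a rough equivalent of the same check for urllib.request. The helper parse_content_length is a name I made up for this sketch; it only wraps the int() conversion and returns -1 when the header is missing or malformed.

```python
def parse_content_length(value):
    """Return the header value as a non-negative int, or -1 if unusable.

    parse_content_length is a hypothetical helper, not a urllib function.
    """
    try:
        length = int(value)
    except (TypeError, ValueError):
        # None (header absent) or a non-numeric string
        return -1
    return length if length >= 0 else -1


# Usage (needs network access, so it is left commented out):
# import urllib.request
# with urllib.request.urlopen(url) as response:
#     length = parse_content_length(response.headers.get("Content-Length"))
```

A result of -1 means the server did not declare a usable size, in which case the only way to find out is to read the data.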
