Using urllib.urlretrieve to download files over HTTP not working

Question

I'm still working on my MP3 downloader, but now I'm having trouble with the files being downloaded. I have two versions of the part that's tripping me up. The first gives me a proper file but raises an error. The second gives me a file that is far too small but raises no error. I've tried opening the file in binary mode, but that didn't help. I'm pretty new to working with HTML, so any help would be appreciated.

import urllib
import urllib2

def milk():
    SongList = []
    SongStrings = []
    SongNames = []
    earmilk = urllib.urlopen("http://www.earmilk.com/category/pop")
    reader = earmilk.read()
    #gets the position of the playlist
    PlaylistPos = reader.find("var newPlaylistTracks = ")
    #finds the number of songs in the playlist
    NumberSongs = reader[reader.find("var newPlaylistIds = " ): PlaylistPos].count(",") + 1
    initPos = PlaylistPos

    #goes through the playlist and records the URL and name of each song

    for song in range(0, NumberSongs):
        songPos = reader[initPos:].find("http:") + initPos
        namePos = reader[songPos:].find("name") + songPos
        namePos += reader[namePos:].find(">")
        nameEndPos = reader[namePos:].find("<") + namePos
        SongStrings.append(reader[songPos: reader[songPos:].find('"') + songPos])
        SongNames.append(reader[namePos + 1: nameEndPos])
        initPos = nameEndPos

    for correction in range(0, NumberSongs):
        SongStrings[correction] = SongStrings[correction].replace('\\/', "/")

    #downloading songs

    fileName = ''.join([a.isalnum() and a or '_' for a in SongNames[0]])
    fileName = fileName.replace("_", " ") + ".mp3"


#         This version writes a file that can be played but gives an error saying: "TypeError: expected a character buffer object"
##    songDL = open(fileName, "wb")
##    songDL.write(urllib.urlretrieve(SongStrings[0], fileName))


#         This version creates the file but it cannot be played (file size is much smaller than it should be)
##    url = urllib.urlretrieve(SongStrings[0], fileName)
##    url = str(url)
##    songDL = open(fileName, "wb")
##    songDL.write(url)


    songDL.close()

    earmilk.close()

alecxe · Accepted Answer · 2014-06-23 10:01:05Z


Re-read the documentation for urllib.urlretrieve:

Return a tuple (filename, headers) where filename is the local file name under which the object can be found, and headers is whatever the info() method of the object returned by urlopen() returned (for a remote object, possibly cached).

You appear to be expecting it to return the bytes of the file itself. The point of urlretrieve is that it handles writing to a file for you, and returns the filename it was written to (which will generally be the same thing as your second argument to the function if you provided one).
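As a minimal sketch of that point, the download step can be reduced to a single urlretrieve call with no open() or write() around it. The helper name download and the local file: URL are assumptions made for illustration (the local file stands in for the remote song so the sketch runs without network access), and the import fallback is only there so the same code runs on Python 3, where urlretrieve lives in urllib.request.

```python
import os
import tempfile

try:
    from urllib import urlretrieve, pathname2url  # Python 2, as in the question
except ImportError:
    from urllib.request import urlretrieve, pathname2url  # Python 3


def download(url, file_name):
    """Download url to file_name; return the local path and its size in bytes."""
    # urlretrieve writes the file itself and returns (filename, headers),
    # so there is nothing left to open() or write() afterwards.
    local_path, headers = urlretrieve(url, file_name)
    return local_path, os.path.getsize(local_path)


# Hypothetical local stand-in for a remote MP3, so no network is needed.
work_dir = tempfile.mkdtemp()
source = os.path.join(work_dir, "source.mp3")
with open(source, "wb") as f:
    f.write(b"\x00" * 2048)

target = os.path.join(work_dir, "copy.mp3")
saved_path, size = download("file:" + pathname2url(source), target)
print(saved_path, size)
```

In the question's function, the equivalent would presumably be a bare urllib.urlretrieve(SongStrings[0], fileName), with the songDL lines dropped.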

edited Jun 23, 2014 at 10:01

alecxe


answered Dec 15, 2013 at 21:10

Iguananaut



By the way, this sort of thing is a great reason to learn pdb. Run your function in the Python REPL and, when it crashes, enter import pdb; pdb.pm() to get a debugger prompt at the point where your code crashed. From there you can directly check what functions like urlretrieve are actually returning. That should give you an idea of why the various things you're trying to do with the return value are failing.
– Iguananaut
Commented Dec 15, 2013 at 21:13
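The same check can be done non-interactively by printing what urlretrieve returns. This is only a sketch under assumptions: a hypothetical local file stands in for the remote song, and the import fallback lets it run on Python 3 as well. It suggests why both versions in the question misbehave: the first passes a tuple to write(), which raises the TypeError after urlretrieve has already saved the file, and the second reopens the finished file in "wb" mode (truncating it) and writes only the short text form of that tuple.

```python
import os
import tempfile

try:
    from urllib import urlretrieve, pathname2url  # Python 2, as in the question
except ImportError:
    from urllib.request import urlretrieve, pathname2url  # Python 3

# Hypothetical local file standing in for the remote song.
work_dir = tempfile.mkdtemp()
source = os.path.join(work_dir, "source.mp3")
with open(source, "wb") as f:
    f.write(b"\x00" * 100000)

target = os.path.join(work_dir, "copy.mp3")
result = urlretrieve("file:" + pathname2url(source), target)

print(type(result))    # a tuple, not the bytes of the file
print(result[0])       # the local file name that was written
as_text = str(result)  # what the question's second version wrote to disk
print(len(as_text), os.path.getsize(target))
```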
