
Author neologix
Recipients andyharrington, neologix
Date 2010-04-04.21:01:21
Message-id <1270414885.23.0.714628903416.issue8035@psf.upfronthosting.co.za>
Content
Alright, what happens is the following:
- the file you're trying to retrieve is actually redirected, so the server sends an HTTP/1.x 302 Moved Temporarily response
- in urllib, when we get a redirection, we call redirect_internal:
    def redirect_internal(self, url, fp, errcode, errmsg, headers, data):
        if 'location' in headers:
            newurl = headers['location']
        elif 'uri' in headers:
            newurl = headers['uri']
        else:
            return
        void = fp.read()
        fp.close()
        # In case the server sent a relative URL, join with original:
        newurl = basejoin(self.type + ":" + url, newurl)
        return self.open(newurl)

The fp.read() is there to wait for the remote end to close the connection.
The problem, in this case, is that Python 3.1's http.client (formerly httplib) speaks HTTP/1.1, whereas version 2.6 used HTTP/1.0, and with HTTP/1.1 the server doesn't close the connection after sending the redirect (as shown by tcpdump).
So the process remains stuck on fp.read(), which only returns once the remote end closes the connection.
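
To illustrate the hang in isolation, here is a minimal sketch that does not use urllib at all. It assumes a throwaway local server (the names serve_once and read_until_eof are made up for this example) that answers with a 302 and then either closes the connection, as an HTTP/1.0 server would, or holds it open, as the HTTP/1.1 server did here. A client that reads until EOF only finishes in the first case; a socket timeout stands in for the indefinite block.

```python
import socket
import threading

RESPONSE = (b"HTTP/1.1 302 Moved Temporarily\r\n"
            b"Location: /elsewhere\r\n"
            b"Content-Length: 0\r\n\r\n")

def serve_once(listener, close_after_reply, done):
    """Answer one request with a 302, then close or keep the connection open."""
    conn, _ = listener.accept()
    with conn:
        conn.recv(4096)           # the tiny request is assumed to fit in one recv
        conn.sendall(RESPONSE)
        if not close_after_reply:
            done.wait(5)          # keep-alive: hold the connection until told to stop

def read_until_eof(close_after_reply, timeout=0.5):
    """Return 'eof' if the peer closed the connection, 'timeout' if the read blocked."""
    listener = socket.create_server(("127.0.0.1", 0))
    done = threading.Event()
    server = threading.Thread(target=serve_once,
                              args=(listener, close_after_reply, done))
    server.start()
    client = socket.create_connection(listener.getsockname())
    client.settimeout(timeout)
    try:
        client.sendall(b"GET / HTTP/1.1\r\nHost: localhost\r\n\r\n")
        try:
            while client.recv(4096):   # same idea as fp.read(): read until EOF
                pass
            return "eof"
        except socket.timeout:
            return "timeout"
    finally:
        client.close()
        done.set()
        server.join()
        listener.close()
```

With close_after_reply=True the read ends at EOF; with close_after_reply=False it only ends because of the timeout, which is what the stuck fp.read() looks like without one.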
Now, in version 3.1, if we simply change Lib/http/client.py:628
from 
class HTTPConnection:

    _http_vsn = 11
    _http_vsn_str = 'HTTP/1.1'

to
class HTTPConnection:

    _http_vsn = 11
    _http_vsn_str = 'HTTP/1.0'

to use HTTP/1.0 instead, the retrieval works fine.

Obviously, this is not a good solution. Since the RFC doesn't seem to require the server to close the connection after sending a redirect, we should probably close the connection ourselves.

That's what the attached patch does: it simply removes the call to fp.read() before closing the connection. It also removes the call from http_error_default, since if an error occurs, we probably want to close the connection as soon as possible instead of waiting for the server to do so.
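
As a rough sketch of the idea (not the attached patch itself), the redirect handling can be reduced to a standalone function that closes the response without reading it first. The names redirect_target and BlockingBody are hypothetical, headers is assumed to be a plain dict with lower-case keys, and urllib.parse.urljoin stands in for basejoin.

```python
from urllib.parse import urljoin

def redirect_target(base_url, headers, fp):
    """Close fp immediately and return the absolute redirect URL, or None."""
    if "location" in headers:
        newurl = headers["location"]
    elif "uri" in headers:
        newurl = headers["uri"]
    else:
        return None
    fp.close()  # no fp.read() first: don't wait for the server to close
    # In case the server sent a relative URL, join with the original:
    return urljoin(base_url, newurl)

class BlockingBody:
    """Stand-in for a keep-alive response body: read() would never return."""
    def __init__(self):
        self.closed = False

    def read(self):
        raise AssertionError("read() would block on a keep-alive connection")

    def close(self):
        self.closed = True
```

Calling redirect_target("http://example.com/a/b", {"location": "/c"}, BlockingBody()) resolves the new URL and closes the body without ever touching read().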