Author royliu
Recipients royliu
Date 2011-07-24.05:12:19
Message-id <1311484340.86.0.612332840201.issue12628@psf.upfronthosting.co.za>
Content
When testing urllib.request.urlopen in Python 3, I found that it returns empty responses for some sites: reading from the returned file-like object yields zero bytes. Python 2.x's urllib2.urlopen does not behave this way. I narrowed the problem down to the following difference:

@@ -1137,8 +1137,6 @@
             r = h.getresponse()  # an HTTPResponse instance
         except socket.error as err:
             raise URLError(err)
-        finally:
-            h.close()
 
         r.url = req.get_full_url()
         # This line replaces the .msg attribute of the HTTPResponse

The "finally" clause is absent from urllib2.py but present in Python 3.2's request.py. I suspect the HTTPConnection is being closed before the response data can be read. It is still puzzling, because some sites return the expected response. Please find attached a small test script for "www.wsj.com", for which the response body is empty unless the above patch is applied.
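
The suspected mechanism can be sketched without any network access. The snippet below is only a toy model, not the http.client or urllib.request code: FakeConnection, FakeResponse, open_with_finally and open_close_on_error_only are hypothetical names made up for illustration, and the assumption that the response reads lazily from a stream owned by the connection is exactly the guess stated above. It does not try to explain why some sites are unaffected.

```python
import io


class FakeResponse:
    """Hypothetical response that reads lazily from the connection's stream."""

    def __init__(self, stream):
        self._stream = stream

    def read(self):
        # Assumption: once the underlying stream is closed, nothing is left to read.
        if self._stream.closed:
            return b""
        return self._stream.read()


class FakeConnection:
    """Hypothetical stand-in for an HTTP connection (not http.client)."""

    def __init__(self, body):
        self._stream = io.BytesIO(body)

    def getresponse(self):
        return FakeResponse(self._stream)

    def close(self):
        self._stream.close()


def open_with_finally(body):
    # Mirrors the shape of the Python 3.2 code: close() always runs,
    # even when getresponse() succeeded and the body is still unread.
    h = FakeConnection(body)
    try:
        r = h.getresponse()
    finally:
        h.close()
    return r


def open_close_on_error_only(body):
    # Alternative shape: close the connection only if getresponse() fails.
    h = FakeConnection(body)
    try:
        r = h.getresponse()
    except Exception:
        h.close()
        raise
    return r


empty_body = open_with_finally(b"hello").read()
full_body = open_close_on_error_only(b"hello").read()
print(empty_body, full_body)
```

In this model, the first variant hands back a response whose stream is already closed, so read() yields zero bytes, while the second variant leaves the stream open for the caller.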
History
Date                 User    Action  Args
2011-07-24 05:12:20  royliu  set     recipients: + royliu
2011-07-24 05:12:20  royliu  set     messageid: <1311484340.86.0.612332840201.issue12628@psf.upfronthosting.co.za>
2011-07-24 05:12:20  royliu  link    issue12628 messages
2011-07-24 05:12:19  royliu  create