Message141039
While testing urllib.request.urlopen in Python 3, I found that it returns empty responses for some sites: reading from the file-like object yields zero bytes. Python 2.x's urllib2.urlopen does not behave this way. I narrowed the problem down to the following difference:
@@ -1137,8 +1137,6 @@
             r = h.getresponse()  # an HTTPResponse instance
         except socket.error as err:
             raise URLError(err)
-        finally:
-            h.close()
         r.url = req.get_full_url()
         # This line replaces the .msg attribute of the HTTPResponse
The "finally" clause is present in Python 3.2's request.py but absent from urllib2.py. I suspect the HTTPConnection is closed before the response data can be read. It is still puzzling, though, because some sites do return the expected body. Attached is a small test script for "www.wsj.com", whose response body comes back empty unless the above patch is applied.
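The ordering concern described above can be sketched with http.client directly. This is only an illustration under stated assumptions, not the code in request.py and not the attached test script: the helper names (`fetch_then_close`, `demo`, `_Handler`) are hypothetical, and a throwaway local server stands in for a real site such as www.wsj.com. The point it shows is that the body is read while the connection is still open, and the connection is closed only afterwards.

```python
import http.client
import http.server
import threading


class _Handler(http.server.BaseHTTPRequestHandler):
    """Hypothetical stand-in server that returns a small fixed body."""

    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        # Keep the demo quiet; the default implementation logs to stderr.
        pass


def fetch_then_close(host, port, path="/"):
    """Read the whole body first, then close the connection."""
    conn = http.client.HTTPConnection(host, port, timeout=10)
    try:
        conn.request("GET", path)
        resp = conn.getresponse()
        # The read happens before conn.close() runs in the finally clause.
        return resp.status, resp.read()
    finally:
        conn.close()


def demo():
    """Start a local server on a free port, fetch once, and shut it down."""
    server = http.server.HTTPServer(("127.0.0.1", 0), _Handler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        return fetch_then_close("127.0.0.1", server.server_address[1])
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

urlopen cannot simply read everything up front like this, because it hands the response object back to the caller for later reading; that is why closing the connection inside do_open is the suspect step.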
Date | User | Action | Args
2011-07-24 05:12:20 | royliu | set | recipients: + royliu
2011-07-24 05:12:20 | royliu | set | messageid: <1311484340.86.0.612332840201.issue12628@psf.upfronthosting.co.za>
2011-07-24 05:12:20 | royliu | link | issue12628 messages
2011-07-24 05:12:19 | royliu | create |