Message153581
When accessing this URL, both urllib2 (Py2) and urllib.request (Py3) raise an IncompleteRead error.
http://info.kingcounty.gov/health/ehs/foodsafety/inspections/XmlRest.aspx?Zip_Code=98199
Previous discussions about similar errors suggest that this may be due to a problem with the server and chunked data transfer (see the links below). I don't understand exactly what that means. However, the same URL works fine with urllib (Py2), curl, wget, and every regular web browser I've tried. I would therefore have expected urllib2 (Py2) and urllib.request (Py3) to cope with it as well.
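For readers unfamiliar with the term: in HTTP/1.1 chunked transfer encoding, the body is sent as a series of chunks, each preceded by its size in hexadecimal, and the body ends with a zero-size chunk. The sketch below (Python 3) is only an illustration of that framing, not the httplib / http.client implementation. The names decode_chunked and IncompleteChunkedBody are hypothetical, and trailer headers after the final chunk are ignored. It shows how a stream that stops before the zero-size chunk can still carry all of the data while being reported as incomplete, which is one possible reading of the behavior described in the linked discussions.

```python
import io


class IncompleteChunkedBody(Exception):
    """Hypothetical error: the stream ended before the final zero-size chunk."""

    def __init__(self, partial):
        super().__init__("stream ended after %d bytes" % len(partial))
        self.partial = partial  # bytes decoded before the stream ran out


def decode_chunked(stream):
    """Decode a chunked body from a binary file-like object (sketch only)."""
    body = b""
    while True:
        size_line = stream.readline()
        if not size_line:
            # The peer closed the stream without sending the "0" chunk.
            raise IncompleteChunkedBody(body)
        # The size is hexadecimal; drop any optional ";extension" suffix.
        size = int(size_line.split(b";", 1)[0].strip().decode("ascii"), 16)
        if size == 0:
            return body  # terminating chunk: the body is complete
        chunk = stream.read(size)
        body += chunk
        if len(chunk) < size:
            raise IncompleteChunkedBody(body)
        stream.readline()  # consume the CRLF that follows each chunk


# A well-formed body ends with a zero-size chunk.
well_formed = io.BytesIO(b"5\r\nhello\r\n6\r\n world\r\n0\r\n\r\n")
print(decode_chunked(well_formed))

# The same data without the terminating chunk is reported as incomplete,
# even though every data byte arrived.
truncated = io.BytesIO(b"5\r\nhello\r\n6\r\n world\r\n")
try:
    decode_chunked(truncated)
except IncompleteChunkedBody as err:
    print("incomplete, but got:", err.partial)
```

A lenient client can choose to return the partial data in the second case, while a strict client raises an error.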
Versions I've tested with:
- Fails with urllib2 + Python 2.5.4, 2.6.1, 2.7.2 (Error messages vary.)
- Fails with urllib.request + Python 3.1.2, 3.2.2
- Succeeds with urllib + Python 2.5.4, 2.6.1, 2.7.2
- Succeeds with wget 1.11.1
- Succeeds with curl 7.15.5
___________________________________________________________
TEST CASES
# FAILS - Python 2.7, 2.6, 2.5
import urllib2
url = "http://info.kingcounty.gov/health/ehs/foodsafety/inspections/XmlRest.aspx?Zip_Code=98199"
xml_str = urllib2.urlopen(url).read() # Raises httplib.IncompleteRead
# FAILS - Python 3.2, 3.1
import urllib.request
url = "http://info.kingcounty.gov/health/ehs/foodsafety/inspections/XmlRest.aspx?Zip_Code=98199"
xml_str = urllib.request.urlopen(url).read() # Raises http.client.IncompleteRead
# SUCCEEDS - Python 2.7, 2.6, 2.5
import urllib
import xml.dom.minidom  # needed for the parseString() check below
url = "http://info.kingcounty.gov/health/ehs/foodsafety/inspections/XmlRest.aspx?Zip_Code=98199"
xml_str = urllib.urlopen(url).read()
dom = xml.dom.minidom.parseString(xml_str) # Verify XML is complete
print("urllib: %d bytes received and parsed successfully" % len(xml_str))
# SUCCEEDS - wget
wget -O- "http://info.kingcounty.gov/health/ehs/foodsafety/inspections/XmlRest.aspx?Zip_Code=98199" | wc
# SUCCEEDS - curl - prints an error, but returns the full data anyway
curl "http://info.kingcounty.gov/health/ehs/foodsafety/inspections/XmlRest.aspx?Zip_Code=98199" | wc
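A possible workaround, sketched for Python 3: http.client.IncompleteRead carries the bytes received so far in its partial attribute, so a caller can catch the exception and use that data. The helper names read_tolerant and fetch are hypothetical. Whether the partial data is actually the complete document for this server is an assumption, so the result should still be validated (for example by parsing the XML as in the urllib test case above).

```python
import http.client
import urllib.request


def read_tolerant(response):
    """Return the response body, falling back to partial data on IncompleteRead."""
    try:
        return response.read()
    except http.client.IncompleteRead as err:
        # err.partial holds whatever was read before the error was raised.
        # Assumption: for this server that is the full body; validate it.
        return err.partial


def fetch(url):
    """Open the URL and read it leniently (performs a network request)."""
    with urllib.request.urlopen(url) as response:
        return read_tolerant(response)
```

Usage would be xml_str = fetch(url) with the url from the test cases above. This only hides the symptom on the caller's side; it does not change how urllib.request itself handles the response.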
___________________________________________________________
RELATED DISCUSSIONS
http://www.gossamer-threads.com/lists/python/python/847985
http://bugs.python.org/issue11463 (closed)
http://bugs.python.org/issue6785 (closed)
http://bugs.python.org/issue6312 (closed)
Date | User | Action | Args
2012-02-17 17:36:17 | Alex Quinn | set | recipients: + Alex Quinn
2012-02-17 17:36:17 | Alex Quinn | set | messageid: <1329500177.48.0.42118882394.issue14044@psf.upfronthosting.co.za>
2012-02-17 17:36:16 | Alex Quinn | link | issue14044 messages
2012-02-17 17:36:16 | Alex Quinn | create |