classification
Title: urllib2 does not catch httplib.BadStatusLine
Type: behavior
Stage:
Components: Library (Lib)
Versions: Python 2.6
process
Status: closed
Resolution: duplicate
Dependencies:
Superseder: urllib.request.urlopen leaks exceptions from socket and httplib.client
View: 13736
Assigned To: orsenthil
Nosy List: adamnelson, groodt, martin.panter, orsenthil
Priority: normal
Keywords: patch

Created on 2010-05-26 13:44 by adamnelson, last changed 2015-11-16 22:09 by martin.panter. This issue is now closed.

Files
File name Uploaded Description
bad_status_urlerror.diff groodt, 2012-07-07 14:05
Messages (6)
msg106526 - Author: AdamN (adamnelson) Date: 2010-05-26 13:44
When urllib2 gets a bad status line from an HTTP server, httplib.BadStatusLine propagates to the caller:

  File "/var/www/pinax-env/pline/apps/page/models.py", line 303, in render
    content = urllib2.urlopen(self.url,timeout=10).read()
  File "/usr/lib/python2.6/urllib2.py", line 124, in urlopen
    return _opener.open(url, data, timeout)
  File "/usr/lib/python2.6/urllib2.py", line 389, in open
    response = self._open(req, data)
  File "/usr/lib/python2.6/urllib2.py", line 407, in _open
    '_open', req)
  File "/usr/lib/python2.6/urllib2.py", line 367, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.6/urllib2.py", line 1146, in http_open
    return self.do_open(httplib.HTTPConnection, req)
  File "/usr/lib/python2.6/urllib2.py", line 1119, in do_open
    r = h.getresponse()
  File "/usr/lib/python2.6/httplib.py", line 974, in getresponse
    response.begin()
  File "/usr/lib/python2.6/httplib.py", line 391, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python2.6/httplib.py", line 355, in _read_status
    raise BadStatusLine(line)
httplib.BadStatusLine

This is because the relevant try/except in urllib2.py (lines 1116-1120, in do_open) only catches socket.error:

try:
    h.request(req.get_method(), req.get_selector(), req.data, headers)
    r = h.getresponse()
except socket.error, err: # XXX what error?
    raise URLError(err)

The exception is not caught anywhere else, so the call fails unless I add a special except clause covering all httplib exceptions. The specific URL that triggered this is:

http://phoenix.untd.com/oasx/rqst/type=sx/rdb=8203014d740000555355533a415a2d2d2d2d2d2d2d2d2d2d0100001d494a0901000000000000/version=3/origin=uol/isp=et_cau
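
A minimal sketch of the caller-side workaround described above, written with the Python 3 module names (urllib.request, http.client) rather than urllib2/httplib. The helper name fetch_or_none and its opener parameter are hypothetical, introduced only for illustration; they are not part of the report.

```python
import http.client
import urllib.error
import urllib.request


def fetch_or_none(url, timeout=10, opener=urllib.request.urlopen):
    """Hypothetical helper: return the response body, or None on failure.

    `opener` is an illustrative injection point so the function can be
    exercised without touching the network.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return response.read()
    except (urllib.error.URLError, http.client.HTTPException):
        # URLError covers the failures urlopen already wraps; HTTPException
        # covers BadStatusLine and the other http.client errors that
        # escape unwrapped.
        return None
```

Returning None is just one possible policy; a real caller might log the exception or re-raise it as an application-specific error instead.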
msg106539 - Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2010-05-26 16:00
urllib2 currently catches socket.error exceptions and presents the reason as a custom URLError exception. To address this issue, the module should also catch the exceptions raised by httplib and present them the same way.
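
The behaviour proposed here could look roughly like the following, sketched as a wrapper around the opener rather than as a change to the library itself. It uses the Python 3 module names; urlopen_wrapping_http_errors and its opener parameter are hypothetical names, and this is not the attached patch.

```python
import http.client
import urllib.error
import urllib.request


def urlopen_wrapping_http_errors(url, timeout=10,
                                 opener=urllib.request.urlopen):
    """Hypothetical wrapper: report http.client errors as URLError."""
    try:
        return opener(url, timeout=timeout)
    except http.client.HTTPException as err:
        # Same shape as the socket.error handling: the original
        # exception becomes the URLError's reason.
        raise urllib.error.URLError(err) from err
```

With this in place a caller only needs `except urllib.error.URLError`, and can still inspect `exc.reason` to see the underlying BadStatusLine.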
msg164851 - Author: Greg Roodt (groodt) * Date: 2012-07-07 14:05
I've made a small change to urllib2 to catch httplib.BadStatusLine and re-raise it as URLError. This exception should rarely happen, as it means the server is returning invalid responses. Nevertheless, I've added a test and hopefully fixed the issue.

Patch is attached. I will check if this needs to be added to newer versions of Python.
msg165255 - Author: Greg Roodt (groodt) * Date: 2012-07-11 13:13
By the way, the issue can be recreated by running the following:

netcat -l -p 9999 -e "echo HTTP/1.1 1000 OK" &
python -c "import urllib2; urllib2.urlopen('http://localhost:9999')"

This happens on 2.6, 2.7 and 3 by the looks of it.
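
For environments without netcat, the same reproduction can be sketched in pure Python 3. The function names below are hypothetical, and the one-shot server is only a stand-in for the netcat command: it replies with the out-of-range status code 1000 and closes the connection.

```python
import http.client
import socket
import threading
import urllib.request


def serve_bad_status_once(status_line=b"HTTP/1.1 1000 OK\r\n\r\n"):
    """Start a one-shot local server and return the port it listens on."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    def handle():
        conn, _ = listener.accept()
        with conn:
            conn.recv(65536)           # read and ignore the request
            conn.sendall(status_line)  # status 1000 is outside 100-999
        listener.close()

    threading.Thread(target=handle, daemon=True).start()
    return port


def reproduce():
    """Return the exception raised while opening the URL, or None."""
    port = serve_bad_status_once()
    # An empty ProxyHandler keeps proxy environment variables out of the way.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        opener.open("http://127.0.0.1:%d/" % port, timeout=10).close()
    except Exception as exc:
        return exc
    return None
```

Calling `reproduce()` and checking the type of the returned exception shows whether a given interpreter lets http.client.BadStatusLine escape instead of wrapping it in URLError.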
msg240471 - Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-04-11 11:52
See also Issue 22797, about documenting that non-URLError exceptions may be raised.
msg254760 - Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-11-16 22:09
Issue 13736 also proposes to wrap HTTP client exceptions, although I personally don’t really endorse this.

The other option is to fix the documentation: Issue 25633, Issue 22797.
History
Date User Action Args
2015-11-16 22:09:16 martin.panter set status: open -> closed; superseder: urllib.request.urlopen leaks exceptions from socket and httplib.client; resolution: accepted -> duplicate; messages: + msg254760
2015-04-11 11:52:05 martin.panter set nosy: + martin.panter; messages: + msg240471
2012-07-11 13:13:42 groodt set messages: + msg165255
2012-07-07 14:05:58 groodt set files: + bad_status_urlerror.diff; nosy: + groodt; messages: + msg164851; keywords: + patch
2010-05-26 16:00:30 orsenthil set nosy: + orsenthil; messages: + msg106539; assignee: orsenthil; resolution: accepted
2010-05-26 13:44:47 adamnelson set components: + Library (Lib)
2010-05-26 13:44:34 adamnelson set versions: + Python 2.6
2010-05-26 13:44:09 adamnelson create