classification
Title: Unexpected ConnectionResetError in urllib.request against a valid website
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.4, Python 3.5, Python 2.7
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: Nosy List: Tymoteusz.Paul, martin.panter, ned.deily, orsenthil, r.david.murray
Priority: normal Keywords:

Created on 2014-07-01 12:18 by Tymoteusz.Paul, last changed 2015-01-12 01:22 by ned.deily. This issue is now closed.

Messages (4)
msg222024 - Author: Tymoteusz Paul (Tymoteusz.Paul) Date: 2014-07-01 12:18
I recently ran into a problem with urllib.request.urlopen: it fails against one website (the only one I've found so far). The website itself is working fine; I can access its content with other libraries and tools such as requests and curl, and outside of Python with telnet, links, and so on. But with urllib it fails:

Python 3.4.1 (default, Jul  1 2014, 14:08:25)
[GCC 4.7.3] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import urllib.request
>>> urllib.request.urlopen("http://www.thomsonlocal.com/")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python3.4/urllib/request.py", line 153, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib64/python3.4/urllib/request.py", line 455, in open
    response = self._open(req, data)
  File "/usr/lib64/python3.4/urllib/request.py", line 473, in _open
    '_open', req)
  File "/usr/lib64/python3.4/urllib/request.py", line 433, in _call_chain
    result = func(*args)
  File "/usr/lib64/python3.4/urllib/request.py", line 1215, in http_open
    return self.do_open(http.client.HTTPConnection, req)
  File "/usr/lib64/python3.4/urllib/request.py", line 1194, in do_open
    r = h.getresponse()
  File "/usr/lib64/python3.4/http/client.py", line 1172, in getresponse
    response.begin()
  File "/usr/lib64/python3.4/http/client.py", line 351, in begin
    version, status, reason = self._read_status()
  File "/usr/lib64/python3.4/http/client.py", line 313, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
  File "/usr/lib64/python3.4/socket.py", line 371, in readinto
    return self._sock.recv_into(b)
ConnectionResetError: [Errno 104] Connection reset by peer

I've tested it on about 6 different servers in different parts of the world, and all of them seem to be affected. I've tested with 3.2.5, 3.3.3, and 3.4.1, and they all fail with the same trace.
msg222151 - Author: Ned Deily (ned.deily) * (Python committer) Date: 2014-07-03 06:12
It fails with Python 2's urllib2.urlopen as well.
msg233867 - Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-01-12 01:12
Not a Python bug. The web site seems to be doing this based on the user agent; if you change it, it works:

from urllib.request import Request, urlopen
urlopen(Request("http://www.thomsonlocal.com/", headers={"User-Agent": "https://bugs.python.org/issue21896"}))
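
The workaround above can be sketched a little more fully. This is only a minimal illustration, assuming the server really does reset connections based on the User-Agent header; the names `CUSTOM_USER_AGENT`, `build_request`, and `fetch`, and the value "example-client/1.0", are hypothetical and not part of urllib.

```python
import urllib.request

# Hypothetical identifier: any value other than urllib's default may work,
# assuming the server filters on the User-Agent header.
CUSTOM_USER_AGENT = "example-client/1.0"


def build_request(url, user_agent=CUSTOM_USER_AGENT):
    """Return a Request that carries an explicit User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(url, user_agent=CUSTOM_USER_AGENT, timeout=10):
    """Fetch url and return the body as bytes.

    Returns None if the peer resets the connection while the response
    is being read, as in the traceback reported in this issue.
    """
    req = build_request(url, user_agent)
    try:
        with urllib.request.urlopen(req, timeout=timeout) as response:
            return response.read()
    except ConnectionResetError:
        return None
```

Calling `fetch("http://www.thomsonlocal.com/")` would then send the custom header instead of the default one. Whether the site still behaves this way is not something this sketch can confirm.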
msg233869 - Author: Ned Deily (ned.deily) * (Python committer) Date: 2015-01-12 01:22
So it does.  Thanks, Martin.
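
For reference, the default User-Agent that urllib sends can be inspected without making any network request, and it can be replaced for every request made through an opener. The following is a rough sketch; `make_opener` is a hypothetical helper name, and the assumption that a non-default value avoids the reset comes only from the diagnosis above.

```python
import urllib.request

# A freshly built opener carries urllib's default headers,
# including a User-agent of the form "Python-urllib/<major>.<minor>".
default_headers = dict(urllib.request.build_opener().addheaders)
default_agent = default_headers["User-agent"]


def make_opener(user_agent):
    """Return an opener that sends user_agent on every request (hypothetical helper)."""
    opener = urllib.request.build_opener()
    opener.addheaders = [("User-agent", user_agent)]
    return opener
```

An opener built this way can be used directly via `make_opener(...).open(url)`, or installed globally with `urllib.request.install_opener()` so that plain `urlopen()` calls pick it up.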
History
Date | User | Action | Args
2015-01-12 01:22:36 | ned.deily | set | status: open -> closed; resolution: not a bug; messages: + msg233869; stage: resolved
2015-01-12 01:12:35 | martin.panter | set | type: crash -> behavior; messages: + msg233867; nosy: + martin.panter
2014-07-03 06:12:56 | ned.deily | set | nosy: + orsenthil, ned.deily; messages: + msg222151; versions: + Python 2.7, Python 3.5, - Python 3.2, Python 3.3
2014-07-02 08:01:35 | Tymoteusz.Paul | set | versions: + Python 3.2
2014-07-01 14:26:56 | r.david.murray | set | nosy: + r.david.murray
2014-07-01 12:18:25 | Tymoteusz.Paul | create |