ResourceWarning in urllib.request #56342
In case of error (e.g. timeout error), urllib.request leaves the socket open:
import urllib.request as ur
import socket
s = socket.socket()
s.bind(('localhost', 10000))
s.listen(0)
socket.setdefaulttimeout(5)
ur.urlopen('http://localhost.localdomain:10000')
outputs:
sys:1: ResourceWarning: unclosed <socket.socket object, fd=4, family=2, type=1, proto=6>
Traceback (most recent call last):
File "/home/wolf/dev/py/py3k/Lib/urllib/request.py", line 1146, in do_open
r = h.getresponse() # an HTTPResponse instance
File "/home/wolf/dev/py/py3k/Lib/http/client.py", line 1046, in getresponse
response.begin()
File "/home/wolf/dev/py/py3k/Lib/http/client.py", line 346, in begin
version, status, reason = self._read_status()
File "/home/wolf/dev/py/py3k/Lib/http/client.py", line 308, in _read_status
line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
File "/home/wolf/dev/py/py3k/Lib/socket.py", line 279, in readinto
return self._sock.recv_into(b)
socket.timeout: timed out
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/wolf/dev/py/py3k/Lib/urllib/request.py", line 138, in urlopen
return opener.open(url, data, timeout)
File "/home/wolf/dev/py/py3k/Lib/urllib/request.py", line 369, in open
response = self._open(req, data)
File "/home/wolf/dev/py/py3k/Lib/urllib/request.py", line 387, in _open
'_open', req)
File "/home/wolf/dev/py/py3k/Lib/urllib/request.py", line 347, in _call_chain
result = func(*args)
File "/home/wolf/dev/py/py3k/Lib/urllib/request.py", line 1163, in http_open
return self.do_open(http.client.HTTPConnection, req)
File "/home/wolf/dev/py/py3k/Lib/urllib/request.py", line 1148, in do_open
raise URLError(err)
urllib.error.URLError: <urlopen error timed out>
>>>
AFAIU, when urlopen returns or raises, the socket can be closed, so the attached patch adds a "finally" that calls close() on the HTTPConnection object. The test suite passes (except for a mock that was missing the close method), but I'm not 100% sure that it's always safe to call close(). This ResourceWarning has been exposed by test_packaging. |
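For readers who want to see the idea outside of request.py, here is a minimal sketch of the "close in a finally" approach using http.client directly. It is not the attached patch: the function name `request_with_cleanup` and the helper `silent_server` are made up for illustration, and the server simply never answers so that the read times out.

```python
import http.client
import socket


def silent_server():
    """Hypothetical helper: listen on an ephemeral localhost port, never answer."""
    srv = socket.socket()
    srv.bind(("127.0.0.1", 0))
    srv.listen(1)
    return srv


def request_with_cleanup(host, port, timeout=0.5):
    """Send a GET request and always close the connection afterwards.

    Returns (status_or_error, conn) so the caller can inspect conn.sock.
    """
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", "/")
        response = conn.getresponse()
        return response.status, conn
    except OSError as err:
        # Timeouts and connection errors are OSError subclasses on Python 3.3+.
        return err, conn
    finally:
        # Runs on success and on failure, so the socket never lingers.
        conn.close()
```

After the call, `conn.sock` should be `None` whether the request succeeded or timed out, which is the property the ResourceWarning was complaining about.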
Hi Ezio, the connection can be closed via the finally call as you do in the patch. There are times when the request object is re-used, but that happens before the connection is made. It may also help to understand how the code in packaging was invoking it. If you run the whole suite and see that nothing breaks (to ensure that nothing is waiting for the socket and trying to close it later), go ahead with the patch. |
The packaging test (test_pypi_simple.py:test_uses_mirrors) creates a server and a mirror, starts the mirror only, tries to connect to the server, and then falls back on the mirror when the server raises a timeout error. I haven't checked in detail the urllib tests, but the fact that there are no ResourceWarnings while running test_urllib might mean that this case isn't currently tested. |
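A rough sketch of how the untested case could be checked: point an opener at a local socket that accepts connections but never replies, let the read time out, and record any ResourceWarning emitted while the connection objects are collected. The function name `resource_warnings_on_timeout` is hypothetical and this is not a test from the CPython suite; proxies are disabled explicitly so the request really goes to the local socket.

```python
import gc
import socket
import urllib.request
import warnings


def resource_warnings_on_timeout(timeout=0.5):
    """Return the ResourceWarnings recorded while a request times out."""
    srv = socket.socket()
    srv.bind(("127.0.0.1", 0))
    srv.listen(1)  # connections are queued but never answered
    url = "http://127.0.0.1:%d/" % srv.getsockname()[1]
    # An empty ProxyHandler mapping bypasses any proxy set in the environment.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with warnings.catch_warnings(record=True) as caught:
            warnings.simplefilter("always", ResourceWarning)
            try:
                opener.open(url, timeout=timeout)
            except OSError:
                # URLError and socket timeouts are both OSError subclasses.
                pass
            # Force collection so an unclosed socket would warn here.
            gc.collect()
    finally:
        srv.close()
    return [w for w in caught if issubclass(w.category, ResourceWarning)]
```

On an interpreter where the connection is closed on error, the returned list should be empty; on an unpatched one, it would be expected to contain the "unclosed socket" warning from the report.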
Oh, I wrote a similar patch to kill the ResourceWarning of test_pypi_simple (except that I didn't patch test_urllib2). |
New changeset ad6bdfd7dd4b by Victor Stinner in branch '3.2':
New changeset 57a98feb508e by Victor Stinner in branch 'default': |
New changeset 18e6ccc332d5 by Victor Stinner in branch '2.7': |
I tested the patch on Python 3.3: the full test suite passes on Linux. I applied your patch to Python 2.7, 3.2 and 3.3, thanks Ezio. |
Yay! |
This patch has introduced some problems for me with Python 3.2.1 (64-bit Arch Linux). The following code:
with urllib.request.urlopen(url) as page:
    pass
raises a "ValueError: I/O operation on closed file." exception when url is "http://www.imdb.com/". When I removed "h.close()" (added by this patch) from request.py everything worked as expected. Interestingly, other URLs work flawlessly with the patched code ("http://www.google.com/" for example). I had no time to further investigate the differences between the HTTP responses of "good" and "bad" sites... and I am by no means an HTTP expert :) Should I open a new bug report for this one or is it OK to just leave this comment here? |
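One way to avoid closing a connection that a successful response may still need is to close it only when the request fails. The sketch below illustrates that pattern with http.client; it is an assumption about a possible approach, not a statement of what was committed for this issue or for the follow-up report, and `open_response` is a made-up name. The "Connection: close" header mirrors what urllib.request sends, so that on success the socket is handed over to the response object.

```python
import http.client


def open_response(host, port, path="/", timeout=5):
    """Return an HTTPResponse; close the connection only if the request fails."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        # With "Connection: close", http.client passes the socket to the
        # response, which releases it when the response is closed.
        conn.request("GET", path, headers={"Connection": "close"})
        return conn.getresponse()
    except BaseException:
        # Error path only: nothing else will use this connection.
        conn.close()
        raise
```

The caller is then responsible for reading and closing the returned response, for example with `with open_response(host, port) as resp: body = resp.read()`.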
I think it's better to open a new issue, thank you. |
I'm reopening the issue. |
(Oh, I missed Antoine's comment, yes, reopen a new issue) |
Sorry, I've forgotten to post a reference to the new bug: bpo-12576 |