classification
Title: ResourceWarning when urlopen() forgets the HTTPConnection object
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.5, Python 3.4
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: serhiy.storchaka Nosy List: barry, martin.panter, orsenthil, pitrou, python-dev, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2013-11-08 03:45 by martin.panter, last changed 2016-05-16 18:26 by serhiy.storchaka. This issue is now closed.

Files
File name Uploaded Description Edit
urllib-close-test.diff martin.panter, 2013-12-06 12:50
urlopen-sock-close.diff martin.panter, 2013-12-06 12:50
issue19524.diff serhiy.storchaka, 2013-12-18 12:25 review
test+except-close.patch martin.panter, 2014-01-05 13:11 Tests + close if exception, on top of existing fix #12692
test2.patch martin.panter, 2014-07-17 06:47 Test for close after valid & invalid responses review
close3.4.patch martin.panter, 2014-07-17 06:58 Close if exception, for 3.4
Messages (16)
msg202404 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2013-11-08 03:45
The AbstractHTTPHandler.do_open() method creates a HTTPConnection object but does not save it anywhere. This means a ResourceWarning is eventually triggered, at least when the HTTP server leaves the the connection open.

Demonstration code:

from urllib.request import urlopen
with urlopen("http://localhost") as response: response.read()

When I used the above code, the warning did not trigger until I forced a garbage collection:

import gc; gc.collect()

Output:

__main__:1: ResourceWarning: unclosed <socket.socket object, fd=3, family=2, type=1, proto=6>

Alternatively, you can add a line to the bottom of the do_open() method:

        r.msg = r.reason
        del h; import gc; gc.collect()  # Add this to force warning
        return r

When the warning happens without forced garbage collection, it tends to happen here:

/usr/lib/python3.3/socket.py:370: ResourceWarning: unclosed <socket.socket object, fd=3, family=2, type=1, proto=6>
  self._sock = None

I tested by using the “socat” CLI program and supplying a chunked HTTP response, without immediately closing the connection at the server end. Using the Content-length header also seems to trigger the issue.

$ sudo socat -d TCP-LISTEN:www,reuseaddr,crnl READLINE
GET / HTTP/1.1
Accept-Encoding: identity
Host: localhost
Connection: close
User-Agent: Python-urllib/3.3

<HTTP response start>
HTTP/1.1 200 OK
Transfer-encoding: chunked

0

<HTTP response end>

If the server leaves the connection open, it only seems to get closed when Python garbage-collects the socket and closes it. Perhaps the connection should be explicitly closed when the urlopen() response object is closed. But I guess that would require wrapping the HTTPResponse object to add to the close behaviour.
msg205366 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2013-12-06 12:50
Here are two patches: a test case, and a fix for my issue. They were done against an installed version of Python 3.3.3.

I’m not entirely happy with the fix because it is accessing the private HTTPConnection.sock attribute from the urllib.request module, which seems a bit hacky. Other ideas would be to let HTTPConnection grow a new public method (or just a flag?) for cleaning itself up while leaving the response object still open. Or some kind of wrapper to augment the HTTPResponse.close() method to explicitly close the connection as well, like I originally mentioned.

Also my test case borrows some classes from the “test_urllib” module into the “test_urllib2”. This is the first time I’ve looked closely at the Python test suite; maybe there is a better to do that as well.

A bonus from my fix: it looks like it resolves a couple warnings running “test_urllib2net”.
msg206481 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-12-17 20:56
I can't reproduce issue. Python just hangs on read() when socat is used as test server. Perhaps I do something wrong.
msg206491 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2013-12-18 02:32
Thanks for looking at this. Perhaps you weren’t pasting the HTTP response into “socat”. After the six request lines are printed out, I enter the five lines between <HTTP response start> and <HTTP response end>; I didn’t really make this obvious. Otherwise, urlopen() hangs waiting for the response and read() never even gets called.

Here’s a Python-only HTTP server to demonstrate without needing “socat”. Run this in one terminal:

from http.server import HTTPServer, BaseHTTPRequestHandler
class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"
    def do_GET(self):
        self.send_response(200)
        self.send_header("Transfer-Encoding", "chunked")
        self.end_headers()
        self.wfile.write(b"0\r\n\r\n")
HTTPServer(("", 8000), Handler).serve_forever()

. . . and in another terminal:

from urllib.request import urlopen
with urlopen("http://localhost:8000") as response:
    response.read()
import gc; gc.collect()
msg206508 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-12-18 12:25
Thank you Martin for clarification. Now I see the problem.

Here is regenerated for Rietveld patch.

Perhaps more safe will be not close socket, but only set the _closed attribute so that it will be closed just after closing SocketIO.

        if h.sock:
            h.sock._closed = True
            h.sock = None

But this is even more tricky.

Yet one approach is to implement __del__() method in the socket.socket class. But this breaks existing tests.
msg206662 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2013-12-20 03:03
How is it safer to manually set “h.sock._closed”? Playing with the internals of HTTPConnection is one thing, but playing with the internals of the socket object as well does not seem necessary.

Also the ResourceWarning is warning that the socket and connection were closed by the garbage collector at some arbitrary point. I don’t think a new __del__() method is going to help. Sorry to be so negative :)

Related issues:
Issue 18144: FD leak in urllib2 (probably an exact dupe)
Issue 11563: test_urllibnet is triggering a ResourceWarning (bug closed, but real issue was only side-stepped IMO)
msg207056 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2013-12-28 22:36
Just discovered the same fix of manually closing the socket object was already made independently of my patch in the “default” branch! See Issue 12692.

http://hg.python.org/cpython/rev/92656b5df2f2

The main difference is my patch should also close the connection if HTTPConnection.getresponse() fails, which could be of some value. And the regression test could still be useful.
msg207373 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2014-01-05 13:11
The Issue 12692 fix has been backported to the 3.3 branch, and it fixes this bug. However here is an updated patch (against revision 28337a8fb502 in the “3.3” branch) consisting of two left over bits you might still want to use:

1. My test case

2. Explicitly closing the HTTP connection when an invalid response is received. E.g. urlopen("http://google.com:443") would still leak the connection without this.
msg223257 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-07-16 19:33
LGTM. Do you want to write a test for the case when an invalid response is received?
msg223316 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2014-07-17 06:58
Added a new test for the invalid response case. Now both tests are included in test2.patch.

I separated the actual fix into a separate close3.4.patch (refreshed for the 3.4 branch). This way it is easier for me to make sure the tests work before applying the fix, but it should be easy to combine the patches again if you prefer it that way.
msg226502 - (view) Author: Roundup Robot (python-dev) Date: 2014-09-06 18:46
New changeset c1fb19907cc4 by Serhiy Storchaka in branch '3.4':
Issue #19524: Fixed resource leak in the HTTP connection when an invalid
http://hg.python.org/cpython/rev/c1fb19907cc4

New changeset 43bf95480c3c by Serhiy Storchaka in branch 'default':
Issue #19524: Fixed resource leak in the HTTP connection when an invalid
http://hg.python.org/cpython/rev/43bf95480c3c
msg226503 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-09-06 18:57
Thank you for your contribution Martin.
msg265681 - (view) Author: Roundup Robot (python-dev) Date: 2016-05-16 08:15
New changeset 19e4e0b7f1bd by Serhiy Storchaka in branch '2.7':
Issue #19524: Port fakehttp() from Py3 c1fb19907cc4 for use in test_urllib2
https://hg.python.org/cpython/rev/19e4e0b7f1bd
msg265687 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-05-16 08:51
Oh, Roundup Robot reports about a commit made almost 2 years ago!
msg265689 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-05-16 09:40
Sorry that was actually made by me earlier today. I grafted it from your commit, but I always forget to reset the author and date.
msg265715 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-05-16 18:26
Interesting. I didn't know that "hg graft" keeps the author and data of the original commit. Maybe this is a Mercurial bug.
History
Date User Action Args
2016-05-16 18:26:27serhiy.storchakasetmessages: + msg265715
2016-05-16 09:40:20martin.pantersetmessages: + msg265689
2016-05-16 08:51:40serhiy.storchakasetmessages: + msg265687
2016-05-16 08:15:08python-devsetmessages: + msg265681
2014-09-06 18:57:39serhiy.storchakasetstatus: open -> closed
type: behavior
messages: + msg226503

resolution: fixed
stage: patch review -> resolved
2014-09-06 18:46:06python-devsetnosy: + python-dev
messages: + msg226502
2014-07-17 06:58:20martin.pantersetfiles: + close3.4.patch

messages: + msg223316
2014-07-17 06:47:28martin.pantersetfiles: + test2.patch
2014-07-16 19:33:57serhiy.storchakasetassignee: serhiy.storchaka
2014-07-16 19:33:18serhiy.storchakasetmessages: + msg223257
versions: + Python 3.5, - Python 3.3
2014-01-05 13:11:34martin.pantersetfiles: + test+except-close.patch

messages: + msg207373
2013-12-28 22:36:02martin.pantersetmessages: + msg207056
2013-12-21 21:30:00serhiy.storchakasetnosy: + pitrou
2013-12-20 03:03:22martin.pantersetmessages: + msg206662
2013-12-18 12:25:44serhiy.storchakasetfiles: + issue19524.diff

messages: + msg206508
2013-12-18 02:32:01martin.pantersetmessages: + msg206491
2013-12-17 20:56:27serhiy.storchakasetversions: + Python 3.4, - Python 3.2
nosy: + orsenthil, serhiy.storchaka

messages: + msg206481

stage: patch review
2013-12-06 12:50:24martin.pantersetfiles: + urlopen-sock-close.diff
2013-12-06 12:50:08martin.pantersetfiles: + urllib-close-test.diff
keywords: + patch
messages: + msg205366
2013-11-08 12:58:10barrysetnosy: + barry
2013-11-08 03:45:17martin.pantercreate