Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ResourceWarning in urllib.request #56342

Closed
ezio-melotti opened this issue May 21, 2011 · 13 comments
Closed

ResourceWarning in urllib.request #56342

ezio-melotti opened this issue May 21, 2011 · 13 comments
Labels
performance Performance or resource usage stdlib Python modules in the Lib dir

Comments

@ezio-melotti
Copy link
Member

BPO 12133
Nosy @orsenthil, @pitrou, @vstinner, @ezio-melotti, @merwok, @sorcio
Files
  • issue12133.diff: Patch against 3.3 to fix the resource warning.
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2011-07-18.07:41:53.227>
    created_at = <Date 2011-05-21.12:42:35.698>
    labels = ['library', 'performance']
    title = 'ResourceWarning in urllib.request'
    updated_at = <Date 2011-07-18.08:31:54.098>
    user = 'https://github.com/ezio-melotti'

    bugs.python.org fields:

    activity = <Date 2011-07-18.08:31:54.098>
    actor = 'daniel.ugra'
    assignee = 'none'
    closed = True
    closed_date = <Date 2011-07-18.07:41:53.227>
    closer = 'vstinner'
    components = ['Library (Lib)']
    creation = <Date 2011-05-21.12:42:35.698>
    creator = 'ezio.melotti'
    dependencies = []
    files = ['22047']
    hgrepos = []
    issue_num = 12133
    keywords = ['patch', 'needs review']
    message_count = 13.0
    messages = ['136433', '136436', '136437', '138393', '138500', '138501', '138502', '138532', '140508', '140509', '140565', '140566', '140568']
    nosy_count = 11.0
    nosy_names = ['jhylton', 'orsenthil', 'pitrou', 'vstinner', 'nadeem.vawda', 'ezio.melotti', 'eric.araujo', 'daniel.ugra', 'alexis', 'davide.rizzo', 'python-dev']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'patch review'
    status = 'closed'
    superseder = None
    type = 'resource usage'
    url = 'https://bugs.python.org/issue12133'
    versions = ['Python 2.7', 'Python 3.2', 'Python 3.3']

    @ezio-melotti
    Copy link
    Member Author

    In case of error (e.g. timeout error), urllib.request leaves the socket open:

    import urllib.request as ur
    import socket
    s = socket.socket()
    s.bind(('localhost', 10000))
    s.listen(0)
    socket.setdefaulttimeout(5)
    ur.urlopen('http://localhost.localdomain:10000')

    outputs:

    sys:1: ResourceWarning: unclosed <socket.socket object, fd=4, family=2, type=1, proto=6>

    Traceback (most recent call last):
      File "/home/wolf/dev/py/py3k/Lib/urllib/request.py", line 1146, in do_open
        r = h.getresponse()  # an HTTPResponse instance
      File "/home/wolf/dev/py/py3k/Lib/http/client.py", line 1046, in getresponse
        response.begin()
      File "/home/wolf/dev/py/py3k/Lib/http/client.py", line 346, in begin
        version, status, reason = self._read_status()
      File "/home/wolf/dev/py/py3k/Lib/http/client.py", line 308, in _read_status
        line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
      File "/home/wolf/dev/py/py3k/Lib/socket.py", line 279, in readinto
        return self._sock.recv_into(b)
    socket.timeout: timed out
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/home/wolf/dev/py/py3k/Lib/urllib/request.py", line 138, in urlopen
        return opener.open(url, data, timeout)
      File "/home/wolf/dev/py/py3k/Lib/urllib/request.py", line 369, in open
        response = self._open(req, data)
      File "/home/wolf/dev/py/py3k/Lib/urllib/request.py", line 387, in _open
        '_open', req)
      File "/home/wolf/dev/py/py3k/Lib/urllib/request.py", line 347, in _call_chain
        result = func(*args)
      File "/home/wolf/dev/py/py3k/Lib/urllib/request.py", line 1163, in http_open
        return self.do_open(http.client.HTTPConnection, req)
      File "/home/wolf/dev/py/py3k/Lib/urllib/request.py", line 1148, in do_open
        raise URLError(err)
    urllib.error.URLError: <urlopen error timed out>
    >>>

    AFAIU, when urlopen returns or raises, the socket can be closed, so the attached patch adds a "finally" that calls close() on the HTTPConnection object. The test suite pass (except for a mock that was missing the close method), but I'm not 100% sure that it's always safe to call close().

    This ResourceWarning has been exposed by test_packaging.

    @ezio-melotti ezio-melotti added stdlib Python modules in the Lib dir performance Performance or resource usage labels May 21, 2011
    @orsenthil
    Copy link
    Member

    Hi Ezio, the connection can be closed via the finally call as you do in the patch. There are times when request object is re-used, but before the connection is made. It may also help to understand how the code in the packaging was invoking it.

    If you run the whole suite and see the nothing breaks (to ensure that something is not waiting for the socket and trying to close later), go ahead with the patch.

    @ezio-melotti
    Copy link
    Member Author

    The packaging test (test_pypi_simple.py:test_uses_mirrors) creates a server and a mirror, starts the mirror only, tries to connect to the server, and then falls back on the mirror when the server raises a timeout error.
    The code in the first message does more or less the same, causing a timeout error and the resource warning.

    I haven't checked in detail the urllib tests, but the fact that there are no ResourceWarnings while running test_urllib might mean that this case isn't currently tested.

    @vstinner
    Copy link
    Member

    Oh, I wrote a similar patch to kill the ResourceWarning of test_pypi_simple (except that I didn't patch test_urllib2).

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Jun 17, 2011

    New changeset ad6bdfd7dd4b by Victor Stinner in branch '3.2':
    Issue bpo-12133: fix a ResourceWarning in urllib.request
    http://hg.python.org/cpython/rev/ad6bdfd7dd4b

    New changeset 57a98feb508e by Victor Stinner in branch 'default':
    (Merge 3.2) Issue bpo-12133: fix a ResourceWarning in urllib.request
    http://hg.python.org/cpython/rev/57a98feb508e

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Jun 17, 2011

    New changeset 18e6ccc332d5 by Victor Stinner in branch '2.7':
    Issue bpo-12133: AbstractHTTPHandler.do_open() of urllib.request closes the HTTP
    http://hg.python.org/cpython/rev/18e6ccc332d5

    @vstinner
    Copy link
    Member

    I tested the patch on Python 3.3: the full test suite pass on Linux. I applied your patch on Python 2.7, 3.2 and 3.3, thanks Ezio.

    @merwok
    Copy link
    Member

    merwok commented Jun 17, 2011

    Yay!

    @danielugra
    Copy link
    Mannequin

    danielugra mannequin commented Jul 16, 2011

    This patch has introduced some problems for me with Python 3.2.1 (64-bit Arch Linux).

    The following code:

    with urllib.request.urlopen(url) as page:
        pass

    raises "ValueError: I/O operation on closed file." exception when url is "http://www.imdb.com/".

    When I removed "h.close()" (added by this patch) from request.py everything worked as expected.

    Interestingly other URLs work flawlessly with patched code ("http://www.google.com/" for example).

    I had no time to further investigate the differences between HTTP responses of "good" and "bad" sites... and I am by no means an HTTP expert :)

    Should I open a new bug report for this one or is it OK to just leave this comment here?

    @pitrou
    Copy link
    Member

    pitrou commented Jul 16, 2011

    Should I open a new bug report for this one or is it OK to just leave
    this comment here?

    I think it's better to open a new issue, thank you.

    @vstinner
    Copy link
    Member

    I reopen the issue.

    @vstinner vstinner reopened this Jul 18, 2011
    @vstinner
    Copy link
    Member

    (Oh, I missed Antoine's comment, yes, reopen a new issue)

    @danielugra
    Copy link
    Mannequin

    danielugra mannequin commented Jul 18, 2011

    Sorry, I've forgotten to post a reference to the new bug: bpo-12576

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    performance Performance or resource usage stdlib Python modules in the Lib dir
    Projects
    None yet
    Development

    No branches or pull requests

    5 participants