Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Raise errors from socket.close() #70872

Closed
vadmium opened this issue Apr 1, 2016 · 20 comments
Closed

Raise errors from socket.close() #70872

vadmium opened this issue Apr 1, 2016 · 20 comments
Labels
extension-modules C modules in the Modules dir type-feature A feature request or enhancement

Comments

@vadmium
Copy link
Member

vadmium commented Apr 1, 2016

BPO 26685
Nosy @vstinner, @vadmium, @1st1
PRs
  • [3.6] bpo-30319: socket.close() now ignores ECONNRESET (#2565) #2566
  • Dependencies
  • bpo-21069: test_fileno of test_urllibnet intermittently fails
  • Files
  • socket.close.patch
  • finalize-exception.patch: Report finalization exceptions, unfinished
  • socket.close.v2.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2016-10-19.15:54:39.710>
    created_at = <Date 2016-04-01.11:16:28.291>
    labels = ['extension-modules', 'type-feature']
    title = 'Raise errors from socket.close()'
    updated_at = <Date 2017-07-04.14:46:12.202>
    user = 'https://github.com/vadmium'

    bugs.python.org fields:

    activity = <Date 2017-07-04.14:46:12.202>
    actor = 'vstinner'
    assignee = 'none'
    closed = True
    closed_date = <Date 2016-10-19.15:54:39.710>
    closer = 'yselivanov'
    components = ['Extension Modules']
    creation = <Date 2016-04-01.11:16:28.291>
    creator = 'martin.panter'
    dependencies = ['21069']
    files = ['42342', '42360', '42361']
    hgrepos = []
    issue_num = 26685
    keywords = ['patch']
    message_count = 20.0
    messages = ['262735', '262737', '262838', '262839', '263158', '263161', '278946', '278947', '278948', '278949', '278950', '278951', '278952', '278953', '278983', '278984', '278985', '278991', '297656', '297658']
    nosy_count = 4.0
    nosy_names = ['vstinner', 'python-dev', 'martin.panter', 'yselivanov']
    pr_nums = ['2566']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue26685'
    versions = ['Python 3.6']

    @vadmium
    Copy link
    Member Author

    vadmium commented Apr 1, 2016

    While looking at a recent failure of test_fileno() in test_urllibnet, I discovered that socket.close() ignores the return value of the close() system call. It looks like it has always been this way: <https://docs.python.org/3/library/socket.html#socket.socket.fileno\>.

    On the other hand, both FileIO.close() and os.close() raise an exception if the underlying close() call fails. So I propose to make socket.close() also raise an exception for any failure indicated by the underlying close() call. The benefit is that a programming error causing EBADF would be more easily noticed.

    @vadmium vadmium added extension-modules C modules in the Modules dir type-feature A feature request or enhancement labels Apr 1, 2016
    @vstinner
    Copy link
    Member

    vstinner commented Apr 1, 2016

    I like the idea :-)

    @vadmium
    Copy link
    Member Author

    vadmium commented Apr 4, 2016

    Victor suggested making these errors be reported by the finalize method (garbage collection). I started investigating this, but faced various test suite failures (test_io, test_curses), as well as many new unhandled exceptions printed out during the tests which do not actually trigger failures. Maybe this could become a worthwhile change, but it needs more investigation and wasn’t my original intention. Finalize-exception.patch is my patch so far.

    There are many comments over the place that make me uneasy about this change, so I think it should be investigated separately. E.g. Modules/_io/iobase.c:

    /* Silencing I/O errors is bad, but printing spurious tracebacks is
    equally as bad, and potentially more frequent (because of
    shutdown issues). */

    Lib/_pyio.py:

    # The try/except block is in case this is called at program
    # exit time, when it's possible that globals have already been
    # deleted, and then the close() call might fail. Since
    # there's nothing we can do about such failures and they annoy
    # the end users, we suppress the traceback.

    @vadmium
    Copy link
    Member Author

    vadmium commented Apr 4, 2016

    Here is socket.close.v2.patch which hopefully fixes the test for Windows (still untested by me though)

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Apr 11, 2016

    New changeset bd665613ed67 by Martin Panter in branch 'default':
    Issue bpo-26685: Raise OSError if closing a socket fails
    https://hg.python.org/cpython/rev/bd665613ed67

    @vadmium
    Copy link
    Member Author

    vadmium commented Apr 11, 2016

    I tweaked the test again for Windows, anticipating that it will raise ENOTSOCK rather than EBADF.

    @vadmium vadmium closed this as completed Apr 11, 2016
    @1st1
    Copy link
    Member

    1st1 commented Oct 18, 2016

    I think this patch should be reverted. It breaks backwards compatibility:

        import socket
    
        sock0 = socket.socket(socket.AF_INET, socket.SOCK_STREAM, 0)
        sock = socket.socket(
            socket.AF_INET, socket.SOCK_STREAM, 0, sock0.fileno())
        sock0.close()
    
        sock.close()

    This code behaves differently on 3.5 vs 3.6.

    In uvloop (https://github.com/magicstack/uvloop) I create Python sockets for libuv socket handles. Sometimes, those handles are closed by libuv, and I have no control over that. So right now, a lot of uvloop tests fail because socket.close() now can raise an error.

    It also doesn't make any sense to raise it anyways. socket.close() *must* be safe to call even when the socket is already closed.

    @1st1 1st1 reopened this Oct 18, 2016
    @1st1
    Copy link
    Member

    1st1 commented Oct 18, 2016

    IOW, what is happening in uvloop:

    1. A user has a Python socket object and passes it asyncio (run with uvloop) API.

    2. uvloop extracts the FD of the socket, and passes it to low-level libuv.

    3. At some point of time, libuv closes the FD.

    4. When user tries to close the socket, it silently fails in 3.5, and raises an exception in 3.6.

    One way to solve this would be to use os.dup() in uvloop (i.e. pass a duplicated FD to libuv). But that would mean that programs that use uvloop will consume file descriptors (limited resource) faster than programs that use vanilla asyncio.

    Currently, there's a documented way of creating a socket out of an existing FD (fourth argument to the socket's constructor). This means that the FD might be controlled by someone. At least in this case, sock.close() must not raise if it fails. It's OK if the FD is already close, because the Python socket was only wrapping it in the first place.

    @vadmium
    Copy link
    Member Author

    vadmium commented Oct 18, 2016

    I think your code example is not very robust, because the “sock” object refers to a freed file descriptor, and could easily close an unrelated file:

    $ python3.5 -q
    >>> import socket
    >>> sock0 = socket.socket()
    >>> sock = socket.socket(fileno=sock0.fileno())
    >>> sock0.close()
    >>> f = open("/dev/null")  # Unrelated code/thread opens a file
    >>> sock.close()  # Closes unrelated file descriptor!
    >>> f.close()  # Error even in 3.5
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    OSError: [Errno 9] Bad file descriptor

    I am not familiar with your use case or library, but I would suggest that either:

    1. Your code should call sock0.detach() rather than fileno(), so that sock0 no longer “owns” the file descriptor, or

    2. libuv should not close file descriptors that it doesn’t own.

    Calling socket.close() should be safe if the Python socket object is registered as closed, but IMO calling close(), or any other method, when the OS socket (file descriptor) has been released behind Python’s back is a programming error.

    @vadmium
    Copy link
    Member Author

    vadmium commented Oct 18, 2016

    If libuv closes the FD (step 3), won’t you get the same sort of problem if the uvloop user tries to do something else with the Python socket object, e.g. call getpeername()?

    I see the fileno=... parameter for sockets as a parallel to the os.fdopen() function, which does raise exceptions from FileIO.close().

    Maybe one option is to only trigger a DeprecationWarning, and raise a proper OSError in a future version.

    @1st1
    Copy link
    Member

    1st1 commented Oct 18, 2016

    Another example: some asyncio (run with uvloop) code:

        conn, _ = lsock.accept()
        f = loop.create_task(
            loop.connect_accepted_socket(
                proto_factory, conn))
        \# More code
        loop.run_forever()
        conn.close()
    

    Suppose the above snippet of code is some real-world program.

    Now, in Python 3.5 everything works as expected. In Python 3.6, "conn.close()" will raise an exception.

    Why: uvloop passes the FD of "conn" to libuv, which does its thing and closes the connection when it should be closed.

    Now:

    1. Your code should call sock0.detach() rather than fileno(), so that sock0 no longer “owns” the file descriptor, or

    I can't modify "connect_accepted_socket" to call "detach" on "conn", as it would make "conn" unusable right after the call. This option can't be implemented, as it would break the backwards compatibility.

    1. libuv should not close file descriptors that it doesn’t own.

    This is not just about uvloop/libuv. It's about any Python program out there that will break in 3.6. A lot of code extracts the FD out of socket objects and does something with it. We can't just make socket.close() to raise in 3.6 -- that's not how backwards compatibility promise works.

    Guido, what do you think about this issue?

    @1st1
    Copy link
    Member

    1st1 commented Oct 18, 2016

    My proposal: ignore EBADF in socket.close(). It means that the socket is closed already. It doesn't matter why.

    @vstinner
    Copy link
    Member

    My proposal: ignore EBADF in socket.close(). It means that the socket is closed already. It doesn't matter why.

    As Martin explained, getting EBAD means that your code smells. Your
    code may close completely unrelated file descriptor... Say hello to
    race conditions :-)

    @1st1
    Copy link
    Member

    1st1 commented Oct 19, 2016

    Maybe we should fix asyncio. Maybe if a user passes an existing socket object into loop.create_connection, it should duplicate the socket.

    @1st1
    Copy link
    Member

    1st1 commented Oct 19, 2016

    After thinking about this problem for a while, I arrived to the conclusion that we need to fix asyncio. Essentially, when a user passes a socket object to the event loop API like 'create_server', they give up the control of the socket to the loop. The loop should detach the socket object and have a full control over it.

    @1st1 1st1 closed this as completed Oct 19, 2016
    @vstinner
    Copy link
    Member

    "After thinking about this problem for a while, I arrived to the conclusion that we need to fix asyncio."

    Ah, you became reasonable :-D

    "Essentially, when a user passes a socket object to the event loop API like 'create_server', they give up the control of the socket to the loop. The loop should detach the socket object and have a full control over it."

    I don't know how asyncio should be modified exactly, but I agree that someone should change in asyncio. The question is more about how to reduce headache because of the backward compatibility and don't kill performances.

    Would you mind to open an issue specific for asyncio?

    @1st1
    Copy link
    Member

    1st1 commented Oct 19, 2016

    Ah, you became reasonable :-D

    Come on... :)

    Would you mind to open an issue specific for asyncio?

    I'll open an issue on GH today to discuss with you/Guido/asyncio devs.

    @1st1
    Copy link
    Member

    1st1 commented Oct 19, 2016

    > Would you mind to open an issue specific for asyncio?

    I'll open an issue on GH today to discuss with you/Guido/asyncio devs.

    Guido, Victor, please take a look: python/asyncio#449

    @vstinner
    Copy link
    Member

    vstinner commented Jul 4, 2017

    New changeset 67e1478 by Victor Stinner in branch 'master':
    bpo-30319: socket.close() now ignores ECONNRESET (bpo-2565)
    67e1478

    @vstinner
    Copy link
    Member

    vstinner commented Jul 4, 2017

    New changeset 580cd5c by Victor Stinner in branch '3.6':
    bpo-30319: socket.close() now ignores ECONNRESET (bpo-2565) (bpo-2566)
    580cd5c

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    extension-modules C modules in the Modules dir type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    3 participants