classification
Title: selectors (and asyncio?): document behaviour on closed files/sockets
Type: behavior Stage: resolved
Components: Documentation Versions: Python 3.4
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: gvanrossum Nosy List: docs@python, gvanrossum, haypo, neologix, python-dev
Priority: normal Keywords: patch

Created on 2013-12-03 14:11 by haypo, last changed 2013-12-09 18:28 by gvanrossum. This issue is now closed.

Files
File name Uploaded Description Edit
unregister.diff gvanrossum, 2013-12-05 23:14 review
unregister2.diff gvanrossum, 2013-12-06 21:40 review
unregister3.diff gvanrossum, 2013-12-06 22:00 review
unregister4.diff gvanrossum, 2013-12-06 22:15 review
unregister5.diff gvanrossum, 2013-12-07 03:37 review
unregister6.diff gvanrossum, 2013-12-07 16:49 review
unregister7.diff gvanrossum, 2013-12-07 19:01 review
unregister8.diff gvanrossum, 2013-12-07 20:22 review
epoll_except.patch haypo, 2013-12-08 11:22 review
nodup2.diff gvanrossum, 2013-12-08 21:21 review
Messages (44)
msg205118 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-12-03 14:11
I remember a discussion about EBADF, but I don't remember the conclusion. The asyncio documentation doesn't explain the behaviour of selectors when a file/socket is closed without being unregistered from the selector.

It should be explicitly documented. I would accept an "undefined behaviour" if it's documented :-)
msg205123 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2013-12-03 15:45
Yeah, the behavior is at least different for each type of polling system call, and possibly also for different platforms.  It would be good to describe at least all the different possible behaviors.
msg205134 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2013-12-03 18:30
Well, unregister() documentation currently contains this:
"""
   .. method:: unregister(fileobj)

      Unregister a file object from selection, removing it from monitoring. A
      file object shall be unregistered prior to being closed.
"""

I'm not sure what else to say (I don't like the idea of documenting
possible behaviors, because it's non-portable, and really might change
in a future version).
msg205135 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2013-12-03 18:35
I think we should give the reader some kind of hint, since a bug in this area may cause a lot of pain when it has to be debugged on porting from a system where the issue is silent to one where it causes a crash.  These docs (unlike a PEP) are not a formal standard but documentation for users.  Maybe we can add a separate section of caveats?
msg205136 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-12-03 18:41
"(I don' like the idea of documenting
possible behaviors, because it's non-portable, and really might change
in a future version)."

The description doesn't need to be precise, you can just say "depending on the platform, closing a file descriptor while selector.select() is polling may be ignored or raise an exception". Which kind of exception is raised? Is it possible to catch it and find the closed file descriptor?
msg205142 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2013-12-03 19:51
I think you're looking for the discussion in issue 19017.

IIRC the conclusion is that not only do you not get the same error everywhere, but you get it at different points -- sometimes register() of a bad FD passes and then [Selector.]select() fails, other times register() of a bad FD fails; when the FD is initially good and then gets closed, sometimes select() may fail, sometimes select() will silently ignore the FD. Sometimes unregister() of a closed FD will return False, sometimes True.

Another consequence is that registering an FD, then closing it, then calling select(), then reopening it may keep reporting events for the FD or not.

I think these are all things to call out in a section on caveats or common bugs.
msg205160 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2013-12-03 21:48
> Guido van Rossum added the comment:
>
> I think you're looking for the discussion in issue 19017.
>
> IIRC the conclusion is that not only do you not get the same error everywhere, but you get it at different points -- sometimes register() of a bad FD passes and then [Selector.]select() fails, other times register() of a bad FD fails; when the FD is initially good and then gets closed, sometimes select() may fail, sometimes select() will silently ignore the FD. Sometimes unregister() of a closed FD will return False, sometimes True.
>
> Another consequence is that registering an FD, then closing it, then calling select(), then reopening it may keep reporting events for the FD or not.

Exactly, it's a mess.

> I think these are all things to call out in a section on caveats or common bugs.

What I don't remember was the conclusion: do we want to keep the
current OS-specific behavior, or do we want to try to be tolerant with
misuse?
For example, one possibility would be to ignore errors when
unregistering a file descriptor from epoll: the selectmodule
currently ignores EBADF when unregistering a FD:
"""
        case EPOLL_CTL_DEL:
        /* In kernel versions before 2.6.9, the EPOLL_CTL_DEL
         * operation required a non-NULL pointer in event, even
         * though this argument is ignored. */
        Py_BEGIN_ALLOW_THREADS
        result = epoll_ctl(epfd, op, fd, &ev);
        if (errno == EBADF) {
            /* fd already closed */
            result = 0;
            errno = 0;
        }
"""

IIRC libev and libevent both ignore those errors.

We have to settle on a solution before documenting it.
msg205162 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-12-03 22:24
>>> import selectors, os
>>> r,w=os.pipe()
>>> s=selectors.SelectSelector()
>>> s.register(r, selectors.EVENT_READ)
SelectorKey(fileobj=3, fd=3, events=1, data=None)
>>> os.close(r)
>>> os.close(w)
>>> s.unregister(r)
SelectorKey(fileobj=3, fd=3, events=1, data=None)

SelectSelector.unregister(<closed fd>) doesn't raise any error, so it makes sense to ignore EBADF in EpollSelector.unregister(). What's the point of raising an error here?

If you want a portable behaviour, the file descriptor should be tested (e.g. by calling os.fstat() or os.dup()). For the "normal" use case, I would not expect a syscall on unregister(), because unregister() may be called just before closing the file descriptor.

Maybe you can explain that in the "caveats" section? Suggestion: "unregister(fd) does not check if fd is closed or not (to get a portable behaviour), os.fstat() or os.dup() can be used to check if the fd is closed or not". Or "unregister(fd) ignores errors if the file descriptor is closed".
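
For illustration only, the os.fstat() check suggested above could look like the sketch below. The helper name fd_is_open is hypothetical and not part of the selectors module. Note that it only tells whether the FD number is currently open, not whether it still refers to the same file: if the number has been reused, it reports True.

```python
import errno
import os


def fd_is_open(fd):
    """Return True if the FD number is currently open (hypothetical helper)."""
    try:
        os.fstat(fd)
    except OSError as exc:
        if exc.errno == errno.EBADF:
            return False  # the kernel does not know this descriptor
        raise             # any other error is unexpected here
    return True


r, w = os.pipe()
open_while_alive = fd_is_open(r)   # the pipe is still open at this point
os.close(r)
os.close(w)
```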
msg205165 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2013-12-03 22:45
Heh, I'd forgotten the behavior of unregister().  It seems that there are two layers to the behavior -- if this FD was never register()ed it will raise; if it was register()ed but has since been close()d it may raise.

For the higher-level APIs in asyncio I chose not to raise from the remove_{reader,writer}() methods -- they return True if something was removed, False if not.  This currently has to be implemented by explicitly asking the selector for the key first.  I.e.:

    def remove_reader(self, fd):
        """Remove a reader callback."""
        try:
            key = self._selector.get_key(fd)
        except KeyError:
            return False
        else:
            mask, (reader, writer) = key.events, key.data
            mask &= ~selectors.EVENT_READ
            if not mask:
                self._selector.unregister(fd)
            else:
                self._selector.modify(fd, mask, (None, writer))

            if reader is not None:
                reader.cancel()
                return True
            else:
                return False
msg205166 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2013-12-03 22:46
(What I meant to add was, I'd be happy if unregister() also just used a true/false return.)
msg205230 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2013-12-04 18:23
I just ran into a live case of the platform differences here.  Check out http://bugs.python.org/review/19509/ (issue 19509).  Christian uploaded a patch for asyncio, and when I tested it I got a double traceback and a hang.  This could have been avoided if the unregister() call for the closed FD had been silent instead of raising.

I think that the proper fix might have been not to close the socket, but nevertheless this failure confused everyone -- the author of the patch thought it had to do with the SSL version, I was initially confused by the first half of the traceback (which turned out to be expected, this was in an assertRaises() call), and I spent half an hour in pdb to track down the real cause.
msg205233 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2013-12-04 19:09
The more I think about this the more I believe unregister() should catch the OSError (but not the KeyError).

Every unregister() implementation starts by calling super().unregister(key), which has a side effect (it removes the key from the _fd_to_key dict).

I believe that once this side effect has happened the unregister() call should return with success even if the kqueue syscall fails with OSError.

A further refinement could be to skip the kqueue syscall *if* the registered object is in fact an object with a fileno() method and not a bare FD, and the object is closed -- we should be able to tell that by calling its fileno() method, which should return -1 or None if it is closed.  (But this would be mostly an optimization -- and a safety guard in case the FD has been reused for a different object.)
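
A rough sketch of that "is the object closed?" check, under the assumption that a closed socket reports fileno() == -1 and that some other file objects raise ValueError from fileno() once closed. The helper name fileobj_is_closed is made up for this example; it is not what the patch does.

```python
import socket


def fileobj_is_closed(fileobj):
    """Best-effort check (hypothetical helper): does fileobj look closed?"""
    if isinstance(fileobj, int):
        # A bare FD carries no "closed" state of its own.
        return False
    try:
        fd = fileobj.fileno()
    except ValueError:
        # Some file objects raise ValueError once they are closed.
        return True
    return fd is None or fd < 0


a, b = socket.socketpair()
closed_before = fileobj_is_closed(a)   # socket still open
a.close()
closed_after = fileobj_is_closed(a)    # fileno() is now -1
b.close()
```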

I don't know how poll and epoll behave under these circumstances, but given that only the Kqueue-based asyncio test failed I think those don't raise when the FD has been closed.

If you are not amenable to this fix I will have to catch the OSError in Tulip's remove_reader(), e.g. like this:

            try:
                if not mask:
                    self._selector.unregister(fd)
                else:
                    self._selector.modify(fd, mask, (None, writer))
            except OSError:
                # unregister()/modify() may or may not raise this if
                # the FD is closed -- it depends on what type of
                # selector is used.
                pass

(and repeated in remove_writer()).
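
The same idea as a standalone sketch, outside of Tulip. The function name quiet_unregister is hypothetical; it mirrors the true/false convention of remove_reader(): KeyError (never registered) is reported as False, while OSError (FD closed, depending on the selector type) is ignored.

```python
import os
import selectors


def quiet_unregister(selector, fileobj):
    """Unregister fileobj; return True if it was registered, else False."""
    try:
        selector.unregister(fileobj)
    except KeyError:
        # fileobj was never registered (or was already unregistered).
        return False
    except OSError:
        # The FD may already be closed; whether this is raised depends
        # on the selector implementation, so it is deliberately ignored.
        pass
    return True


sel = selectors.DefaultSelector()
r, w = os.pipe()
sel.register(r, selectors.EVENT_READ)
os.close(r)
os.close(w)
first = quiet_unregister(sel, r)    # registered, then closed
second = quiet_unregister(sel, r)   # no longer registered
sel.close()
```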
msg205333 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2013-12-05 23:14
Here's a tentative change to selectors.py that ignores the OSError in various unregister() methods (but not in register()).
msg205382 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2013-12-06 16:01
Ping? Charles-François, what do you think of my patch to ignore OSError in unregister()?
msg205385 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2013-12-06 16:46
Sorry for the delay, I didn't have any free time this week.
I'll review the patch shortly, but the idea sounds fine (I just need
to check if we can't be a little more specific for errnos upon
unregister).
msg205406 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2013-12-06 21:40
Here's a new patch.  Note that it includes a commented-out test that demonstrates the failure if the socket object itself is closed (rather than just its FD).
msg205408 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2013-12-06 22:00
Here's a variant that documents the ValueError issue and omits the commented-out test.
msg205409 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2013-12-06 22:15
Here's an attempt at fixing the ValueError.

I don't like the exhaustive search much, but the alternative is to maintain an inverse dict.  What do you think?
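
A simplified sketch of the exhaustive search being discussed, using a plain dict and a namedtuple as stand-ins for the selector's internal FD-to-key mapping. The names Key, fileobj_to_fd and fileobj_lookup are illustrative and are not taken from the patch itself.

```python
import socket
from collections import namedtuple

# Simplified stand-in for SelectorKey: only the fields needed here.
Key = namedtuple("Key", ["fileobj", "fd"])


def fileobj_to_fd(fileobj):
    """Return the FD for an int or an object with fileno()."""
    fd = fileobj if isinstance(fileobj, int) else int(fileobj.fileno())
    if fd < 0:
        raise ValueError("Invalid file descriptor: {}".format(fd))
    return fd


def fileobj_lookup(fd_to_key, fileobj):
    """Like fileobj_to_fd(), but fall back to searching registered keys."""
    try:
        return fileobj_to_fd(fileobj)
    except ValueError:
        # The object is closed: look for it by identity among the keys.
        for key in fd_to_key.values():
            if key.fileobj is fileobj:
                return key.fd
        raise  # not registered either, so re-raise the ValueError


a, b = socket.socketpair()
original_fd = a.fileno()
registry = {original_fd: Key(a, original_fd)}
a.close()                                # a.fileno() is now -1
found_fd = fileobj_lookup(registry, a)   # recovered from the registry
b.close()
```

The search is linear in the number of registered objects, but it only runs on the error path, after the fast lookup has already failed.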
msg205410 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2013-12-06 22:59
> Guido van Rossum added the comment:
>
> Here's an attempt at fixing the ValueError.
>
> I don't like the exhaustive search much, but the alternative is to maintain an inverse dict.  What do you think?

I was going to suggest such an exhaustive search.
I think it's the cleanest/simplest solution, and the performance
overhead is IMO completely unimportant since it's not supposed to
happen often, and if the key isn't found we're going to raise an
exception anyway.

So if we want to handle this case (and I think we should to be
consistent), that's the best way to go.

But I think that OSError should still be caught in
EpollSelector.unregister(): since if the FD is closed before,
epoll.unregister() will raise ENOENT/EBADF since the FD will have
automatically been removed (exactly as for kqueue according to the man
page).
msg205411 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2013-12-06 23:18
OK, I'll make a new patch (maybe Monday).  I want to be a little more careful with the exhaustive search, I think it should not be attempted if we see KeyError or AttributeError (those should not be dynamic).

I tested for the epoll error on Ubuntu and didn't get OSError, but I'm happy to keep it in if the man page says so.  (How sure are you about poll() not doing this?)
msg205432 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2013-12-07 01:51
So I think this is why epoll doesn't raise OSError:

http://hg.python.org/cpython/file/44dacafdd48a/Modules/selectmodule.c#l1335

The Python wrapper explicitly checks for EBADF and turns this into a non-error result.
msg205433 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2013-12-07 01:58
Well, I take it back. If you close the FD and then reuse it, you get ENOENT, which is not caught.  So we still need the try/except OSError.

I am going to experiment with a PollSelector as well.
msg205434 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2013-12-07 02:00
PollSelector doesn't seem to have this behavior.
msg205438 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2013-12-07 03:37
New patch. Please review.

The error handling is a bit complicated but I like to be careful in which errors I catch.
msg205468 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2013-12-07 16:49
Sorry, here's another version. It keeps the original _fileobj_to_fd function and wraps it with a method that does the exhaustive search.
msg205478 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2013-12-07 19:01
I think I got the closing sorted out now, and through reordering the dup2() calls are actually needed.
msg205483 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2013-12-07 20:22
OK, here's another try. I ran what you suggested for all three tests I added and they are all clean.  I realized that every single call to socketpair() is followed by two addCleanup calls, so I added a make_socketpair() helper method that does this.
msg205496 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2013-12-07 22:53
LGTM!
msg205499 - (view) Author: Roundup Robot (python-dev) Date: 2013-12-07 23:57
New changeset 39e7995f9ad1 by Guido van Rossum in branch 'default':
Silently ignore unregistering closed files. Fixes issue 19876. With docs and slight test refactor.
http://hg.python.org/cpython/rev/39e7995f9ad1
msg205500 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2013-12-07 23:59
Is this worthy of a Misc/NEWS entry?
msg205501 - (view) Author: Roundup Robot (python-dev) Date: 2013-12-08 00:03
New changeset f334dd2471e7 by Guido van Rossum in branch 'default':
News item for issue 19876.
http://hg.python.org/cpython/rev/f334dd2471e7
msg205502 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2013-12-08 00:04
Done.
msg205529 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2013-12-08 08:53
The test is failing on Windows buildbot:
http://buildbot.python.org/all/builders/x86%20Windows%20Server%202003%20%5BSB%5D%203.x/builds/1851/steps/test/logs/stdio
"""
======================================================================
ERROR: test_unregister_after_fd_close_and_reuse (test.test_selectors.DefaultSelectorTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "E:\Data\buildslave\cpython\3.x.snakebite-win2k3r2sp2-x86\build\lib\test\test_selectors.py", line 122, in test_unregister_after_fd_close_and_reuse
    os.dup2(rd2.fileno(), r)
OSError: [Errno 9] Bad file descriptor

======================================================================
ERROR: test_unregister_after_fd_close_and_reuse (test.test_selectors.SelectSelectorTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "E:\Data\buildslave\cpython\3.x.snakebite-win2k3r2sp2-x86\build\lib\test\test_selectors.py", line 122, in test_unregister_after_fd_close_and_reuse
    os.dup2(rd2.fileno(), r)
OSError: [Errno 9] Bad file descriptor
"""

Apparently, dup2() doesn't work on Windows because on Windows, sockets aren't file descriptors, but a different beast...
msg205546 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-12-08 11:22
I don't like generic "except OSError: pass". Here is a first patch for epoll() to use "except FileNotFoundError: pass" instead. Kqueue selector should also be patched.

I tested closing the epoll FD (os.close(epoll.fileno())): on Linux 3.11, epoll.unregister(fd) and epoll.close() don't raise an error. Strange. (The C code looks correct).

(About the commit: I don't like the "_fileobj_lookup" method name, we lose the information (compared to the "_fileobj_to_fd" name) that the method returns a file descriptor. I would prefer "_get_fd" or "_get_fileobj_fd".)
msg205551 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2013-12-08 12:11
> STINNER Victor added the comment:
>
> I don't like generic "except OSError: pass". Here is a first patch for epoll() to use "except FileNotFoundError: pass" instead. Kqueue selector should also be patched.

Except that it can fail with ENOENT, but also EBADF, and EPERM if the
FD has been reused by a FD which doesn't support epoll.
So if we want to go this way, we should at least catch ENOENT, EBADF
and EPERM. Same for kqueue: we should at least catch ENOENT and EBADF.

> I tested to close epoll FD (os.close(epoll.fileno())): on Linux 3.11, epoll.unregister(fd) and epoll.close() don't raise an error. Strange. (The C code looks correct).

unregister() ignores EBADF.

> (About the commit: I don't like the "_fileobj_lookup" method name, we lose the information (compared to the "_fileobj_to_fd" name) that the method returns a file descriptor. I would prefer "_get_fd" or "_get_fileobj_fd".)

Well, Guido likes it, I like it, and this is really nit-picking
(especially since it's a private method).
msg205606 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2013-12-08 21:06
I don't think we should be more selective about the errno values, the try block is narrow enough (just one syscall) and we really don't know what the kernel will do on different platforms.  And what would we do about it anyway?

I will look into the Windows problem, but I suspect the best we can do there is skip the test.
msg205607 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2013-12-08 21:20
> I will look into the Windows problem, but I suspect the best we can do there is skip the test.

I already took care of that:
http://hg.python.org/cpython/rev/01676a4c16ff
msg205608 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2013-12-08 21:21
Then here's a hopeful fix for the Windows situation that relies on the socketpair() operation reusing FDs from the lowest value. I'm adding asserts to check that this is actually the case. (These are actual assert statements to indicate that they are verifying an assumption internal to the test, not verifying the functionality under test.)

I'll test it when I next get near a Windows box (Monday in the office) -- or someone else with Windows access can let me know.
msg205631 - (view) Author: Roundup Robot (python-dev) Date: 2013-12-09 00:57
New changeset c4c1c4bc8086 by Victor Stinner in branch 'default':
Issue #19876: Run also test_selectors.test_unregister_after_fd_close_and_reuse() on Windows
http://hg.python.org/cpython/rev/c4c1c4bc8086
msg205632 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-12-09 00:59
Oops, I reverted my changeset c4c1c4bc8086, I didn't read why the test was skipped on Windows. Sorry.
msg205633 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-12-09 01:01
"Except that it can fail with ENOENT, but also EBADF, and EPERM if the
FD has been reused by a FD which doesn't support epoll."

Oh, I didn't know that. I ran the unit tests, and I expected the two unit tests to cover any error case. So ignore my epoll_except.patch, the current "except OSError: pass" is correct.
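
To make the point concrete: the three errno values mentioned above map to different exception classes, so "except FileNotFoundError" would only cover ENOENT. This is a small illustration, not part of any patch; the variable name classes is arbitrary.

```python
import errno
import os

# Build an OSError for each errno and record which class Python picks.
classes = {}
for code in (errno.ENOENT, errno.EBADF, errno.EPERM):
    exc = OSError(code, os.strerror(code))
    classes[errno.errorcode[code]] = type(exc).__name__

for name, cls in sorted(classes.items()):
    print(name, "->", cls)
```

EBADF has no dedicated subclass, so catching it by class means catching OSError anyway.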
msg205724 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2013-12-09 18:22
Sadly, the optimistic code doesn't work on Windows.  I think it may be because the socketpair() helper at the top of test_selectors.py uses an extra socket ('l') and the handles just don't match up (I get a failure on assert wr2.fileno() == w).  So I propose to stick with the current solution of skipping the test on Windows.

I'll close this bug in 24 hours unless I get a response sooner.
msg205725 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-12-09 18:24
The current test using os.dup2() with a skip on Windows is fine.
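
For readers without the changeset at hand, here is a rough, self-contained approximation of such a test: a make_socketpair() helper that registers cleanups, and a POSIX-only test that closes registered FDs, reuses their numbers via os.dup2(), and then unregisters them. It is a sketch based on the discussion, uses socket.socketpair() and selectors.DefaultSelector as assumptions, and may differ from the committed test.

```python
import os
import selectors
import socket
import unittest


class UnregisterAfterReuseTest(unittest.TestCase):

    def make_socketpair(self):
        # Every socketpair needs both ends closed at the end of the test.
        rd, wr = socket.socketpair()
        self.addCleanup(rd.close)
        self.addCleanup(wr.close)
        return rd, wr

    @unittest.skipUnless(os.name == 'posix', "requires posix")
    def test_unregister_after_fd_close_and_reuse(self):
        s = selectors.DefaultSelector()
        self.addCleanup(s.close)
        rd, wr = self.make_socketpair()
        r, w = rd.fileno(), wr.fileno()
        s.register(r, selectors.EVENT_READ)
        s.register(w, selectors.EVENT_WRITE)
        rd2, wr2 = self.make_socketpair()
        rd.close()
        wr.close()
        # Reuse the old FD numbers for different sockets.
        os.dup2(rd2.fileno(), r)
        os.dup2(wr2.fileno(), w)
        self.addCleanup(os.close, r)
        self.addCleanup(os.close, w)
        # Must not raise, even though the kernel no longer tracks them.
        s.unregister(r)
        s.unregister(w)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(UnregisterAfterReuseTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```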
msg205726 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2013-12-09 18:28
OK, closed.
History
Date User Action Args
2013-12-09 18:28:31 gvanrossum set status: open -> closed; resolution: fixed; messages: + msg205726
2013-12-09 18:24:58 haypo set messages: + msg205725
2013-12-09 18:22:46 gvanrossum set messages: + msg205724
2013-12-09 01:01:06 haypo set messages: + msg205633
2013-12-09 00:59:59 haypo set messages: + msg205632
2013-12-09 00:57:33 python-dev set messages: + msg205631
2013-12-08 21:21:43 gvanrossum set files: + nodup2.diff; messages: + msg205608
2013-12-08 21:20:00 neologix set messages: + msg205607
2013-12-08 21:06:48 gvanrossum set messages: + msg205606
2013-12-08 12:12:00 neologix set messages: + msg205551
2013-12-08 11:22:14 haypo set files: + epoll_except.patch; resolution: fixed -> (no value); messages: + msg205546
2013-12-08 08:53:32 neologix set status: closed -> open; messages: + msg205529
2013-12-08 00:04:55 gvanrossum set status: open -> closed; type: behavior; messages: + msg205502; assignee: neologix -> gvanrossum; resolution: fixed; stage: resolved
2013-12-08 00:03:46 python-dev set messages: + msg205501
2013-12-07 23:59:32 gvanrossum set messages: + msg205500
2013-12-07 23:57:14 python-dev set nosy: + python-dev; messages: + msg205499
2013-12-07 22:53:07 neologix set messages: + msg205496
2013-12-07 20:22:29 gvanrossum set files: + unregister8.diff; messages: + msg205483
2013-12-07 19:01:23 gvanrossum set files: + unregister7.diff; messages: + msg205478
2013-12-07 16:49:39 gvanrossum set files: + unregister6.diff; messages: + msg205468
2013-12-07 03:37:32 gvanrossum set files: + unregister5.diff; messages: + msg205438
2013-12-07 02:00:32 gvanrossum set messages: + msg205434
2013-12-07 01:58:56 gvanrossum set messages: + msg205433
2013-12-07 01:51:30 gvanrossum set messages: + msg205432
2013-12-06 23:18:50 gvanrossum set messages: + msg205411
2013-12-06 22:59:10 neologix set messages: + msg205410
2013-12-06 22:15:16 gvanrossum set files: + unregister4.diff; messages: + msg205409
2013-12-06 22:00:53 gvanrossum set files: + unregister3.diff; messages: + msg205408
2013-12-06 21:40:32 gvanrossum set files: + unregister2.diff; messages: + msg205406
2013-12-06 16:46:46 neologix set messages: + msg205385
2013-12-06 16:01:24 gvanrossum set assignee: docs@python -> neologix; messages: + msg205382
2013-12-05 23:14:54 gvanrossum set files: + unregister.diff; keywords: + patch; messages: + msg205333
2013-12-04 19:09:06 gvanrossum set messages: + msg205233
2013-12-04 18:23:26 gvanrossum set messages: + msg205230
2013-12-03 22:46:04 gvanrossum set messages: + msg205166
2013-12-03 22:45:32 gvanrossum set messages: + msg205165
2013-12-03 22:24:36 haypo set messages: + msg205162
2013-12-03 21:48:50 neologix set messages: + msg205160
2013-12-03 19:51:45 gvanrossum set messages: + msg205142
2013-12-03 18:41:34 haypo set messages: + msg205136
2013-12-03 18:35:20 gvanrossum set messages: + msg205135
2013-12-03 18:30:04 neologix set messages: + msg205134
2013-12-03 15:45:37 gvanrossum set messages: + msg205123
2013-12-03 14:11:19 haypo create