msg205118 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2013-12-03 14:11 |
I remember a discussion about EBADF, but I don't remember the conclusion. The asyncio documentation doesn't explain how selectors behave when a file/socket is closed without first being unregistered from the selector.
It should be explicitly documented. I would accept an "undefined behaviour" if it's documented :-)
|
msg205123 - (view) |
Author: Guido van Rossum (gvanrossum) * |
Date: 2013-12-03 15:45 |
Yeah, the behavior is at least different for each type of polling system call, and possibly also for different platforms. It would be good to describe at least all the different possible behaviors.
|
msg205134 - (view) |
Author: Charles-François Natali (neologix) * |
Date: 2013-12-03 18:30 |
Well, unregister() documentation currently contains this:
"""
.. method:: unregister(fileobj)
Unregister a file object from selection, removing it from monitoring. A
file object shall be unregistered prior to being closed.
"""
I'm not sure what else to say (I don't like the idea of documenting
possible behaviors, because it's non-portable, and really might change
in a future version).
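For illustration, the "unregister before close" rule quoted above can be enforced with a small context manager. This is only a sketch: the helper names registered() and read_once() are hypothetical and not part of the selectors module.

```python
import contextlib
import selectors
import socket


@contextlib.contextmanager
def registered(selector, fileobj, events, data=None):
    """Keep fileobj registered for the duration of the with-block only."""
    key = selector.register(fileobj, events, data)
    try:
        yield key
    finally:
        # Unregister while the descriptor is still open.
        selector.unregister(fileobj)


def read_once():
    """Send four bytes over a socket pair and read them back via a selector."""
    rd, wr = socket.socketpair()
    sel = selectors.DefaultSelector()
    try:
        with registered(sel, rd, selectors.EVENT_READ):
            wr.send(b"ping")
            ready = sel.select(timeout=1)
            data = rd.recv(4) if ready else b""
        # The with-block has ended, so nothing is registered any more.
        return data, len(sel.get_map())
    finally:
        # Closing happens only after the socket has been unregistered.
        rd.close()
        wr.close()
        sel.close()
```

With this ordering the selector never holds a descriptor that has already been closed, so none of the platform-specific behaviours discussed in this issue come into play.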
|
msg205135 - (view) |
Author: Guido van Rossum (gvanrossum) * |
Date: 2013-12-03 18:35 |
I think we should give the reader some kind of hint, since a bug in this area may cause a lot of pain when it has to be debugged on porting from a system where the issue is silent to one where it causes a crash. These docs (unlike a PEP) are not a formal standard but documentation for users. Maybe we can add a separate section of caveats?
|
msg205136 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2013-12-03 18:41 |
"(I don' like the idea of documenting
possible behaviors, because it's non-portable, and really might change
in a future version)."
The description doesn't need to be precise, you can just say "depending on the platform, closing a file descriptor while selector.select() is polling may be ignored or raise an exception". Which kind of exception is raised? Is it possible to catch it and find the closed file descriptor?
|
msg205142 - (view) |
Author: Guido van Rossum (gvanrossum) * |
Date: 2013-12-03 19:51 |
I think you're looking for the discussion in issue 19017.
IIRC the conclusion is that not only do you not get the same error everywhere, but you get it at different points -- sometimes register() of a bad FD passes and then [Selector.]select() fails, other times register() of a bad FD fails; when the FD is initially good and then gets closed, sometimes select() may fail, sometimes select() will silently ignore the FD. Sometimes unregister() of a closed FD will return False, sometimes True.
Another consequence is that registering an FD, then closing it, then calling select(), then reopening it may keep reporting events for the FD or not.
I think these are all things to call out in a section on caveats or common bugs.
|
msg205160 - (view) |
Author: Charles-François Natali (neologix) * |
Date: 2013-12-03 21:48 |
> Guido van Rossum added the comment:
>
> I think you're looking for the discussion in issue 19017.
>
> IIRC the conclusion is that not only do you not get the same error everywhere, but you get it at different points -- sometimes register() of a bad FD passes and then [Selector.]select() fails, other times register() of a bad FD fails; when the FD is initially good and then gets closed, sometimes select() may fail, sometimes select() will silently ignore the FD. Sometimes unregister() of a closed FD will return False, sometimes True.
>
> Another consequence is that registering an FD, then closing it, then calling select(), then reopening it may keep reporting events for the FD or not.
Exactly, it's a mess.
> I think these are all things to call out in a section on caveats or common bugs.
What I don't remember was the conclusion: do we want to keep the
current OS-specific behavior, or do we want to try to be tolerant with
misuse?
For example, one possibility would be to ignore errors when
unregistering a file descriptor from epoll: the selectmodule
currently ignores EBADF when unregistering an FD:
"""
    case EPOLL_CTL_DEL:
        /* In kernel versions before 2.6.9, the EPOLL_CTL_DEL
         * operation required a non-NULL pointer in event, even
         * though this argument is ignored. */
        Py_BEGIN_ALLOW_THREADS
        result = epoll_ctl(epfd, op, fd, &ev);
        if (errno == EBADF) {
            /* fd already closed */
            result = 0;
            errno = 0;
        }
"""
IIRC libev and libevent both ignore those errors.
We have to settle on a solution before documenting it.
|
msg205162 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2013-12-03 22:24 |
>>> import selectors, os
>>> r,w=os.pipe()
>>> s=selectors.SelectSelector()
>>> s.register(r, selectors.EVENT_READ)
SelectorKey(fileobj=3, fd=3, events=1, data=None)
>>> os.close(r)
>>> os.close(w)
>>> s.unregister(r)
SelectorKey(fileobj=3, fd=3, events=1, data=None)
SelectSelector.unregister(<closed fd>) doesn't raise any error, so it makes sense to ignore EBADF in EpollSelector.unregister(). What's the point of raising an error here?
If you want portable behaviour, the file descriptor should be tested explicitly (e.g. by calling os.fstat() or os.dup()). For the "normal" use case, I would not expect a syscall on unregister(), because unregister() may be called just before closing the file descriptor.
Maybe you can explain that in the "caveats" section? Suggestion: "unregister(fd) does not check whether fd is closed (to get portable behaviour); os.fstat() or os.dup() can be used to check whether the fd is closed". Or "unregister(fd) ignores errors if the file descriptor is closed".
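A minimal sketch of the os.fstat() check suggested above. The helper name fd_is_open() is hypothetical. Note that the check only tells whether the number currently refers to some open file: if the descriptor was closed and the number has since been reused, it still reports True.

```python
import os


def fd_is_open(fd):
    """Return True if fd currently refers to an open file descriptor."""
    try:
        os.fstat(fd)
    except OSError:
        # Typically EBADF: the descriptor is not open.
        return False
    return True


def demo():
    """Check a pipe's read end before and after closing it."""
    r, w = os.pipe()
    try:
        before = fd_is_open(r)
        os.close(r)
        after = fd_is_open(r)
        return before, after
    finally:
        os.close(w)
```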
|
msg205165 - (view) |
Author: Guido van Rossum (gvanrossum) * |
Date: 2013-12-03 22:45 |
Heh, I'd forgotten the behavior of unregister(). It seems that there are two layers to the behavior -- if this FD was never register()ed it will raise; if it was register()ed but has since been close()d it may raise.
For the higher-level APIs in asyncio I chose not to raise from the remove_{reader,writer}() methods -- they return True if something was removed, False if not. This currently has to be implemented by explicitly asking the selector for the key first. I.e.:
    def remove_reader(self, fd):
        """Remove a reader callback."""
        try:
            key = self._selector.get_key(fd)
        except KeyError:
            return False
        else:
            mask, (reader, writer) = key.events, key.data
            mask &= ~selectors.EVENT_READ
            if not mask:
                self._selector.unregister(fd)
            else:
                self._selector.modify(fd, mask, (None, writer))
            if reader is not None:
                reader.cancel()
                return True
            else:
                return False
|
msg205166 - (view) |
Author: Guido van Rossum (gvanrossum) * |
Date: 2013-12-03 22:46 |
(What I meant to add was, I'd be happy if unregister() also just used a true/false return.)
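The true/false convention mentioned above can be layered on top of the existing API, since unregister() raises KeyError for a file object that is not registered. The wrapper below is a sketch with a hypothetical name (discard); it is not something the selectors module provides.

```python
import selectors
import socket


def discard(selector, fileobj):
    """Unregister fileobj; return True if it was registered, else False."""
    try:
        selector.unregister(fileobj)
    except KeyError:
        # Not registered (or already unregistered).
        return False
    return True


def demo():
    """Unregister the same socket twice and report both results."""
    rd, wr = socket.socketpair()
    sel = selectors.DefaultSelector()
    try:
        sel.register(rd, selectors.EVENT_READ)
        first = discard(sel, rd)
        second = discard(sel, rd)
        return first, second
    finally:
        sel.close()
        rd.close()
        wr.close()
```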
|
msg205230 - (view) |
Author: Guido van Rossum (gvanrossum) * |
Date: 2013-12-04 18:23 |
I just ran into a live case of the platform differences here. Check out http://bugs.python.org/review/19509/ (issue 19509). Christian uploaded a patch for asyncio, and when I tested it I got a double traceback and a hang. This could have been avoided if the unregister() call for the closed FD had been silent instead of raising.
I think that the proper fix might have been not to close the socket, but nevertheless this failure confused everyone -- the author of the patch thought it had to do with the SSL version, I was initially confused by the first half of the traceback (which turned out to be expected, this was in an assertRaises() call), and I spent half an hour in pdb to track down the real cause.
|
msg205233 - (view) |
Author: Guido van Rossum (gvanrossum) * |
Date: 2013-12-04 19:09 |
The more I think about this the more I believe unregister() should catch the OSError (but not the KeyError).
Every unregister() implementation starts by calling super().unregister(key), which has a side effect (it removes the key from the _fd_to_key dict).
I believe that once this side effect has happened the unregister() call should return with success even if the kqueue syscall fails with OSError.
A further refinement could be to skip the kqueue syscall *if* the registered object is in fact an object with a fileno() method and not a bare FD, and the object is closed -- we should be able to tell that by calling its fileno() method, which should return -1 or None if it is closed. (But this would be mostly an optimization -- and a safety guard in case the FD has been reused for a different object.)
I don't know how poll and epoll behave under these circumstances, but given that only the Kqueue-based asyncio test failed I think those don't raise when the FD has been closed.
If you are not amenable to this fix I will have to catch the OSError in Tulip's remove_reader(), e.g. like this:
    try:
        if not mask:
            self._selector.unregister(fd)
        else:
            self._selector.modify(fd, mask, (None, writer))
    except OSError:
        # unregister()/modify() may or may not raise this if
        # the FD is closed -- it depends on what type of
        # selector is used.
        pass
(and repeated in remove_writer()).
|
msg205333 - (view) |
Author: Guido van Rossum (gvanrossum) * |
Date: 2013-12-05 23:14 |
Here's a tentative change to selectors.py that ignores the OSError in various unregister() methods (but not in register()).
|
msg205382 - (view) |
Author: Guido van Rossum (gvanrossum) * |
Date: 2013-12-06 16:01 |
Ping? Charles-François, what do you think of my patch to ignore OSError in unregister()?
|
msg205385 - (view) |
Author: Charles-François Natali (neologix) * |
Date: 2013-12-06 16:46 |
Sorry for the delay, I didn't have any free time this week.
I'll review the patch shortly, but the idea sounds fine (I just need
to check if we can't be a little more specific for errnos upon
unregister).
|
msg205406 - (view) |
Author: Guido van Rossum (gvanrossum) * |
Date: 2013-12-06 21:40 |
Here's a new patch. Note that it includes a commented-out test that demonstrates the failure if the socket object itself is closed (rather than just its FD).
|
msg205408 - (view) |
Author: Guido van Rossum (gvanrossum) * |
Date: 2013-12-06 22:00 |
Here's a variant that documents the ValueError issue and omits the commented-out test.
|
msg205409 - (view) |
Author: Guido van Rossum (gvanrossum) * |
Date: 2013-12-06 22:15 |
Here's an attempt at fixing the ValueError.
I don't like the exhaustive search much, but the alternative is to maintain an inverse dict. What do you think?
|
msg205410 - (view) |
Author: Charles-François Natali (neologix) * |
Date: 2013-12-06 22:59 |
> Guido van Rossum added the comment:
>
> Here's an attempt at fixing the ValueError.
>
> I don't like the exhaustive search much, but the alternative is to maintain an inverse dict. What do you think?
I was going to suggest such an exhaustive search.
I think it's the cleanest/simplest solution, and the performance
overhead is IMO completely unimportant since it's not supposed to
happen often, and if the key isn't found we're going to raise an
exception anyway.
So if we want to handle this case (and I think we should to be
consistent), that's the best way to go.
But I think that OSError should still be caught in
EpollSelector.unregister(): since if the FD is closed before,
epoll.unregister() will raise ENOENT/EBADF since the FD will have
automatically been removed (exactly as for kqueue according to the man
page).
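The exhaustive search discussed above can be sketched independently of the selectors internals. Everything here is an assumption made for illustration: the function names are hypothetical, and a plain dict mapping fd to file object stands in for the selector's own bookkeeping. The idea is that a closed socket reports fileno() == -1, so the only way to recover its old descriptor is to scan the registered objects by identity.

```python
import socket


def fileobj_to_fd(fileobj):
    """Return the descriptor for an int or an object with fileno()."""
    fd = fileobj if isinstance(fileobj, int) else int(fileobj.fileno())
    if fd < 0:
        raise ValueError("Invalid file descriptor: {}".format(fd))
    return fd


def lookup_fd(fd_to_fileobj, fileobj):
    """Like fileobj_to_fd(), but fall back to a scan for closed objects."""
    try:
        return fileobj_to_fd(fileobj)
    except ValueError:
        # Slow path: the object no longer knows its descriptor, so look
        # for the very same object among the registered ones.
        for fd, candidate in fd_to_fileobj.items():
            if candidate is fileobj:
                return fd
        # Not registered either: re-raise the original ValueError.
        raise


def demo():
    """Recover the descriptor of a socket after it has been closed."""
    a, b = socket.socketpair()
    try:
        fd = a.fileno()
        registry = {fd: a}
        a.close()
        return fd, lookup_fd(registry, a)
    finally:
        b.close()
```

The scan is linear in the number of registered objects, but it only runs on the error path, which matches the argument above that its cost is unimportant.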
|
msg205411 - (view) |
Author: Guido van Rossum (gvanrossum) * |
Date: 2013-12-06 23:18 |
OK, I'll make a new patch (maybe Monday). I want to be a little more careful with the exhaustive search, I think it should not be attempted if we see KeyError or AttributeError (those should not be dynamic).
I tested for the epoll error on Ubuntu and didn't get OSError, but I'm happy to keep it in if the man page says so. (How sure are you about poll() not doing this?)
|
msg205432 - (view) |
Author: Guido van Rossum (gvanrossum) * |
Date: 2013-12-07 01:51 |
So I think this is why epoll doesn't raise OSError:
http://hg.python.org/cpython/file/44dacafdd48a/Modules/selectmodule.c#l1335
The Python wrapper explicitly checks for EBADF and turns this into a non-error result.
|
msg205433 - (view) |
Author: Guido van Rossum (gvanrossum) * |
Date: 2013-12-07 01:58 |
Well, I take it back. If you close the FD and then reuse it, you get ENOENT, which is not caught. So we still need the try/except OSError.
I am going to experiment with a PollSelector as well.
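The close-then-reuse case can be reproduced directly with select.epoll. The sketch below is illustrative only: the function name is hypothetical, it returns None on platforms without epoll, and the errno reported is what is expected on Linux rather than something guaranteed on every kernel.

```python
import errno
import os
import select


def epoll_unregister_after_reuse():
    """Return the errno name raised by epoll.unregister() on a reused FD."""
    if not hasattr(select, "epoll"):
        return None  # epoll is Linux-only
    ep = select.epoll()
    r, w = os.pipe()
    r2, w2 = os.pipe()
    try:
        ep.register(r, select.EPOLLIN)
        os.close(r)      # the kernel drops r from the epoll set
        os.dup2(r2, r)   # the number r is valid again, for another pipe
        try:
            ep.unregister(r)
        except OSError as exc:
            return errno.errorcode.get(exc.errno)
        return "no error"
    finally:
        for fd in (r, w, r2, w2):
            os.close(fd)
        ep.close()
```

Because the descriptor number is valid when unregister() runs, the EBADF special case in the C wrapper does not apply, which is why a try/except OSError in the selector is still needed.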
|
msg205434 - (view) |
Author: Guido van Rossum (gvanrossum) * |
Date: 2013-12-07 02:00 |
PollSelector doesn't seem to have this behavior.
|
msg205438 - (view) |
Author: Guido van Rossum (gvanrossum) * |
Date: 2013-12-07 03:37 |
New patch. Please review.
The error handling is a bit complicated but I like to be careful in which errors I catch.
|
msg205468 - (view) |
Author: Guido van Rossum (gvanrossum) * |
Date: 2013-12-07 16:49 |
Sorry, here's another version. It keeps the original _fileobj_to_fd function and wraps it with a method that does the exhaustive search.
|
msg205478 - (view) |
Author: Guido van Rossum (gvanrossum) * |
Date: 2013-12-07 19:01 |
I think I got the closing sorted out now, and through reordering, the dup2() calls are actually needed.
|
msg205483 - (view) |
Author: Guido van Rossum (gvanrossum) * |
Date: 2013-12-07 20:22 |
OK, here's another try. I ran what you suggested for all three tests I added and they are all clean. I realized that every single call to socketpair() is followed by two addCleanup calls, so I added a make_socketpair() helper method that does this.
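A helper of the kind described above might look like the following sketch. The test class and the run_demo() function are hypothetical scaffolding added so the example can be run on its own; only the make_socketpair() name and the use of two addCleanup calls come from the comment above.

```python
import socket
import unittest


class SocketPairDemo(unittest.TestCase):

    def make_socketpair(self):
        """Create a socket pair whose ends are closed when the test ends."""
        rd, wr = socket.socketpair()
        self.addCleanup(rd.close)
        self.addCleanup(wr.close)
        return rd, wr

    def test_roundtrip(self):
        rd, wr = self.make_socketpair()
        wr.send(b"x")
        self.assertEqual(rd.recv(1), b"x")


def run_demo():
    """Run the demo test case and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SocketPairDemo)
    result = unittest.TestResult()
    suite.run(result)
    return result
```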
|
msg205496 - (view) |
Author: Charles-François Natali (neologix) * |
Date: 2013-12-07 22:53 |
LGTM!
|
msg205499 - (view) |
Author: Roundup Robot (python-dev) |
Date: 2013-12-07 23:57 |
New changeset 39e7995f9ad1 by Guido van Rossum in branch 'default':
Silently ignore unregistering closed files. Fixes issue 19876. With docs and slight test refactor.
http://hg.python.org/cpython/rev/39e7995f9ad1
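The behaviour this changeset aims for can be checked with a short script. This is a sketch, assuming a Python version that includes the change; the function name is hypothetical.

```python
import os
import selectors


def unregister_closed_fd():
    """Register a pipe's read end, close it, then unregister it."""
    r, w = os.pipe()
    sel = selectors.DefaultSelector()
    try:
        sel.register(r, selectors.EVENT_READ)
        os.close(r)
        # With the change, this returns the key instead of raising
        # OSError, whichever selector implementation is in use.
        key = sel.unregister(r)
        return key.fd == r
    finally:
        os.close(w)
        sel.close()
```

Unregistering a file object that was never registered still raises KeyError; only the OSError from the underlying syscall is ignored.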
|
msg205500 - (view) |
Author: Guido van Rossum (gvanrossum) * |
Date: 2013-12-07 23:59 |
Is this worthy of a Misc/NEWS entry?
|
msg205501 - (view) |
Author: Roundup Robot (python-dev) |
Date: 2013-12-08 00:03 |
New changeset f334dd2471e7 by Guido van Rossum in branch 'default':
News item for issue 19876.
http://hg.python.org/cpython/rev/f334dd2471e7
|
msg205502 - (view) |
Author: Guido van Rossum (gvanrossum) * |
Date: 2013-12-08 00:04 |
Done.
|
msg205529 - (view) |
Author: Charles-François Natali (neologix) * |
Date: 2013-12-08 08:53 |
The test is failing on Windows buildbot:
http://buildbot.python.org/all/builders/x86%20Windows%20Server%202003%20%5BSB%5D%203.x/builds/1851/steps/test/logs/stdio
"""
======================================================================
ERROR: test_unregister_after_fd_close_and_reuse (test.test_selectors.DefaultSelectorTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "E:\Data\buildslave\cpython\3.x.snakebite-win2k3r2sp2-x86\build\lib\test\test_selectors.py", line 122, in test_unregister_after_fd_close_and_reuse
    os.dup2(rd2.fileno(), r)
OSError: [Errno 9] Bad file descriptor
======================================================================
ERROR: test_unregister_after_fd_close_and_reuse (test.test_selectors.SelectSelectorTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "E:\Data\buildslave\cpython\3.x.snakebite-win2k3r2sp2-x86\build\lib\test\test_selectors.py", line 122, in test_unregister_after_fd_close_and_reuse
    os.dup2(rd2.fileno(), r)
OSError: [Errno 9] Bad file descriptor
"""
Apparently, dup2() doesn't work on Windows because on Windows, sockets aren't file descriptors, but a different beast...
|
msg205546 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2013-12-08 11:22 |
I don't like generic "except OSError: pass". Here is a first patch for epoll() to use "except FileNotFoundError: pass" instead. Kqueue selector should also be patched.
I tested closing the epoll FD (os.close(epoll.fileno())): on Linux 3.11, epoll.unregister(fd) and epoll.close() don't raise an error. Strange. (The C code looks correct).
(About the commit: I don't like the "_fileobj_lookup" method name, we lose the information (compared to the "_fileobj_to_fd" name) that the method returns a file descriptor. I would prefer "_get_fd" or "_get_fileobj_fd".)
|
msg205551 - (view) |
Author: Charles-François Natali (neologix) * |
Date: 2013-12-08 12:11 |
> STINNER Victor added the comment:
>
> I don't like generic "except OSError: pass". Here is a first patch for epoll() to use "except FileNotFoundError: pass" instead. Kqueue selector should also be patched.
Except that it can fail with ENOENT, but also EBADF, and EPERM if the
FD has been reused by a FD which doesn't support epoll.
So if we want to go this way, we should at least catch ENOENT, EBADF
and EPERM. Same for kqueue: we should at least catch ENOENT and EBADF.
> I tested closing the epoll FD (os.close(epoll.fileno())): on Linux 3.11, epoll.unregister(fd) and epoll.close() don't raise an error. Strange. (The C code looks correct).
unregister() ignores EBADF.
> (About the commit: I don't like the "_fileobj_lookup" method name, we lose the information (compared to the "_fileobj_to_fd" name) that the method returns a file descriptor. I would prefer "_get_fd" or "_get_fileobj_fd".)
Well, Guido likes it, I like it, and this is really nit-picking
(especially since it's a private method).
|
msg205606 - (view) |
Author: Guido van Rossum (gvanrossum) * |
Date: 2013-12-08 21:06 |
I don't think we should be more selective about the errno values, the try block is narrow enough (just one syscall) and we really don't know what the kernel will do on different platforms. And what would we do about it anyway?
I will look into the Windows problem, but I suspect the best we can do there is skip the test.
|
msg205607 - (view) |
Author: Charles-François Natali (neologix) * |
Date: 2013-12-08 21:20 |
> I will look into the Windows problem, but I suspect the best we can do there is skip the test.
I already took care of that:
http://hg.python.org/cpython/rev/01676a4c16ff
|
msg205608 - (view) |
Author: Guido van Rossum (gvanrossum) * |
Date: 2013-12-08 21:21 |
Then here's a hopeful fix for the Windows situation that relies on the socketpair() operation reusing FDs from the lowest value. I'm adding asserts to check that this is actually the case. (These are actual assert statements to indicate that they are verifying an assumption internal to the test, not verifying the functionality under test.)
I'll test it when I next get near a Windows box (Monday in the office) -- or someone else with Windows access can let me know.
|
msg205631 - (view) |
Author: Roundup Robot (python-dev) |
Date: 2013-12-09 00:57 |
New changeset c4c1c4bc8086 by Victor Stinner in branch 'default':
Issue #19876: Run also test_selectors.test_unregister_after_fd_close_and_reuse() on Windows
http://hg.python.org/cpython/rev/c4c1c4bc8086
|
msg205632 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2013-12-09 00:59 |
Oops, I reverted my changeset c4c1c4bc8086, I didn't read why the test was skipped on Windows. Sorry.
|
msg205633 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2013-12-09 01:01 |
"Except that it can fail with ENOENT, but also EBADF, and EPERM if the
FD has been reused by a FD which doesn't support epoll."
Oh, I didn't know that. I ran the unit tests, and I expected the two unit tests to cover every error case. So ignore my epoll_except.patch, the current "except OSError: pass" is correct.
|
msg205724 - (view) |
Author: Guido van Rossum (gvanrossum) * |
Date: 2013-12-09 18:22 |
Sadly, the optimistic code doesn't work on Windows. I think it may be because the socketpair() helper at the top of test_selectors.py uses an extra socket ('l') and the handles just don't match up (I get a failure on assert wr2.fileno() == w). So I propose to stick with the current solution of skipping the test on Windows.
I'll close this bug in 24 hours unless I get a response sooner.
|
msg205725 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2013-12-09 18:24 |
The current test using os.dup2() with a skip on Windows is fine.
|
msg205726 - (view) |
Author: Guido van Rossum (gvanrossum) * |
Date: 2013-12-09 18:28 |
OK, closed.
|
|
Date |
User |
Action |
Args |
2022-04-11 14:57:54 | admin | set | github: 64075 |
2013-12-09 18:28:31 | gvanrossum | set | status: open -> closed resolution: fixed messages:
+ msg205726
|
2013-12-09 18:24:58 | vstinner | set | messages:
+ msg205725 |
2013-12-09 18:22:46 | gvanrossum | set | messages:
+ msg205724 |
2013-12-09 01:01:06 | vstinner | set | messages:
+ msg205633 |
2013-12-09 00:59:59 | vstinner | set | messages:
+ msg205632 |
2013-12-09 00:57:33 | python-dev | set | messages:
+ msg205631 |
2013-12-08 21:21:43 | gvanrossum | set | files:
+ nodup2.diff
messages:
+ msg205608 |
2013-12-08 21:20:00 | neologix | set | messages:
+ msg205607 |
2013-12-08 21:06:48 | gvanrossum | set | messages:
+ msg205606 |
2013-12-08 12:12:00 | neologix | set | messages:
+ msg205551 |
2013-12-08 11:22:14 | vstinner | set | files:
+ epoll_except.patch resolution: fixed -> (no value) messages:
+ msg205546
|
2013-12-08 08:53:32 | neologix | set | status: closed -> open
messages:
+ msg205529 |
2013-12-08 00:04:55 | gvanrossum | set | status: open -> closed type: behavior messages:
+ msg205502
assignee: neologix -> gvanrossum resolution: fixed stage: resolved |
2013-12-08 00:03:46 | python-dev | set | messages:
+ msg205501 |
2013-12-07 23:59:32 | gvanrossum | set | messages:
+ msg205500 |
2013-12-07 23:57:14 | python-dev | set | nosy:
+ python-dev messages:
+ msg205499
|
2013-12-07 22:53:07 | neologix | set | messages:
+ msg205496 |
2013-12-07 20:22:29 | gvanrossum | set | files:
+ unregister8.diff
messages:
+ msg205483 |
2013-12-07 19:01:23 | gvanrossum | set | files:
+ unregister7.diff
messages:
+ msg205478 |
2013-12-07 16:49:39 | gvanrossum | set | files:
+ unregister6.diff
messages:
+ msg205468 |
2013-12-07 03:37:32 | gvanrossum | set | files:
+ unregister5.diff
messages:
+ msg205438 |
2013-12-07 02:00:32 | gvanrossum | set | messages:
+ msg205434 |
2013-12-07 01:58:56 | gvanrossum | set | messages:
+ msg205433 |
2013-12-07 01:51:30 | gvanrossum | set | messages:
+ msg205432 |
2013-12-06 23:18:50 | gvanrossum | set | messages:
+ msg205411 |
2013-12-06 22:59:10 | neologix | set | messages:
+ msg205410 |
2013-12-06 22:15:16 | gvanrossum | set | files:
+ unregister4.diff
messages:
+ msg205409 |
2013-12-06 22:00:53 | gvanrossum | set | files:
+ unregister3.diff
messages:
+ msg205408 |
2013-12-06 21:40:32 | gvanrossum | set | files:
+ unregister2.diff
messages:
+ msg205406 |
2013-12-06 16:46:46 | neologix | set | messages:
+ msg205385 |
2013-12-06 16:01:24 | gvanrossum | set | assignee: docs@python -> neologix messages:
+ msg205382 |
2013-12-05 23:14:54 | gvanrossum | set | files:
+ unregister.diff keywords:
+ patch messages:
+ msg205333
|
2013-12-04 19:09:06 | gvanrossum | set | messages:
+ msg205233 |
2013-12-04 18:23:26 | gvanrossum | set | messages:
+ msg205230 |
2013-12-03 22:46:04 | gvanrossum | set | messages:
+ msg205166 |
2013-12-03 22:45:32 | gvanrossum | set | messages:
+ msg205165 |
2013-12-03 22:24:36 | vstinner | set | messages:
+ msg205162 |
2013-12-03 21:48:50 | neologix | set | messages:
+ msg205160 |
2013-12-03 19:51:45 | gvanrossum | set | messages:
+ msg205142 |
2013-12-03 18:41:34 | vstinner | set | messages:
+ msg205136 |
2013-12-03 18:35:20 | gvanrossum | set | messages:
+ msg205135 |
2013-12-03 18:30:04 | neologix | set | messages:
+ msg205134 |
2013-12-03 15:45:37 | gvanrossum | set | messages:
+ msg205123 |
2013-12-03 14:11:19 | vstinner | create | |