add a Selector to the select module #61057

neologix · 2013-01-03T17:38:06Z

BPO	16853
Nosy	@gvanrossum, @jcea, @pitrou, @vstinner, @giampaolo, @tiran, @meadori, @felipecruz
Files	selector_multiprocessing.diff selector_telnetlib.diff selector_bench.py tulip-selectors-1.diff: Tulip patches selector-8.diff tulip-selectors-2.diff tulip-selectors-3.diff: Fixed tulip patch tulip-selectors-3.diff return_key_fix.diff selector-9.diff selector-11.diff tulip_selector_stdlib.diff selector-12.diff selectors-14.diff selectors-15.diff

^{Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.}

Show more details

GitHub fields:

assignee = None
closed_at = <Date 2013-09-04.18:47:05.879>
created_at = <Date 2013-01-03.17:38:06.461>
labels = ['type-feature', 'library']
title = 'add a Selector to the select module'
updated_at = <Date 2014-03-17.19:47:21.180>
user = 'https://bugs.python.org/neologix'

bugs.python.org fields:

activity = <Date 2014-03-17.19:47:21.180>
actor = 'jcea'
assignee = 'none'
closed = True
closed_date = <Date 2013-09-04.18:47:05.879>
closer = 'pitrou'
components = ['Library (Lib)']
creation = <Date 2013-01-03.17:38:06.461>
creator = 'neologix'
dependencies = []
files = ['28547', '28548', '28557', '28604', '28621', '28622', '28636', '28637', '28638', '31116', '31117', '31118', '31133', '31499', '31536']
hgrepos = ['167']
issue_num = 16853
keywords = ['patch', 'needs review']
message_count = 106.0
messages = ['178970', '178973', '178974', '178979', '178980', '178981', '178992', '179001', '179018', '179042', '179046', '179066', '179070', '179074', '179075', '179078', '179082', '179120', '179122', '179126', '179127', '179129', '179135', '179142', '179144', '179156', '179161', '179231', '179240', '179241', '179242', '179252', '179255', '179275', '179299', '179300', '179328', '179365', '179381', '179382', '179383', '179384', '179385', '179387', '179389', '179392', '179423', '179481', '179486', '179489', '179490', '179491', '186286', '186617', '194179', '194182', '194184', '194190', '194218', '194236', '194243', '194590', '194592', '194593', '195722', '195724', '195726', '195749', '195760', '195801', '196075', '196092', '196098', '196099', '196240', '196403', '196527', '196536', '196537', '196575', '196596', '196620', '196625', '196632', '196633', '196634', '196635', '196641', '196649', '196651', '196654', '196926', '196930', '196931', '196932', '196933', '196934', '196935', '196938', '196939', '196940', '196941', '196957', '196984', '196996', '197010']
nosy_count = 12.0
nosy_names = ['gvanrossum', 'jcea', 'pitrou', 'vstinner', 'giampaolo.rodola', 'christian.heimes', 'meador.inge', 'neologix', 'rosslagerwall', 'python-dev', 'sbt', 'felipecruz']
pr_nums = []
priority = 'normal'
resolution = 'fixed'
stage = 'resolved'
status = 'closed'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue16853'
versions = ['Python 3.4']

neologix · 2013-01-03T17:38:05Z

Recently, the multiprocessing and telnetlib modules have been patched to use poll() instead of select() when available (mainly for the FD_SETSIZE limitation):
http://bugs.python.org/issue10527
http://bugs.python.org/issue14635

This leads to code duplication (just have a look at the commits to convince yourself), and also it highlights the fact that the select module is too low-level: right now, there's no easy way to do portable and efficient I/O multiplexing. What's more, /dev/poll and epoll() support have been added recently, which can make one puzzled by the myriad of available options.

Therefore, to reduce code duplication, and make writing portable and efficient code easier, I suggest the addition of a new select.Selector() class, which offers a - hopefully - simple and consistent API.
It has the following advantages:

portable
consistent (select() accepts and returns file objects, whereas pol() only returns FD, the timeouts don't have the same units - seconds/milliseconds, etc).

You'll find a tentative implementation attached, with tests (but no docs).

I'm also attaching two patches, one for multiprocessing.connection and the other for telnetlib, to show the before/after.

I'd like to have your feedback, especially on those points:

right now, Selector.select() returns a set of (fd, event mask). What do you think, would a bare list be OK?
the current version doesn't retry on EINTR, and I'd like to catch it by default: what do you think
since I use an abstract base class, I don't know how to have docstrings show up in the help

felipecruz · 2013-01-03T18:44:45Z

I think you have a point. Did you know about the tulip project?

http://code.google.com/p/tulip/source/browse/tulip/unix_events.py#76

It has a PollsterBase class and a SelectPollster(PollsterBase) so the idea is to have a Poller(and you call poll()) but select can be used underneath. Since tulip will be merged to stdlib, maybe you can work on tulip itself.

Current tulip Pollers don't return (fd, event) but I have a fork that does for Poll, EPoll, KQueue and WindowsPollster: https://bitbucket.org/felipecruz/tulip/

The selector class doesn't have this change (return fd, mask tuple) right now.

pitrou · 2013-01-03T19:18:28Z

tulip is much bigger than simply a select / poll wrapper. It would probably make more sense for tulip to reuse the abstraction which is proposed here.

neologix · 2013-01-03T19:53:04Z

I think you have a point. Did you know about the tulip project?

http://code.google.com/p/tulip/source/browse/tulip/unix_events.py#76

It has a PollsterBase class and a SelectPollster(PollsterBase) so the idea is to have a Poller(and you call poll()) but select can be used underneath. Since tulip will be merged to stdlib, maybe you can work on tulip itself.

I've had a look, and indeed it has its own Pollster wrapper.
Since it can be useful outside of Tulip, it would probably make sense
to make it part of the expose API.

Current tulip Pollers don't return (fd, event) but I have a fork that does for Poll, EPoll, KQueue and WindowsPollster

Indeed, I think it would be useful.

I'm not sure what's best, accept events masks or have distinct methods
to register read and write events.

neologix · 2013-01-03T19:57:16Z

Oh, I noticed that EpollPollster never closes the underlying epoll FD:
that's why Pollster objects should have a close() method (and probably
support a context manager).

felipecruz · 2013-01-03T19:57:23Z

Hi Antonie,

What you said also makes sense to me.

There is one problem(?) that _map_events() is called for every event(for Poll and EPoll) and this can make things slower (I didn't tested it). Also, does it needs to be thread-safe?

neologix · 2013-01-03T22:26:45Z

Here's a new version closer to Tulip's one.

gvanrossum · 2013-01-04T01:37:05Z

I think this is a great idea. But let's wait until Tulip is a bit further along; the design of its Pollster might still change as PEP-3156 gets more review and feedback. According to PEP-429 the first beta of 3.4 won't go out until November this year.

neologix · 2013-01-04T13:18:09Z

Here's a new version adressing Guido's comments (except for kqueue, for which I'll add support later when I can test it).

I'm also attaching a benchmark to compare the implementations: as noted by Guido, the complexity of select/poll/epoll are the theoretical ones: in real life, the syscall's cost is probably dwarfed by objects creation, etc.

Here's a run, for two ready FDs among 1018:
"""
$ ./python ~/selector_bench.py -r 2 -m 1018 -t socket
Trying with 2 ready FDs out of 1018, type socket
<class 'select.EpollSelector'>
0.056010190999586484
<class 'select.PollSelector'>
0.2639519829990604
<class 'select.SelectSelector'>
1.1859817369986558
"""

So this can be interesting when a large number of FDs are monitored.

For sake of cmpleteness, for a sparse allocation (i.e. just a couple FDS allocated near FD_SETSIZE), there's not much gain:
"""
$ ./python ~/selector_bench.py -r 2 -m 1018 -t socket -s
Trying with 2 FDs starting at 1018, type socket
<class 'select.EpollSelector'>
0.06651040699944133
<class 'select.PollSelector'>
0.06033727799876942
<class 'select.SelectSelector'>
0.0948788189998595
"""

Two points I'm not sure about:

should EINTR be handled (i.e. retry, with an updated timeout). I'm tempted to say yes, because EINTR is just a pain the user should never be exposed with.
what should be done with POLLNVAL and POLLERR? Raise an exception (that's what Java does, but since you can get quite quite easily it would be a pain to use), return a generic SELECT_ERR event? FWIW, twisted returns POLLERR|POLLNVAL as a CONNECTION_LOST event.

gvanrossum · 2013-01-04T16:48:11Z

Would it make sense to explore this in the Tulip project first? It could be a new module, tulip/selector.py. (Heck, I'll even give you commit privileges in Tulip.)

Also, I've heard (but don't know from personal experience) that Jython supports select but it takes only lists of socket objects, not file descriptors (which don't exist in Java).

Finally, what about Windows?

pitrou · 2013-01-04T17:06:20Z

I've posted a review on Rietveld. Not sure the notification e-mail was sent since I got a weird response from the server.

should EINTR be handled (i.e. retry, with an updated timeout). I'm tempted to say
yes, because EINTR is just a pain the user should never be exposed with.

You could add an optional argument to the select() method?

what should be done with POLLNVAL and POLLERR? Raise an exception (that's what Java
does, but since you can get quite quite easily it would be a pain to use), return a
generic SELECT_ERR event?

I would say return SELECT_ERR.

So this can be interesting when a large number of FDs are monitored.

The difference is impressive indeed.

giampaolo · 2013-01-04T18:44:06Z

Is this currently being written from scratch? It shouldn't be IMO.
Instead I recommend using the existent Tornado or pyftpdlib IO loop as an example:

https://github.com/facebook/tornado/tree/master/tornado/platform
https://github.com/facebook/tornado/blob/master/tornado/ioloop.py#L544

http://code.google.com/p/pyftpdlib/source/browse/trunk/pyftpdlib/ioloop.py?spec=svn1144&r=1135#266

...and modify the central polling method so that it returns a list of (fd, events) instead of calling handler methods.
Everything (except Windows support) is already in place and they have been tested extensively.

Exploring this into Tulip would be important IMO, as it would help figuring out whether the API is good enough to support what this component aims to achieve: being a generic poller upon which to build async IO loops.

gvanrossum · 2013-01-04T19:11:39Z

Giampaolo, there are many details in the Tornado and pyftpdlib I/O loops that may or may not be intentional. I wrote the Tulip pollster from scratch but am looking to merge in relevant knowledge and features from Tornado, Twisted and pyftpdlib. AFAIK the code here is mostly adapted from Tulip. Please suggest specific features from Tornado or pyftpdlib that are currently missing. However I don't think we should just copy over the existing 3rd party code (if only because we're trying to write pure Python 3.3 code here, without historical baggage).

rosslagerwall · 2013-01-04T19:53:08Z

Interesting benchmark. There is no gain for epoll with a large number of
ready fds (as expected) but at least it is no worse than poll. Poll offers
a large improvement on select, in this case.

$ ./python selector_bench.py -r 2 -m 1000 -t pipe
Trying with 2 ready FDs out of 1000, type pipe
<class 'select.EpollSelector'>
0.004317363000154728
<class 'select.PollSelector'>
0.08897221900042496
<class 'select.SelectSelector'>
0.1558117479999055

$ ./python selector_bench.py -r 100 -m 1000 -t pipe
Trying with 100 ready FDs out of 1000, type pipe
<class 'select.EpollSelector'>
0.11067074099992169
<class 'select.PollSelector'>
0.18777027299984184
<class 'select.SelectSelector'>
0.30157805000089866

$ ./python selector_bench.py -r 1000 -m 1000 -t pipe
Trying with 1000 ready FDs out of 1000, type pipe
<class 'select.EpollSelector'>
1.08963231099915
<class 'select.PollSelector'>
1.111169247998987
<class 'select.SelectSelector'>
8.283167362000313

neologix · 2013-01-04T19:57:42Z

Indeed, tornado and pyftpdlib implementations are much more complex,
have somewhat different APIs (i.e. only support FDs), have special
cases for older python versions, lack some newer features (e.g.
context manager).
The current code is much close to Tulip's.

Exploring this into Tulip would be important IMO, as it would help figuring out whether the API is good enough to support what this component aims to achieve: being a generic poller upon which to build async IO loops.

That's indeed a good idea, there's no need to rush this in.

neologix · 2013-01-04T20:34:17Z

Ross, the select() result for a large number of ready FDs was completely skewed because of a bug spotted by Antoine (complexity was much worse than it ought to be).
Here are the results with the fix:
"""
$ ./python ~/selector_bench.py -r 10 -m 1000 -t socket
Trying with 10 ready FDs out of 1000, type socket
<class 'select.EpollSelector'>
0.05156186099338811
<class 'select.PollSelector'>
0.23772043800272513
<class 'select.SelectSelector'>
0.5181516080047004
$ ./python ~/selector_bench.py -r 100 -m 1000 -t socket
Trying with 100 ready FDs out of 1000, type socket
<class 'select.EpollSelector'>
0.47576940699946135
<class 'select.PollSelector'>
0.6458770600002026
<class 'select.SelectSelector'>
0.828609222000523
$ ./python ~/selector_bench.py -r 1000 -m 1000 -t socket
Trying with 1000 ready FDs out of 1000, type socket
<class 'select.EpollSelector'>
4.970445963997918
<class 'select.PollSelector'>
5.7709292660001665
<class 'select.SelectSelector'>
4.030775418999838
"""

With a large number of FDs, many of which are ready, select can be faster.
Here's the output of "strace -c -e select,poll,epoll_wait":
"""

% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
83.86 0.000421 0 1024 epoll_wait
16.14 0.000081 0 1024 poll
0.00 0.000000 0 1024 select
------ ----------- ----------- --------- --------- ----------------
100.00 0.000502 3072 total
"""

The systime is greater in epoll, but the systime is negligible, as show by 'time':
"""

real 0m14.856s
user 0m14.289s
sys 0m0.340s
"""

The time is really spent in the interpreter.
I'll dig some more (even though I doubt having 1000/1000 ready FDs is a common use case).

Note that it doesn't change anything when a small number of FDs are ready:
"""

$ ./python ~/selector_bench.py -r 10 -m 1000 -t socket
Trying with 10 ready FDs out of 1000, type socket
<class 'select.EpollSelector'>
0.05238822099636309
<class 'select.PollSelector'>
0.25595822899776977
<class 'select.SelectSelector'>
0.5156362060006359
"""

rosslagerwall · 2013-01-04T21:35:01Z

Ross, the select() result for a large number of ready FDs was
completely skewed because of a bug spotted by Antoine (complexity
was much worse than it ought to be).

Ah, that makes a little more sense now.

neologix · 2013-01-05T12:40:09Z

Here's a new version, with the following changes:

there's no SELECT_ERR anymore: error conditions (POLLNVAL|POLLERR and also POLLHUP) are reported as SELECT_IN|SELECT_OUT, depending on the input event mask: I don't think that a separate error flag is needed, because the error will be reported when the FD will be read/written (and in the context where the error can be handled appropriately, which wouldn't be the case if an exception was thrown by the Selector). That's what Java and libevent do, for example. libevent reports IN|OUT inconditionally in case of POLLER|POLLHUP, but I think that the output events should be a subset ot input events.
the exceptions set isn't passed to select() anymore: exception events (like OOB) are already reported in the read or write sets (checked with libevent and in the Linux kernel source)
most importantly, there was something missing/annoying in the API: now, it only accepts only file descriptors, and also an optional opaque object: the object - called "attached data" - can be used for example to pass a callback (that can be invoked by a higher-level reactor), or for example context-specific data: it could be a kind of a thread state to make it easy to do a context-switch, or simply a high-level object like a file, a socket or a telnet instance. This is actually what Java does, but they take it further into encapsulating the FD, interested events, attached data and ready events into a SelectionKey, which is passed and returned to/from the Selector: http://docs.oracle.com/javase/7/docs/api/java/nio/channels/SelectionKey.html
I've added some tests

giampaolo · 2013-01-05T14:16:40Z

@guido: agreed.

there's no SELECT_ERR anymore [...] the error will be reported when the FD will be read/written

I don't think this is a good idea, particularly during this early stage.
This assumption might be valid for select() but not for poll() and epoll() where (POLLERR | POLLHUP | POLLNVAL) is an alias for "connection lost", and that's an event which both Twisted and Tornado take into account and treat differently than a pure "read" event.
The same assumption should also apply for kqueue().

the exceptions set isn't passed to select() anymore: exception events (like OOB) are already reported in the read or write sets

That's true only if SO_OOBINLINE has been pre-emptively set against the socket.
It might be ok to ignore select()'s exceptional events anyway though.
This is probably another thing you might want to decide later.

there was something missing/annoying in the API [...] now, it only accepts only file descriptors, and also an optional opaque 'data' object

I like this change as it avoids maintaining a separate fd->handler map.

Will add line-by-line comments by using the review tool.

giampaolo · 2013-01-05T15:22:39Z

Inline comments:
http://bugs.python.org/review/16853/

neologix · 2013-01-05T15:30:11Z

> there's no SELECT_ERR anymore [...] the error will be reported when the FD will be read/written

I don't think this is a good idea, particularly during this early stage.
This assumption might be valid for select() but not for poll() and epoll() where (POLLERR | POLLHUP | POLLNVAL) is an alias for "connection lost", and that's an event which both Twisted and Tornado take into account and treat differently than a pure "read" event.

POLLERR and POLLNVAL semantics can be really different from
"connection lost", so this mapping sounds wrong. The actual error
(which may be ECONNRESET, but also EBADF, ENOBUFS...) will be raised
upon the next call to read()/write(), or can be retrieved through
getsockopt(SO_ERROR).

Also, that's exactly what the Linux kernel does for select():
http://lxr.free-electrons.com/source/fs/select.c#L381
POLLHUP is mapped to the read-set, POLLERR both to the read and write set.

This means that "connection lost" could never be returned on platforms
where the select() syscall is used, which could lead to subtle
portability problems.

But I'm definitely open on this (I've based the behavior on libevent
and Java, FWIW).

> the exceptions set isn't passed to select() anymore: exception events (like OOB) are already reported in the read or write sets

That's true only if SO_OOBINLINE has been pre-emptively set against the socket.
It might be ok to ignore select()'s exceptional events anyway though.

Thanks, I didn't know: since the POSIX man pages was rather evasive on
this, I checked the Linux kernel source code, but apparently it's
non-standard. If we do ignore OOB in select(), then we probably also
want to suppress POLLPRI|EPOLLPRI, no?

giampaolo · 2013-01-05T16:28:53Z

The actual error (which may be ECONNRESET, but also EBADF, ENOBUFS...) will be raised
upon the next call to read()/write(), or can be retrieved through
getsockopt(SO_ERROR).

Mmmm... I would be tempted to think the same but the fact that both Tornado and Twisted distinguish between READ and ERROR events is suspicious.

then we probably also want to suppress POLLPRI|EPOLLPRI, no?

Yes.

gvanrossum · 2013-01-05T19:20:23Z

I think that this needs extensive tests that verify the behavior of many end cases, including under duress (e.g. when there are too many connections for the kernel to handle). That would seem the only way to make sure that the code is reliable across platforms. It is likely that you could borrow some ideas for test scenarios from Twisted.

neologix · 2013-01-05T20:23:27Z

I think that this needs extensive tests that verify the behavior of many end cases, including under duress (e.g. when there are too many connections for the kernel to handle). That would seem the only way to make sure that the code is reliable across platforms. It is likely that you could borrow some ideas for test scenarios from Twisted.

Will do.

I'm adding a new version taking into account some of Giampaolo's remarks.

Also, the API now also allows passing a file descriptor or any object
with a fileno() method, since it will likely be useful.

To sum up, the API is:

    def register(self, fileobj, events, data=None):
        """Register a file object.

    Parameters:
    fileobj -- file object
    events  -- events to monitor (bitwise mask of SELECT_IN|SELECT_OUT)
    data    -- attached data
    """

    def unregister(self, fileobj):
        """Unregister a file object.

    Parameters:
    fileobj -- file object
    """

    def modify(self, fileobj, events, data=None):
        """Change a registered file object monitored events or attached data.

    Parameters:
    fileobj -- file object
    events  -- events to monitor (bitwise mask of SELECT_IN|SELECT_OUT)
    data    -- attached data
    """

    def select(self, timeout=None):
        """Perform the actual selection, until some monitored file objects are
        ready or a timeout expires.

    Parameters:
    timeout -- if timeout > 0, this specifies the maximum wait time, in
               seconds
               if timeout == 0, the select() call won't block, and will
               report the currently ready file objects
               if timeout is None, select() will block until a monitored
               file object becomes ready

    Returns:
    list of (fileobj, events, attached data) for ready file objects
    `events` is a bitwise mask of SELECT_IN|SELECT_OUT

Selector.select() output looks a lot like poll()/epoll() except for
two details: the output is the file object, and not the file
descriptor (poll()/epoll() are unfortunately inconsistent in this
regard), and there's a third field, the attached data (will be None if
not provided in register()/modify()). I think that this optional field
is really useful to pass e.g. a callback or some context information.

meadori · 2013-01-05T21:05:59Z

The EpollSelector and PollSelector implementations are basically identical. You might want to refactor these to share more code.

neologix · 2013-01-06T00:28:54Z

I've noticed a bug present in Tulip, pyftpdlib and tornado (and this
implementation, too): none of them passes the maxevents argument to
epoll.poll(), and by default the poll() wrapper in the socket module
uses FD_SETSIZE-1, so you'll never get more than FD_SETSIZE-1 events.

Note that the default value used by epoll() is probably too low (we
could maybe use an heuristic to use RLIMIT_NOFILE hard limit, with
capping because it can be quite large on FreeBSD or the Hurd ;-), or
at least the documentation doesn't warn users enough about this. The
signature reads:
"""
.. method:: epoll.poll(timeout=-1, maxevents=-1)

Wait for events. timeout in seconds (float)
"""

Which confuses users into thinking that the number of returned events
is unlimited by default.

The EpollSelector and PollSelector implementations are basically identical. You might want to refactor these to share more code.

Yes, but I find the current code much easier to read: epoll() and
poll() have different constants, different units for timeout, epoll()
must accept a maxevents argument, have a close() method, etc.

gvanrossum · 2013-01-06T00:59:10Z

I am trying to use this module in Tulip instead of its pollster implementation, and am running into a problem. Tulip defines separate add_reader()/add_writer() methods, which call to the pollster's register_reader()/register_writer(), respectively. The translation of this to the selector's API is somewhat complex; I would have to keep track of whether I already have a reader or writer registered, and then decide whether to call register() or modify(). If you don't want to change the API back to separate register_*() methods for readers and writers, perhaps you can add a method that tells me, for a given fileobj, whether it is registered, and with which poll flags (SELECT_IN/OUT/both) and the user data?

Also, I need a method that gives me the number of registered FDs.

neologix · 2013-01-06T21:36:14Z

If you don't want to change the API back to separate register_*() methods for readers and writers, perhaps you can add a method that tells me, for a given fileobj, whether it is registered, and with which poll flags (SELECT_IN/OUT/both) and the user data.

Also, I need a method that gives me the number of registered FDs.

Does that look OK?

    def get_info(self, fileobj):
        """Return information about a registered file object.

    Returns:
    (events, data) associated to this file object

    Raises KeyError if the file object is not registered.
    """

    def registered_count(self):
        """Return the number of registered file objects.

    Returns:
    number of currently registered file objects
    """

giampaolo · 2013-08-30T13:39:47Z

That should be pretty easy to add by someone who has access to a
Solaris box: I could use the buildbots, but it'd take a couple
iterations to get it right.

I can probably help with that.

giampaolo · 2013-08-30T21:04:13Z

I'm playing with devpoll on Open Solaris and it seems it works exactly as poll(). Basically a copy & paste of PollSelector class with a different self._poll attribute will suffice.
With dev/poll support you now have 3 classes (PollSelector, EpollSelector, DevpollSelector) which look basically the same except for very little details. It looks like there's space for some code reuse / refactoring.

gvanrossum · 2013-08-30T23:22:21Z

On Fri, Aug 30, 2013 at 6:34 AM, Charles-François Natali
<report@bugs.python.org> wrote:
[Giampaolo Rodola']

> - I'm not sure why in case of EINTR you retry with a different timeout; can't you just return []?

Because when I do:
selector.select(10)

I expect the selector to wait 10 seconds if no event occurs: the fact
that a signal comes in shouldn't break the contract made by the API,
i.e. that it will return when a FD is ready or the timeout expires.

I disagree. In this case it is almost certain that there is already
some kind of retry loop around the Selector.select() call, so there is
no need for the retry. (Certainly this is how it happens in Tulip.) In
the case of select() the contract is actually quite clear, and there
shouldn't be any problem with returning an empty list early. (This is
quite different from e.g. an interrupted read() or recv() call, where
there is no existing return value that you can return safely, since
their users don't expect None and will usually treat an empty string
as an EOF indicator.)

Early return can lead to spurious errors: just imagine you send a
request to a busy server: it would be bad to raise a timeout error
just because the user put the client in the background with CTRL-Z
(which results in SIGSTOP).

Yeah, and Tulip's retry handles this just fine.

I also noticed that you are hardcoding a dependency on
time.monotonic() if it exists. If it weren't for your desire to "fix"
this issue you wouldn't have needed a time function at all!

> - this is probably because I'm paranoid about performances but given that select() method will be called repeatedly I would not use a decorator. Also, similarly to what has been done elsewhere in the stdlib, for "critical" parts I would recommend localizing variable access in order to minimize overhead as in:
>
> def select(self, timeout=None):
> ...
> key_from_fd = self._key_from_fd
> ready_append = ready.append
> for fd in r | w:
> ...
> key = key_from_fd(fd)
> if key:
> ready_append((key, events & key.events))

I find that localizing variables leads to unreadable code, and is tied
to the current CPython interpreter implementation: such optimizations
belong to the interpreter, not user code.

As for the decorator performance overhead, I don't think it weights
much compared to the cost of a syscall (+ GIL acquire/release):
with decorator:
$ ./python -m timeit -s "from selectors import DefaultSelector,
EVENT_WRITE; import os; s = DefaultSelector();
s.register(os.pipe()[1], EVENT_WRITE)" "s.select()"
100000 loops, best of 3: 3.69 usec per loop
without decorator:
$ ./python -m timeit -s "from selectors import DefaultSelector,
EVENT_WRITE; import os; s = DefaultSelector();
s.register(os.pipe()[1], EVENT_WRITE)" "s.select()"
100000 loops, best of 3: 3.52 usec per loop

That's a 4% overhead, with a single FD that's always ready (and I
suspect that most of the overhead is due to the call to time(), not
the decorator per se).

But still I agree with Giampaolo that the decorator doesn't feel
right. It's like a try/except around the entire body of your function,
which is usually a code smell (even though I realize the exception it
catches can only be raised by the syscall). Please go back to the
explicit try/except returning [].

Also, I'll shortly propose a patch to handle EINTR within C code, so
those EINTR wrappers won't be needed anymore.

Hm, it may be a while before everyone is comfortable with that change.
Who knows what might break?

neologix · 2013-08-31T07:32:07Z

2013/8/31 Guido van Rossum <report@bugs.python.org>:

But still I agree with Giampaolo that the decorator doesn't feel
right. It's like a try/except around the entire body of your function,
which is usually a code smell (even though I realize the exception it
catches can only be raised by the syscall). Please go back to the
explicit try/except returning [].

Well, I'll change it, but:
We have to change the select() docstring to "Wait until some
registered file objects become ready, or for *at most* timeout
seconds"

All user code that wants to handle EINTR or correct timeout (which
should be "all user code" in a perfect world) will have to wrap the
call to select() inside a loop equivalent to this decorator, and I
personally like the a rule of "never let the user do what the library
can do for him".
This includes for example multiprocessing and telnetlib which
currently implement this EINTR handling loop with timeout computation,
see e.g.:
http://hg.python.org/cpython/file/acc7439b1406/Lib/telnetlib.py#l297

And other places in the stdlib either don't handle EINTR properly, or
don't re-compute timeout (see e.g. http://bugs.python.org/issue12338).

Also, I don't really see the problem with retrying upon EINTR, since:

the signal handler will be called right away anyway
in case where the select() is a part of an event loop, the event
loop registers a wakup file descriptor with the selector, which means
that upon interesting signal delivery, the select() call will return
to the schedulor

So in short, I don't see how that could be a nuisance, but I can
certainly see how this could trick user code into raising spurious
timeout-related errors, or be much more complicated than it ought to
be (the above telnetlib example is IMO a great example of why EINTR
and timeout computation should be handled at the selector level).

neologix · 2013-08-31T07:54:36Z

Oh, and FWIW, that's also what Java does:

http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=5100121
http://hg.openjdk.java.net/jdk6/jdk6-gate/jdk/file/6daa81bdfd18/src/solaris/native/sun/nio/ch/PollArrayWrapper.c

(I know some people don't like Java, but I've found that they have
some really competent people and an extensive real-life exposure and
feedback).

giampaolo · 2013-08-31T12:09:35Z

I may be missing something here but isn't the whole point of EINTR to interrupt a potentially long running syscall?
Why protract function return time as a consequence?
All the event loops I'm familiar with (twisted, tornado and asyncore) just 'return' on EINTR, supposedly because that was the best compromise to support syscall interruption without forcing users to handle EINTR themselves.

pitrou · 2013-08-31T12:23:07Z

I may be missing something here but isn't the whole point of EINTR to
interrupt a potentially long running syscall?

Not exactly. The point is to signal (!) that a signal was received.
The received signal is not necessarily expected.

Also, from what I understand, it is more reliable to use a wakeup fd
(using e.g. signal.set_wakeup_fd) rather than expect EINTR to be
returned:
http://mail.python.org/pipermail/python-dev/2013-August/128204.html

pitrou · 2013-08-31T12:26:00Z

All the event loops I'm familiar with (twisted, tornado and asyncore)
just 'return' on EINTR, supposedly because that was the best
compromise to support syscall interruption without forcing users to
handle EINTR themselves.

What do you mean? Tornado simply retries on EINTR:
https://github.com/facebook/tornado/blob/master/tornado/ioloop.py#L652

neologix · 2013-08-31T12:28:39Z

I may be missing something here but isn't the whole point of EINTR to interrupt a potentially long running syscall?

No.
EINTR is an artifact of the early Unix design, because failing when a
signal was delivered was simpler than restarting the syscall.
Nowadays, most syscalls are restarted by default (that's was true on
BSD, and nowadays Linux also follows this trend).
See e.g. http://lkml.indiana.edu/hypermail/linux/kernel/0104.0/0743.html
and http://man7.org/linux/man-pages/man7/signal.7.html

One reason why EINTR can occur so often in CPython is because we set
up signal handlers without SA_RESTART (which is set by default by
signal(3)). The reason is likely that we want the syscall to return to
be able to call PyCheckSignals() (since we cannot run Python code from
a signal handler).
But since in this case, PyCheckSignals() is called anyway from the
main eval loop, signal handlers will still be run in a timely manner
(and as noted, the proper way to handle signals is to use a wakeup
file descriptor which can be used by select()).

gvanrossum · 2013-08-31T15:14:18Z

Charles-François Natali added the comment:

Well, I'll change it, but:
We have to change the select() docstring to "Wait until some
registered file objects become ready, or for *at most* timeout
seconds"

Doesn't sound so bad to me. It's about the same guarantee as sleep().

All user code that wants to handle EINTR or correct timeout (which
should be "all user code" in a perfect world) will have to wrap the
call to select() inside a loop equivalent to this decorator, and I
personally like the a rule of "never let the user do what the library
can do for him".

No, they'll use a framework with an event loop like Tulip that takes
care of all of this for them. Such frameworks all have logic in them
to handle early returns from the low-level select/poll operation.

I find it easier to reason about the correctness of the framework's
code without your decorator:

(1) I have to convince myself that the code wrapped by your decorator
doesn't change any global state, *or* that there is a guarantee that
the exception caught is only raised by the select()/poll()/etc.
syscall, not anywhere else in the wrapped method.

(2) I have to remember that if a signal handler is called that
modifies the event loop's deadline, the selector will return
immediately anyway (so the event loop can recalculate its deadline)
because of the self-pipe.

(3) I have to prove that your decorator uses the same clock as my framework.

(4) I have to prove that your code does the same thing if the process
is suspended for a really long time and the system clock changes in
the meantime.

This includes for example multiprocessing and telnetlib which
currently implement this EINTR handling loop with timeout computation,
see e.g.:
http://hg.python.org/cpython/file/acc7439b1406/Lib/telnetlib.py#l297

And other places in the stdlib either don't handle EINTR properly, or
don't re-compute timeout (see e.g. http://bugs.python.org/issue12338).

Also, I don't really see the problem with retrying upon EINTR, since:

the signal handler will be called right away anyway

in case where the select() is a part of an event loop, the event
loop registers a wakup file descriptor with the selector, which means
that upon interesting signal delivery, the select() call will return
to the schedulor

So in short, I don't see how that could be a nuisance, but I can
certainly see how this could trick user code into raising spurious
timeout-related errors, or be much more complicated than it ought to
be (the above telnetlib example is IMO a great example of why EINTR
and timeout computation should be handled at the selector level).

The selector is a helper for higher-level frameworks. I think the
EINTR handling belongs in the framework.

neologix · 2013-08-31T17:06:55Z

Here's a patch returning [] on EINTR.

I've tested it on Linux, Windows, FreeBSD, OpenSolaris and OS-X.

pitrou · 2013-08-31T17:15:46Z

Here's a patch returning [] on EINTR.

Hmm... Shouldn't it simply let the InterruptedError bubble up?
I know there's the sleep() precedent, but I find it a bit embarassing actually: the call can return early, but there's no way for the caller to know that it returned early.

gvanrossum · 2013-08-31T17:26:47Z

Hm... I like the [] return just fine, it matches what Tulip's selector did. The framework should be re-checking the clock anyway -- it's possible that select() returned late too...

python-dev · 2013-09-04T17:05:46Z

New changeset e4d45315c38c by Charles-François Natali in branch 'default':
Issue bpo-16853: Add new selectors module.
http://hg.python.org/cpython/rev/e4d45315c38c

vstinner · 2013-09-04T18:27:18Z

New changeset e4d45315c38c by Charles-François Natali in branch 'default':
Issue bpo-16853: Add new selectors module.
http://hg.python.org/cpython/rev/e4d45315c38c

Great! Congrats Charles-François for the new module! I tried to
implement such module once but I gave up because it was more complex
than what I expected.

python-dev · 2013-09-04T18:30:49Z

New changeset 6b14ebe0f7ac by Victor Stinner in branch 'default':
Issue bpo-16853: Mention the new selectors module in What's New in Python 3.4
http://hg.python.org/cpython/rev/6b14ebe0f7ac

gvanrossum · 2013-09-04T18:31:46Z

Agreed, this was a great feat of implementation and API design.

Is there anything left to do or can we close this issue?

(I looked at the Proactor code in Tulip again and I think it is not quite as easy to isolate it, so I'm not requesting that Proactors be added to the stdlib at this time.)

vstinner · 2013-09-04T18:35:44Z

Is there anything left to do or can we close this issue?

It would be nice to mention the new selectors module in the documentation of the select module.

pitrou · 2013-09-04T18:37:14Z

> Is there anything left to do or can we close this issue?

It would be nice to mention the new selectors module in the documentation of the select module.

Feel free to make documentation commits :) They don't necessarily have
to pass a review stage.

python-dev · 2013-09-04T18:40:27Z

New changeset 142ff216e4b4 by Victor Stinner in branch 'default':
Issue bpo-16853: Mention the new selectors module in the select module
http://hg.python.org/cpython/rev/142ff216e4b4

vstinner · 2013-09-04T18:45:34Z

Feel free to make documentation commits :)

Ok, done.

I also created the issue bpo-18923: "Use the new selectors module in the subprocess module".

pitrou · 2013-09-04T18:47:06Z

I feel this can be closed then :) Thanks, Charles-François!

giampaolo · 2013-09-04T18:48:10Z

Is there anything left to do or can we close this issue?

Add support for select.devpoll on Solaris.

neologix · 2013-09-04T18:49:26Z

Is there anything left to do or can we close this issue?

I usually wait until the test is run on the buildbots to check I didn't break anything: apparently that's OK, so we can close.

It would be nice to mention the new selectors module in the documentation of the select module.

Indeed (I mentioned the select module in selectors' doc, but the contrary is certainly useful to point users to this new module), thanks for your commit.

Where were you when I asked for documentation review ;-)

giampaolo · 2013-09-04T21:10:28Z

I have realized just now that at least epoll() and kqueue() selectors could take advantage and define their own modify() method (there was even a TODO I totally missed).
Anyway, from now on I'm gonna file separate issues.

neologix · 2013-09-05T08:28:00Z

Giampaolo Rodola' added the comment:

I have realized just now that at least epoll() and kqueue() selectors could take advantage and define their own modify() method (there was even a TODO I totally missed).

As a side note, in the general case, there's more than a performance
optimization: the problem with unregister() + register() vs a real
modify (e.g. EPOLL_CTL_MOD) is that it's subject to a race condition,
if an event fires during the short window where the FD isn't
registered anymore.

But since selectors use level-triggered notification, that's not an
issue: I'll probably add a comment in the code explaining why it's
safe, though.

sbt · 2013-09-05T13:09:02Z

On 05/09/2013 9:28am, Charles-François Natali wrote:

As a side note, in the general case, there's more than a performance
optimization: the problem with unregister() + register() vs a real
modify (e.g. EPOLL_CTL_MOD) is that it's subject to a race condition,
if an event fires during the short window where the FD isn't
registered anymore.

Even with edge-triggered notification, doesn't registering an fd check
the current readiness:

 >>> import os, select
 >>> r, w = os.pipe()
 >>> p = select.epoll()
 >>> os.write(w, b"h")
 >>> p.register(r, select.EPOLLIN | select.EPOLLET)
 >>> p.poll(0)
[(3, 1)]
 >>> p.poll(0)
[]

Otherwise it would be very difficult to edge-triggered notification.

neologix · 2013-09-05T16:42:34Z

Richard Oudkerk added the comment:

Even with edge-triggered notification, doesn't registering an fd check
the current readiness:

Hum...
Looks like I forgot to turn my brain on this morning :-)

neologix mannequin added stdlib Python modules in the Lib dir type-feature A feature request or enhancement labels Jan 3, 2013

pitrou closed this as completed Sep 4, 2013

ezio-melotti transferred this issue from another repository Apr 10, 2022

add a Selector to the select module #61057

add a Selector to the select module #61057

Comments

neologix mannequin commented Jan 3, 2013

neologix mannequin commented Jan 3, 2013

felipecruz mannequin commented Jan 3, 2013

pitrou commented Jan 3, 2013

neologix mannequin commented Jan 3, 2013

neologix mannequin commented Jan 3, 2013

felipecruz mannequin commented Jan 3, 2013

neologix mannequin commented Jan 3, 2013

gvanrossum commented Jan 4, 2013

neologix mannequin commented Jan 4, 2013

gvanrossum commented Jan 4, 2013

pitrou commented Jan 4, 2013

giampaolo commented Jan 4, 2013

gvanrossum commented Jan 4, 2013

rosslagerwall mannequin commented Jan 4, 2013

neologix mannequin commented Jan 4, 2013

neologix mannequin commented Jan 4, 2013

rosslagerwall mannequin commented Jan 4, 2013

neologix mannequin commented Jan 5, 2013

giampaolo commented Jan 5, 2013

giampaolo commented Jan 5, 2013

neologix mannequin commented Jan 5, 2013

giampaolo commented Jan 5, 2013

gvanrossum commented Jan 5, 2013

neologix mannequin commented Jan 5, 2013

meadori commented Jan 5, 2013

neologix mannequin commented Jan 6, 2013

gvanrossum commented Jan 6, 2013

neologix mannequin commented Jan 6, 2013

giampaolo commented Aug 30, 2013

giampaolo commented Aug 30, 2013

gvanrossum commented Aug 30, 2013

neologix mannequin commented Aug 31, 2013

neologix mannequin commented Aug 31, 2013

giampaolo commented Aug 31, 2013

pitrou commented Aug 31, 2013

pitrou commented Aug 31, 2013

neologix mannequin commented Aug 31, 2013

gvanrossum commented Aug 31, 2013

neologix mannequin commented Aug 31, 2013

pitrou commented Aug 31, 2013

gvanrossum commented Aug 31, 2013

python-dev mannequin commented Sep 4, 2013

vstinner commented Sep 4, 2013

python-dev mannequin commented Sep 4, 2013

gvanrossum commented Sep 4, 2013

vstinner commented Sep 4, 2013

pitrou commented Sep 4, 2013

python-dev mannequin commented Sep 4, 2013

vstinner commented Sep 4, 2013

pitrou commented Sep 4, 2013

giampaolo commented Sep 4, 2013

neologix mannequin commented Sep 4, 2013

giampaolo commented Sep 4, 2013

neologix mannequin commented Sep 5, 2013

sbt mannequin commented Sep 5, 2013

neologix mannequin commented Sep 5, 2013