Message 179201 - Python tracker


Author	neologix
Recipients	neologix, pitrou, sbt
Date	2013-01-06.18:14:47
Message-id	<CAH_1eM2ngu2uQWAEgXzgvRxdSLP_pODnMyWJBO5Wv-LX5eAYVw@mail.gmail.com>
In-reply-to	<1357487136.23.0.795047316444.issue16873@psf.upfronthosting.co.za>

Content

> The program does *not* demonstrate starvation because you are servicing the resource represented by the "starved" duplicate fds before calling poll() again.

No.
What the program does is the following:

while not all the writer FDs have been returned by epoll_wait() at least once:
    1) make all writer FDs ready by draining the pipe
    2) fetch the list of events reported by epoll_wait(), and add them to
       the list of seen FDs
    3) make all the writer FDs not ready by filling the pipe
    4) fetch the list of events reported by epoll_wait(), and add them to
       the list of seen FDs

With the default 'maxevents' parameter, it never completes.

This shows that the FDs returned at step 2 are a strict subset of all
the ready FDs, and that this stays true forever: some FDs which are
actually ready (they are all ready at step 2) are *never* reported.
This is starvation.

By increasing the epoll_wait() maxevents argument, it completes immediately.
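
The loop described above can be sketched roughly as follows. This is not
the original test program, only a minimal reconstruction under stated
assumptions: it is Linux-only (it needs select.epoll), the names
run_rounds, drain and fill are hypothetical, the number of rounds is
bounded so that it always terminates, and maxevents is passed explicitly
so that a handful of FDs is enough to exceed it.

```python
import os
import select


def drain(fd):
    """Read from a non-blocking pipe end until it is empty."""
    try:
        while os.read(fd, 65536):
            pass
    except BlockingIOError:
        pass


def fill(fd):
    """Write to a non-blocking pipe end until the pipe is full."""
    try:
        while True:
            os.write(fd, b"x" * 4096)
    except BlockingIOError:
        pass


def run_rounds(n_writers=32, maxevents=4, max_rounds=5):
    """Run the drain/poll/fill/poll loop; return (seen, all_writers)."""
    r, w = os.pipe()
    # O_NONBLOCK lives on the open file description, so it is shared
    # by every duplicate of w created below.
    os.set_blocking(r, False)
    os.set_blocking(w, False)
    writers = [w] + [os.dup(w) for _ in range(n_writers - 1)]
    ep = select.epoll()
    try:
        ep.register(r, select.EPOLLIN)
        for fd in writers:
            ep.register(fd, select.EPOLLOUT)
        seen = set()
        for _ in range(max_rounds):
            if seen == set(writers):
                break
            drain(r)  # step 1: every writer FD becomes ready
            seen |= {fd for fd, _evt in ep.poll(0, maxevents) if fd != r}
            fill(w)   # step 3: no writer FD is ready any more
            seen |= {fd for fd, _evt in ep.poll(0, maxevents) if fd != r}
        return seen, set(writers)
    finally:
        ep.close()
        for fd in writers:
            os.close(fd)
        os.close(r)
```

With maxevents at least as large as the number of registered FDs, a
single round should report every writer. With a smaller maxevents,
whether the loop ever sees all of them depends on the order in which the
kernel reports ready FDs, which is exactly the point under discussion.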

> You are creating thousands of duplicate handles for the same resource and then complaining that they do not behave independently!

The fact that the FDs are duped shouldn't change anything about the
events reported: it works as long as the number of FDs is less than
FD_SETSIZE (the default used for the epoll_wait() maxevents argument).

See also the epoll(7) documentation:
"""

       Q0  What is the key used to distinguish the file descriptors
           registered in an epoll set?

       A0  The key is the combination of the file descriptor number and
           the open file description (also known as an "open file
           handle", the kernel's internal representation of an open
           file).

       Q1  What happens if you register the same file descriptor on an
           epoll instance twice?

       A1  You will probably get EEXIST.  However, it is possible to add
           a duplicate (dup(2), dup2(2), fcntl(2) F_DUPFD) descriptor to
           the same epoll instance.  This can be a useful technique for
           filtering events, if the duplicate file descriptors are
           registered with different events masks.
"""
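
The behaviour described in A1 can be checked from Python with a small
sketch like the one below. It is Linux-only (select.epoll), and the
function name register_twice_vs_dup is hypothetical. It registers the
write end of a pipe twice, then registers a dup() of it, and polls once.

```python
import errno
import os
import select


def register_twice_vs_dup():
    """Return (errno from the second register, whether both FDs are ready)."""
    r, w = os.pipe()
    w2 = os.dup(w)  # different FD number, same open file description
    ep = select.epoll()
    try:
        ep.register(w, select.EPOLLOUT)
        try:
            ep.register(w, select.EPOLLOUT)  # same FD a second time
            same_fd_errno = None
        except OSError as exc:
            same_fd_errno = exc.errno
        ep.register(w2, select.EPOLLOUT)  # the duplicate is accepted
        # The pipe is empty, so both write FDs are writable right now.
        ready = {fd for fd, _evt in ep.poll(0)}
        return same_fd_errno, ready == {w, w2}
    finally:
        ep.close()
        for fd in (r, w, w2):
            os.close(fd)
```

The second register() of the same FD is expected to fail with EEXIST,
while the duplicate is tracked as a separate entry and reported on its
own.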

I just used dup() to make it easier to test, but you'd probably get
the same thing if your FDs were sockets connected to different
endpoints.

> I tried modifying your program by running poll() in a loop, exiting when no more unseen fds are reported as ready.  This makes the program exit immediately.
>
> So
>
>         ready_writers = set(fd for fd, evt in
>                             ep.poll(-1, MAXEVENTS) if fd != r)
>         seen_writers |= ready_writers
>
> becomes
>
>         while True:
>             ready_writers = set(fd for fd, evt in
>                                 ep.poll(-1, MAXEVENTS) if fd != r)
>             if ready_writers.issubset(seen_writers):
>                 break
>             seen_writers |= ready_writers

Of course it does, since the returned FDs are a subset of all the
ready file descriptors.

The point is precisely that, when there are more FDs ready than
maxevents, some FDs will never be reported.

History
Date	User	Action	Args
2013-01-06 18:14:48	neologix	set	recipients: + neologix, pitrou, sbt
2013-01-06 18:14:48	neologix	link	issue16873 messages
2013-01-06 18:14:47	neologix	create