This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author vstinner
Recipients Nir Soffer, giampaolo.rodola, vstinner, walkhour
Date 2017-07-27.10:11:28
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1501150288.86.0.562044123793.issue30931@psf.upfronthosting.co.za>
In-reply-to
Content
Ok, wow, the discussion on this issue, bpo-30985, PR 2854, PR 2804, etc. was very productive. We identified much more race conditions than the first one that I spotted.


(Bug 1) The dispatcher A closes the dispatcher B. Currently, asyncore calls the handlers of the dispatcher B.

Technically, the close() method of a dispatcher closes immediately a socket object. So it's very likely that handlers of the dispatcher B fails on recv(), send(), or any other operation on the socket.

There is also a risk that somehow dispatcher B tries to closes again the socket (technically, socket, pipe or something else) and gets a OSError(EBADF) or worse: closes an unrelated socket!


(Bug 2) Similar to (Bug 1), but the subtle coming from the current implementation causes even worse bugs. The dispatcher A closes the dispatcher B, and then the dispatcher A (or another dispatcher, it doesn't matter who make that) creates a new dispatcher C which reuses the same file descriptor than the old closed file descriptor of dispatcher B.

The consequence is that handlers of dispatcher C are called even if the socket is not ready to read or to write. Since sockets are non-blocking, recv() and send() are likely to fail with BlockingIOError. These errors are usually correctly handled in asynchronous code, but they can cause bugs. More generally, calling handlers when the socket is not ready can cause many kinds of bugs.


(Bug 3) If a socket is ready to read and ready to write, handle_read() and handle_write() handlers are expected to be called. The problem is when handle_read() (first called handler) closes the dispatcher.

It was decided to not call handle_write() is that case.


(Bug 4) Ok, this one is the best one :-) When a socket is ready, asyncore calls methods like handle_write_event(). These methods usually call a single handler like handle_write(), *but* sometimes can call multiple handlers like handle_connect() followed by handle_write(). If a first handler closes the dispatcher, the following handler may fail badly on trying to use a closed socket.

IMHO (Bug 3) and (Bug 4) are less surprising than (Bug 1) or (Bug 2) since they are determistic and so less "race condition". Since these bugs are determistic, I expect that applications are already written to handle these corner cases, somehow. The simplest option is to catch and ignore errors, asyncore already ignores many errors for us. For example asyncore already ignores EBADF, whereas IMHO it hides a design issue (the 4 bugs I just listed).

Technically, fixing (Bug 3) using my PR 2854 design (check if map[fd] is still the same dispatcher) is safe since it doesn't rely at all on the actual implementation of asyncore handle_xxx_event() (which can be overriden) not handle_xxx() methods defined by third party code.

Fixing (Bug 4) is more complex since it requires to decide how to detect that a dispatcher was closed (it seems like "dispatcher._fileno is None" test is the best choice), and to modify handle_xxx_event() methods, whereas these methods can and *are* overridden in subclasses.

--

About backward compatibility, in short, fixing any of these bug changes the exact behaviour, since fixing all these bugs require to *not* call a handler if the dispatcher was closed. *If* a specific application requires that handlers are called anyway, changes are likely to break these applications.

A workaround for these applications is to copy asyncore.py from an old Python version to keep the exact same behaviour. Another smoother workaround is to monkey-patch poll() and poll2(), and maybe also override handle_xxx_event() methods, to restore the old behaviour.

I don't want to add a flag to re-enable the old behaviour, since it seems like we all agree that the described bugs are clearly very bad design bugs and that they must be fixed.

--

Another option, Nir will probably not like it, is to consider that asyncore is dead and will not be fixed.

That's not my favorite option. Nir explained that on Python 2.7, asyncore is the most convenient module for his use case, he deeply rely on it, and so "just want" to fix know bugs.

--

I suggest to use this bpo to fix (Bug 1), (Bug 2) and (Bug 3), since they can all be fixed in poll() and poll2() with the same design, like my PR 2854.

For (Bug 4), I suggest to use bpo-30985 where we already discussed many options how to detect that a dispatcher was closed. I have no strong opinion on this bug, if it should be fixed or not. IMHO it's less important and applications already know how handle the bug.
History
Date User Action Args
2017-07-27 10:11:28vstinnersetrecipients: + vstinner, giampaolo.rodola, Nir Soffer, walkhour
2017-07-27 10:11:28vstinnersetmessageid: <1501150288.86.0.562044123793.issue30931@psf.upfronthosting.co.za>
2017-07-27 10:11:28vstinnerlinkissue30931 messages
2017-07-27 10:11:28vstinnercreate