classification
Title: asyncore fixes and improvements
Type:
Stage:
Components: None
Versions:
process
Status: closed
Resolution: out of date
Dependencies:
Superseder:
Assigned To: josiahcarlson
Nosy List: akuchling, alexer, calvin, giampaolo.rodola, janssen, josiahcarlson, klimkin, loewis
Priority: normal
Keywords: patch

Created on 2004-03-03 13:07 by klimkin, last changed 2009-03-31 22:58 by giampaolo.rodola. This issue is now closed.

Files
File name               Uploaded                   Description
asyn-20040702.tar.tbz2  klimkin, 2004-07-02 13:44  asyncore.py asynchat.py itself and tools.
asyn-20050227.tbz2      klimkin, 2005-02-26 21:37  New features + asynhttplib.
Messages (27)
msg45446 - (view) Author: Alexey Klimkin (klimkin) Date: 2004-03-03 13:07
Minor:
* 0/1 for boolean values replaced with False/True.
* (887279) Added handling of POLLPRI as POLLIN. POLLERR, POLLHUP and
  POLLNVAL are handled as an exception event. handle_expt_event gets the
  most recent error from the self.socket object and raises socket.error.
* Default readable()/writable() return False.
* Added "map" parameter for file_dispatcher.
* file_wrapper: removed "return" in close(); recv/read and send/write
  swapped because of their nature.
* Mac code for writable() removed. The manual for accept() on Mac is
  similar to the one on Linux.
* Re-raising an exception changed from "raise socket.error, why" to a
  bare "raise".
* Added connected/accepting/addr reset on close(). Initialization of
  variables moved to __init__.
* close_all() now calls close() for each dispatcher object; EBADF is
  treated as an already closed socket/file.
* Added channel id to "unhandled..." messages.

Bugs:
* Fixed bug (654766, 889153): client never gets connected, nor errored.
  A connecting client gets a writable event from select(); however, some
  clients may want to always be non-writable. Such a client may never get
  connected. The fix adds _readable(), which is always True for an
  accepting socket and always False for a connecting socket, and
  _writable(), which is always False for an accepting socket and always
  True for a connecting socket. This implies that a listening
  dispatcher's readable() and writable() will never be called (see
  "man accept" and "man connect" for non-blocking sockets).
* Fixed bug: error handling after accept(). It is said that accept() can
  return EWOULDBLOCK even for a readable socket. This means that even
  after handle_accept() is called, the dispatcher's accept() can still
  raise EWOULDBLOCK. The new code calls accept() itself and stores the
  accepted socket in self.__pending_accept. If there was a socket.error,
  it is treated as EWOULDBLOCK. The dispatcher's accept() returns
  self.__pending_accept and resets it to None.
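
A minimal sketch of the _readable()/_writable() idea described above. The Channel class is a toy stand-in written for illustration and is not the patch's dispatcher; only the method names come from the description, and the behaviour for an established connection (deferring to readable()/writable()) is an assumption.

```python
class Channel:
    """Toy stand-in for asyncore.dispatcher; not the real class."""

    def __init__(self, accepting=False, connected=False):
        self.accepting = accepting
        self.connected = connected

    # User-level hooks; the patch makes both default to False.
    def readable(self):
        return False

    def writable(self):
        return False

    # Internal hooks that a poll loop would consult instead.
    def _readable(self):
        if self.accepting:
            return True           # a pending connection shows up as "readable"
        if not self.connected:
            return False          # still connecting: completion is a write event
        return self.readable()    # assumption: established channels ask the user

    def _writable(self):
        if self.accepting:
            return False          # a listening socket is never written to
        if not self.connected:
            return True           # connect() completion is reported as "writable"
        return self.writable()    # assumption: established channels ask the user


listener = Channel(accepting=True)
connecting = Channel()
established = Channel(connected=True)
```

With this split, a client whose writable() always returns False still gets its connect completion, because the loop asks _writable() while the socket is connecting.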

Features:
* Added pending_read() and pending_write(). These functions help to use
  a dispatcher over non-socket objects with buffering capabilities. In
  the original dispatcher, if the socket does a buffered read and some
  data remains in the buffer, a call to asyncore.poll() does not return,
  since there is no data in the real file/socket. This feature allows
  the use of an SSL socket, since that socket reads data in 16k chunks.
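
The buffering problem can be reproduced with a small sketch. BufferedChannel and poll_readable are hypothetical names made up for this illustration (they are not the patch's code); the wrapper reads in 16k chunks, so select() on the underlying socket no longer sees data that has already been pulled into user space.

```python
import select
import socket


class BufferedChannel:
    """Hypothetical wrapper: reads large chunks, serves small reads."""

    CHUNK = 16 * 1024

    def __init__(self, sock):
        self.sock = sock
        self._buf = b""

    def fileno(self):
        return self.sock.fileno()

    def pending_read(self):
        # True when data is already buffered in user space.
        return bool(self._buf)

    def read(self, n):
        if not self._buf:
            self._buf = self.sock.recv(self.CHUNK)
        data, self._buf = self._buf[:n], self._buf[n:]
        return data


def poll_readable(channels, timeout=0.0):
    """Return channels with buffered data plus those select() reports."""
    ready = [c for c in channels if c.pending_read()]
    waiting = [c for c in channels if not c.pending_read()]
    if waiting:
        # Do not block in select() if something is already pending.
        readable, _, _ = select.select(waiting, [], [], 0.0 if ready else timeout)
        ready.extend(readable)
    return ready


a, b = socket.socketpair()
a.sendall(b"hello world")
chan = BufferedChannel(b)
first = chan.read(5)                               # pulls all 11 bytes into the buffer
kernel_ready = select.select([b], [], [], 0.0)[0]  # the socket itself has nothing left
still_ready = poll_readable([chan])                # the channel still has data pending
rest = chan.read(64)
a.close()
b.close()
```

A loop that only consulted select() would wait here even though " world" is ready to be consumed, which is the situation the pending_read() check is meant to avoid.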
msg45447 - (view) Author: Bastian Kleineidam (calvin) Date: 2004-03-11 15:49
There is no file attached! You have to click on the checkbox
next to the upload filename. This is a Sourceforge annoyance :(
msg45448 - (view) Author: Alexey Klimkin (klimkin) Date: 2004-03-17 07:15
Sorry, unfortunately I have lost the old patch file. I have
attached a new one.
In addition to the fixes listed above, the patch includes:

1. Fix for operating on an uninitialized socket. self.socket
is now initialized with _closed_socket(), so any operation
raises EBADF.
2. Added class idispatcher, a base class for dispatcher. The
purpose of this class is to allow simple replacement of the
media (dispatcher interface) in classes derived from the
dispatcher class. It is based on 'object'.

I have also attached asynchat.diff - example for new-style
dispatcher. Old asynchat works as well.
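
A minimal sketch of the closed-socket placeholder from item 1. The ClosedSocket class below is a hypothetical illustration, not the _closed_socket() from the patch: any method call on it fails with EBADF instead of the AttributeError that a None socket would produce.

```python
import errno


class ClosedSocket:
    """Hypothetical placeholder: every socket operation fails with EBADF."""

    def __getattr__(self, name):
        def fail(*args, **kwargs):
            raise OSError(errno.EBADF, "Bad file descriptor")
        return fail


placeholder = ClosedSocket()
caught = None
try:
    placeholder.recv(1024)      # any method name behaves the same way
except OSError as exc:
    caught = exc.errno
```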
msg45449 - (view) Author: A.M. Kuchling (akuchling) * (Python committer) Date: 2004-03-21 19:48
The large number of changes in this patch makes it difficult to 
figure out which changes fix which problem.  I've created a new 
directory in CVS, nondist/sandbox/asyncore, that contains copies of 
the module with these patches applied, and will work on applying 
changes to the copy in dist/src.
msg45450 - (view) Author: A.M. Kuchling (akuchling) * (Python committer) Date: 2004-03-21 19:55
Fix for bug #887279 applied to HEAD.
msg45451 - (view) Author: A.M. Kuchling (akuchling) * (Python committer) Date: 2004-03-21 20:02
Patch to use True/False applied to HEAD.
msg45452 - (view) Author: A.M. Kuchling (akuchling) * (Python committer) Date: 2004-03-21 20:02
Mac code for writable() removed from HEAD.
msg45453 - (view) Author: A.M. Kuchling (akuchling) * (Python committer) Date: 2004-03-21 20:08
Repeating exception changes ('raise socket.error' -> just 'raise')
checked into HEAD.
msg45454 - (view) Author: A.M. Kuchling (akuchling) * (Python committer) Date: 2004-03-21 20:13
Added "map" parameter for file_dispatcher and 
dispatcher_with_send in CVS HEAD.
msg45455 - (view) Author: A.M. Kuchling (akuchling) * (Python committer) Date: 2004-03-21 20:18
In your version of file_dispatcher.__init__, the .set_file() call is 
moved earlier; can you say why?
msg45456 - (view) Author: Alexey Klimkin (klimkin) Date: 2004-03-22 06:15
There is no real reason for this change, please undo.
msg45457 - (view) Author: A.M. Kuchling (akuchling) * (Python committer) Date: 2004-06-05 17:54
I've struggled to get the test suite running without errors on my machine, 
but have failed.  
msg45458 - (view) Author: Alexey Klimkin (klimkin) Date: 2004-07-02 13:44
In addition to "[ 909005 ] asyncore fixes and improvements"
and CVS version "asyncore.py,v 2.51", this patch provides:

* Added handling of a buffered socket layer (pending_read(),
  pending_write()).
* Added fd number to __repr__.
* Initialized self.socket = socket._closedsocket() instead of None
  for verbose error output (like a closed socket.socket).
* asyncore and asynchat implement idispatcher and iasync_chat.
* Fixed self.addr initialization.
* Removed import of exceptions.
* Don't filter KeyboardInterrupt, just pass it through.
* Added a queue of sockets, which solves the problem of select() on
  too many descriptors.

I have run make test in the Python CVS distribution without problems.
Examples of using i* are included.
msg45459 - (view) Author: Alexey Klimkin (klimkin) Date: 2005-02-26 21:39
Minor improvements:

    * Added handle_close_event(): calls handle_close(), then closes the
channel. No need to write self.close() in each handle_close().

    * Improved exception handling. KeyboardInterrupt is not blocked. For a
Python exception, handle_error_event() is called, which checks for
KeyboardInterrupt and closes the socket if handle_error() didn't.

Bugs:

    * Calling connect() could raise an exception that never reached
handle_error(). Now, if there is an exception, handle_error_event() is
called.

Features:

    * set_timeout(): sets a timeout for the dispatcher object. If there was
no I/O for the object, ETIMEDOUT is raised, which is handled by
handle_error_event().

    * Fixed an issue on Windows: too many descriptors in select(). The list
of sockets is shuffled and only the first asyncore.max_channels are used
in select().

    * Added set_prio(): sets the priority for a dispatcher. After the
shuffle, the list of sockets is sorted by priority.


You may also check asynhttplib, an asynchronous version of httplib.
msg45460 - (view) Author: Josiah Carlson (josiahcarlson) * Date: 2007-01-07 04:42
Many of the changes in the source provided by klimkin in his most recent revision from February 27, 2005 seek to solve certain problems in an inconsistent or incorrect way.  Some of his changes (or variants thereof) are worthwhile.  I'll start with my issues with his asyncore changes, then describe what I think should be added from them.

For example, in his updated asyncore.py, the list of sockets is first shuffled randomly, then sorted based on priority.  Ignoring priorities for a moment, if there are more sockets than the maximum for the platform, the random shuffle gives no guarantee that every socket gets polled.  Say, for example, that you were using Windows and were running close to the actual select file handle limit (512 in Python 2.3) with 500 handles: with the patch's limit of 64, you would skip 436 of the sockets *this pass*.  After 10 passes, on average roughly 127 sockets (500 * (436/500)^10) would never have been polled.  After 20 passes, there would still be, on average, about 32 that were never polled.  So this "randomization" step is the wrong thing to do, unless you actually make multiple select calls for each poll() call.  But really, select is limited to 512, and I've run it with 500 without issue.
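
A back-of-the-envelope check of these estimates, under the assumption that each pass polls a uniformly random 64 of the 500 sockets, independently of earlier passes and ignoring priorities. A given socket is then missed in one pass with probability 436/500, so the expected number never polled after n passes is 500 * (436/500)^n. The function name is made up for this sketch.

```python
def expected_never_polled(total, per_pass, passes):
    """Expected number of sockets never picked in `passes` random draws."""
    miss_probability = (total - per_pass) / total
    return total * miss_probability ** passes


after_10 = expected_never_polled(500, 64, 10)   # roughly 127
after_20 = expected_never_polled(500, 64, 20)   # roughly 32
```

Either way, a nontrivial share of the sockets is starved for many passes, which is the point being made.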

The priority-based sorting has many of the same problems, but it is even worse when you have nontrivial numbers of differing priorities, with or without randomization.

The max socket limit of 64 on Windows isn't correct.  It's been 512 since at least Python 2.3.  And all other platforms being 65536?  No.  I've had some versions of Linux die on me at 512, others at 4096, but all were dog slow beyond 500 or so.  It's better to let the underlying system raise an exception for the user when it fails and let them attempt to tune it, rather than forcing a tuning that may not be correct.


The "pending read" stuff is also misdirected.  Assuming a non-broken async client or server, either should be handling content as it comes in, dispatching as necessary.  See asynchat.collect_incoming_data() and asynchat.found_terminator() for examples.

The idispatcher stuff seems unnecessary.


Generally speaking, it seems to me that there are 3 levels of abstraction going on:
1) handle_*_event(), called by poll, poll2, etc.
2) handle_*(), called by handle_*_event(), user overrides, calls other handle_*() and *() methods
3) *() (aka recv, send, close, etc.), called by handle_*(), generally left alone.

Some of your code breaks the abstraction and has items in layer 2 call items in layer 1, which then call items in layer 2 again.  This seems unnecessary, and breaks the general downward calling semantic (except in the case of errors returned by layer 3 resulting in layer 2 handle_close() calls, which is the proper method to call).
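
The three layers can be pictured with a toy class. LayeredChannel is a hypothetical illustration whose method names mirror asyncore, but it does not use asyncore and simulates the socket with a list.

```python
class LayeredChannel:
    """Toy model of the three layers; not the real asyncore.dispatcher."""

    def __init__(self, incoming):
        self.incoming = list(incoming)   # stands in for data waiting on a socket
        self.received = []
        self.closed = False

    # Layer 1: called by the poll loop.
    def handle_read_event(self):
        self.handle_read()

    # Layer 2: user overrides; these call layer 3 and other layer 2 methods.
    def handle_read(self):
        data = self.recv()
        if data:
            self.received.append(data)

    def handle_close(self):
        self.close()

    # Layer 3: thin wrappers around the socket, generally left alone.
    def recv(self):
        if not self.incoming:
            self.handle_close()          # the one upward call, on EOF or error
            return b""
        return self.incoming.pop(0)

    def close(self):
        self.closed = True


chan = LayeredChannel([b"abc"])
chan.handle_read_event()   # data flows from layer 1 to 2 to 3
chan.handle_read_event()   # nothing left: recv() reports EOF via handle_close()
```

Calls go downward except for the error path from layer 3 into handle_close(); nothing in layer 2 calls back into a handle_*_event() method.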


There are, according to my reading of the asyncore portions of your included module, a few things that may be worthy of inclusion in the Python standard library:

* A variant of your changes to close_all(), though it should proceed in closing everything unless a KeyboardInterrupt, SystemExit, or ExitNow exception is raised.  Socket errors should be ignored, because we are closing them - we don't care about their error condition.

* Checking sockets for socket error via socket.getsockopt().

* A variant of your .close() implementation.

* The CONNRESET, etc., stuff in the send() and recv() methods, but not the handle_close_event() replacements, stick with handle_close().

* Checking for KeyboardInterrupt and SystemExit inside the poll functions.

* The _closed_socket class and initialization.

All but the last of the above, I would consider to be bugfixes, and if others agree that these are reasonable changes, I'll write up a patch against trunk and 2.5 maintenance.  The last change, while I think would be nice, probably shouldn't be included in 2.5 maintenance, though I think would be fine for the trunk.
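
A rough sketch of two of the items above: a close_all() that keeps going on socket errors, and a socket error check via getsockopt(). This is an illustration under assumptions, not the patch that was eventually applied; ExitNow and FakeChannel are local stand-ins, and pending_socket_error is a made-up helper name.

```python
import socket


class ExitNow(Exception):
    """Stand-in for asyncore.ExitNow, used to leave the polling loop."""


def close_all(channel_map):
    """Close every channel in the map, ignoring socket errors."""
    for channel in list(channel_map.values()):
        try:
            channel.close()
        except OSError:
            # We are closing anyway, so the socket's error state is irrelevant.
            pass
        # KeyboardInterrupt, SystemExit and ExitNow are not OSError
        # subclasses, so they propagate to the caller untouched.
    channel_map.clear()


def pending_socket_error(sock):
    """Return the socket's pending error code via SO_ERROR (0 means none)."""
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)


class FakeChannel:
    """Test double whose close() can be told to raise."""

    def __init__(self, error=None):
        self.error = error
        self.closed = False

    def close(self):
        self.closed = True
        if self.error is not None:
            raise self.error


good, broken = FakeChannel(), FakeChannel(OSError("already gone"))
channels = {1: broken, 2: good}
close_all(channels)          # the OSError from `broken` does not stop the loop

left, right = socket.socketpair()
error_code = pending_socket_error(left)
left.close()
right.close()
```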
msg45461 - (view) Author: Josiah Carlson (josiahcarlson) * Date: 2007-01-07 04:53
In asynchat, the only stuff that should be accepted is the handle_read() changes.  The deque removal should be ignored (we have deques since Python 2.4, which are *significantly* faster than lists in nontrivial applications), the iasync_chat stuff, like the idispatcher stuff, seems unnecessary.  And that's pretty much it for asynchat.

The proposed asynhttplib module shouldn't go into the Python standard library until it has lived on its own for a nontrivial amount of time in the Cheeseshop and is found to be as good as httplib, urllib, or urllib2.  Even then, its inclusion should be questioned, as medusa (the http server based on asyncore) has been around for a decade or more, is used in many places, and yet still isn't in the standard library.

The asyncoreTest.py needs a bit of work (I notice some incorrect names), but could be used as an addition to the test suite (currently it seems as though only asynchat is tested).
msg45462 - (view) Author: Alexey Klimkin (klimkin) Date: 2007-01-08 20:44
1) The patch was not developed during some academic research, but while coding true non-blocking client-server applications capable of running on both Linux and Windows. The original code had a lot of issues with everything: some parts were not truly non-blocking, not every socket could be passed, there were issues under high load, etc.
2) We used medusa for SSL capability in our project. However, it is impossible to get fully non-blocking functionality with the original asyncore and the original medusa. So the functionality was extended to support these features as well. That is what idispatcher is for.
3) In the end we got pretty reliable code, which supports the features I described here and has tons of bugs and issues fixed. Again, I didn't fix bugs for any academic purpose; every fix was driven by a real issue we met during development. I also don't think that these fixes are bound too tightly to our project; I believe I made them pretty general.
4) It's possible that some parts can be made better for other applications. But if you follow the same path, developing a truly non-blocking client-server with medusa's SSL capabilities, I think you will end up with the same thing.
5) I don't insist on including the patch in the Python tree as is. I am quite happy using the modified asyncore in my private library. My intention was to share my experience. Please use it if you need to.
6) The development I mention above was in 2004. So the patch has been out of sync with reality for 2 years already. Some of the issues it was solving may be gone already. I also don't know what is going on with SSL for Python; there seem to be new libraries as well.

...so... just use it as you want... or as you don't want ;) ...
msg45463 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2007-02-14 10:35
Alexey, are you interested in revising your code until it is approved? Saying "no" is fine; the question then is if there is anybody else interested in working on this patch. If nobody is interested in working on it, it would be best rejected - there is no point in having it listed as open here for years (it would be sad, of course, for the work put into the patch, and the work put into the review).

Andrew, do you have more changes from this patch that you consider worth incorporating?
msg45464 - (view) Author: Josiah Carlson (josiahcarlson) * Date: 2007-02-14 18:54
If anything is to be included, it should be the tests (though they need to be rewritten a bit).

I've been working on and off on a patch that includes other portions that I felt were worthwhile.  When I finish the patch, I'll also include a fixed version of the tests.
msg62033 - (view) Author: Bill Janssen (janssen) * (Python committer) Date: 2008-02-04 05:47
I should point out that I'm doing a big project with SSL and Python,
using Medusa, and asyncore.  I've been re-working the 2.6 and 3.x SSL
support (with guidance from Giampaolo :-) so that true async capability
is possible for SSL.
msg69221 - (view) Author: Josiah Carlson (josiahcarlson) * Date: 2008-07-03 18:11
I have applied my variant patch to trunk, which will be in 3.0 this weekend.
msg84912 - (view) Author: Aleksi Torhamo (alexer) Date: 2009-03-31 21:19
"not the handle_close_event() replacements, stick with handle_close()".
I'm guessing this has to do with "breaking the abstraction"?

I can't think of a situation where handle_close() is called, but close()
should not be called. If indeed so, i feel it's weird to require the
user remember to call close(), and it should IMHO be done automatically.
(I feel like i'm bitten by this each and every time i replace the
default handle_close().. :)

If the naming of handle_close_event() is not appropriate (as it "sounds"
like a layer 1 method), how about adding do_close(), and making other
places call that?

def do_close(self):
    self.close()
    self.handle_close()
msg84916 - (view) Author: Aleksi Torhamo (alexer) Date: 2009-03-31 21:27
I just remembered that "level 1" function handle_connect_event() is also
called from "level 2", so i actually can't see why the close helper
could not be called handle_close_event(). Is there some other reason
besides "breaking abstraction" to not introduce it?
msg84917 - (view) Author: Giampaolo Rodola' (giampaolo.rodola) * (Python committer) Date: 2009-03-31 21:28
> I can't think of a situation where handle_close() is called, but close()
> should not be called. If indeed so, i feel it's weird to require the
> user remember to call close(), and it should IMHO be done automatically.

It's already done automatically if you don't override handle_close.
msg84919 - (view) Author: Aleksi Torhamo (alexer) Date: 2009-03-31 21:36
> It's already done automatically if you don't override handle_close.

Sorry, i meant the case where you need to override it. If we always need
to call close() from handle_close(), it feels redundant having to
remember to add it, when it could be done automatically instead. Why not
do it automatically, if every overriding user must otherwise always
remember to add it?
msg84932 - (view) Author: Josiah Carlson (josiahcarlson) * Date: 2009-03-31 22:06
Just to make this clear, Aleksi is proposing close() should be called 
automatically by some higher-level functionality whether a user has 
overridden handle_close() or not.

With the updated asyncore warning suppression stuff, overriding 
handle_close() for the sake of suppressing the warnings should no longer 
be necessary.

While I can see that it would be *convenient* if close() was 
automatically called, the method is called "handle_close()", and there 
is an expectation about the implementation thereof.  For example, you 
call socket.recv() in handle_read(), you call socket.send() in 
handle_write(), call socket.accept() in handle_accept().  Is it too much 
to expect that a user will call .close() inside handle_close()?

The answer to that last question is a "no", btw.
msg84941 - (view) Author: Giampaolo Rodola' (giampaolo.rodola) * (Python committer) Date: 2009-03-31 22:58
I agree with Josiah but I must say that the handle_close() documentation
is a bit misleading.
Currently it states:

> handle_close()
>    Called when the socket is closed.


I'd change it with something like this:

"Called when the asynchronous loop detects that the connection on a
selectable object has been closed.
When overridden, the user is supposed to explicitly call the close()
method to actually remove the channel from the global map."
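
What the proposed wording asks of an override can be sketched with a toy dispatcher. MiniDispatcher and LoggingChannel are hypothetical classes written for this illustration; they only imitate the relevant part of asyncore, namely that close() removes the channel from the map.

```python
class MiniDispatcher:
    """Toy stand-in for asyncore.dispatcher with an explicit channel map."""

    def __init__(self, fd, channel_map):
        self.fd = fd
        self.map = channel_map
        self.map[fd] = self

    def close(self):
        # Removing the channel from the map is what stops it being polled.
        self.map.pop(self.fd, None)

    def handle_close(self):
        # Default behaviour: just close.
        self.close()


class LoggingChannel(MiniDispatcher):
    def __init__(self, fd, channel_map, log):
        super().__init__(fd, channel_map)
        self.log = log

    def handle_close(self):
        self.log.append("peer closed %d" % self.fd)
        self.close()          # without this the channel would stay in the map


channels, log = {}, []
chan = LoggingChannel(7, channels, log)
chan.handle_close()
```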
History
Date                 User              Action  Args
2009-03-31 22:58:58  giampaolo.rodola  set     messages: + msg84941
2009-03-31 22:06:39  josiahcarlson     set     messages: + msg84932
2009-03-31 21:36:23  alexer            set     messages: + msg84919
2009-03-31 21:28:36  giampaolo.rodola  set     messages: + msg84917
2009-03-31 21:27:04  alexer            set     messages: + msg84916
2009-03-31 21:19:30  alexer            set     nosy: + alexer; messages: + msg84912
2009-02-13 03:27:42  ajaksu2           link    issue777588 superseder
2008-07-03 18:11:01  josiahcarlson     set     status: open -> closed; resolution: out of date; messages: + msg69221
2008-02-04 05:47:12  janssen           set     nosy: + janssen; messages: + msg62033
2007-12-13 01:30:40  giampaolo.rodola  set     nosy: + giampaolo.rodola
2004-03-03 13:07:43  klimkin           create