File name |
Uploaded |
Description |
Edit |
xmlrpc.patch
|
kristjan.jonsson,
2008-11-17 14:33
|
|
|
xmlrpc.patch
|
kristjan.jonsson,
2008-11-25 11:13
|
A new version of the patch, taking into account various concerns raised. |
|
xmlrpc.patch
|
kristjan.jonsson,
2008-11-27 16:06
|
|
|
xmlrpc.patch
|
kristjan.jonsson,
2008-11-28 10:53
|
|
|
xmlrpc_1.patch
|
kristjan.jonsson,
2008-12-01 10:12
|
a patch that focuses on the Nagle problem only |
|
httplib.patch
|
kristjan.jonsson,
2008-12-08 15:33
|
a suggested improvement to the endheaders(data) feature |
|
htmllibclients.patch
|
kristjan.jonsson,
2008-12-08 15:34
|
use endheaders(body) in stead of separate conn.send(body) in users of the httplib.py |
|
msg75962 - (view) |
Author: Kristján Valur Jónsson (kristjan.jonsson) *  |
Date: 2008-11-17 14:33 |
There are two performance problems in xmlrpclib.py:
POST requests use two send() calls, one to send the headers and one to
send the data. This can invoke the Nagle/Delayed ACK performance
problem. On many socket implementations (including some windows
platforms) it can introduce delays of up to 200ms when talking over the
wire (as opposed to localhost)
The second is the use of no buffering when reading the xmlrpc
response. It is done using the readline() call on a file-like object
constructed from the socket. When no buffering is in effect, this
causes a separate recv() call for each character.
This patch fixes these issues separately:
1) We provide a new getheaderdata() function in httplib which just
returns the data for the headers instead of sending it. This can be
used instead of endheaders() if the intention is to subsequently send
more data. xmlrpclib then uses this method of framing its POST
messages.
2) We provide the named artgument bufsize=0 to the HTTPResponse class
in httplib, and also so on for the HTTPConnection.getresponse() and
HTTP.getreply(). xmlrpclib then passes bufsize=-1 to HTTP.getreply() to
get default buffering for the response fileobject.
This patch focuses on fixing the problems with xmlrpclib. but issue 1)
above also has a number of other manifestations in the lib, there are
other places where we can use getheaderdata() and send() instead of
endheaders() to avoid possible Nagle/Ack problems, e.g. in urllib.py,
distutils.py and loggin/handlers.py
|
msg76174 - (view) |
Author: Kristján Valur Jónsson (kristjan.jonsson) *  |
Date: 2008-11-21 10:52 |
Just a thought here:
Maybe it would be better just to change socket._fileobject to always
use a minimum of 8k readbuffer for readline() just as read() already
does. The read() documentation states that recv(1) is very
inefficient, and so it is.
The intent, when calling makefile(bufsize=0) is probably to make sure
that buffering for write() is disabled.
Any thoughts?
|
msg76343 - (view) |
Author: Jeremy Hylton (jhylton)  |
Date: 2008-11-24 17:32 |
This patch makes sense in principle, but some of the details need to
change. The _send_output() method is used by some clients, merely
because it can be used :-(. It's fairly easy to preserve this API for
backwards compatibility.
I am also worried about this new api call getheaderdata(). It
complicates the API. Even if it were necessary, it needs a better name,
because getheaderdata() doesn't sound like a method that changes the
connection state or consumes buffered header data.
I think it would be better to have the new behavior exposed only through
HTTPConnection and not HTTP, since that's a Python 1.5.2 compatibility
API(!). We can make some small changes to xmlrpclib to use the newer
API. It would probably be a good change merely because py3k now uses
the newer API.
I've got a working local change, but it's still a little ugly.
|
msg76346 - (view) |
Author: Jeremy Hylton (jhylton)  |
Date: 2008-11-24 17:47 |
Just wanted to mention that the best solution is to update as much code
as possible to use HTTPConnection instead of HTTP. I'm not sure how
easy it is to do for xmlrpclib, since it exposes methods like
send_content(). I guess we can preserve those APIs somehow, but it
won't be pretty.
|
msg76349 - (view) |
Author: Jeremy Hylton (jhylton)  |
Date: 2008-11-24 17:59 |
Actually, the patch doesn't work :-). The httplib API allows you to
pass a file or a string for the body of the request. The
getheaderdata() + body case only works for the string. Easy to fix,
but I worry that he still run into Nagle problems with the file case,
which was presumably added for efficiency.
Jeremy
On Mon, Nov 24, 2008 at 12:32 PM, Jeremy Hylton <report@bugs.python.org> wrote:
>
> Jeremy Hylton <jeremy@alum.mit.edu> added the comment:
>
> This patch makes sense in principle, but some of the details need to
> change. The _send_output() method is used by some clients, merely
> because it can be used :-(. It's fairly easy to preserve this API for
> backwards compatibility.
>
> I am also worried about this new api call getheaderdata(). It
> complicates the API. Even if it were necessary, it needs a better name,
> because getheaderdata() doesn't sound like a method that changes the
> connection state or consumes buffered header data.
>
> I think it would be better to have the new behavior exposed only through
> HTTPConnection and not HTTP, since that's a Python 1.5.2 compatibility
> API(!). We can make some small changes to xmlrpclib to use the newer
> API. It would probably be a good change merely because py3k now uses
> the newer API.
>
> I've got a working local change, but it's still a little ugly.
>
> ----------
> assignee: -> jhylton
> nosy: +jhylton
> status: open -> pending
>
> _______________________________________
> Python tracker <report@bugs.python.org>
> <http://bugs.python.org/issue4336>
> _______________________________________
> _______________________________________________
> Python-bugs-list mailing list
> Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/jeremy%40alum.mit.edu
>
>
|
msg76406 - (view) |
Author: Kristján Valur Jónsson (kristjan.jonsson) *  |
Date: 2008-11-25 11:13 |
I have addressed some issues mentioned:
1) I have retained the _send_output() method.
2) the endheaders() method now takes an optional argument, send_data
that defaults to True. It also returns any unsent data as a string.
This simplifies any code wishing to send more data.
3) The old HTTP class needs no changes anymore.
4) I've fixed the test_xmlrpc.py test case to run all the tests on
windows. The old concern with "WSAEWOULDBLOCK" seems to be no longer
valid.
5) concatenating the result from endheaders() with the request body is
valid, in xmlrpclib, because the assumption has already been made in
send_content() that request_body is a string (str(len(request_body)).
However, in httplib's request() method, we now special case this. I
don't want to complicate the code for what appears to be a rare case.
I chose not to mess with xmlrpclib and make it continue to use the old
HTTP class in order to minimise the extent of this patch. A simple and
clear performance patch has in my experience a much greater chance of
finding its way into the codebase than a more extensive rewrite :)
You may have concerns regarding point 3 above, and I am open to
suggestions.
Now, what remains is the question of the read buffering for the socket
fileobject. Do you think that it is feasible to simply change the
default read buffering for fileobjects to use buffering for readline()
same as they do for read()? Then we wouldn't need the second half of
this patch and we could make a separate "socket" performance patch.
|
msg76407 - (view) |
Author: Kristján Valur Jónsson (kristjan.jonsson) *  |
Date: 2008-11-25 11:15 |
Sorry, I meant : "you may have concerns regarding point 2) above"
|
msg76489 - (view) |
Author: Jeremy Hylton (jhylton)  |
Date: 2008-11-26 21:53 |
I think we're making progress, but I'm still not sure about the new
httplib api. My current worry is that endheaders() behaves very
differently when send_data is false. My chief concern is that the
__state variable is going to indicate that the request has been sent
when we're really depending on the client to send the header. In
general, I don't like the public send() method since it's exposing a
raw socket that could be called at any time regardless of the _state
of the object.
What if endheaders() takes body as an optional argument and sends it
along with the headers? This accomplishes the same basic goal as
returning the header from endheaders(), but keeps the actual sending
of data encapsulated within the http connection.
Jeremy
On Tue, Nov 25, 2008 at 6:13 AM, Kristján Valur Jónsson
<report@bugs.python.org> wrote:
>
> Kristján Valur Jónsson <kristjan@ccpgames.com> added the comment:
>
> I have addressed some issues mentioned:
> 1) I have retained the _send_output() method.
> 2) the endheaders() method now takes an optional argument, send_data
> that defaults to True. It also returns any unsent data as a string.
> This simplifies any code wishing to send more data.
> 3) The old HTTP class needs no changes anymore.
> 4) I've fixed the test_xmlrpc.py test case to run all the tests on
> windows. The old concern with "WSAEWOULDBLOCK" seems to be no longer
> valid.
> 5) concatenating the result from endheaders() with the request body is
> valid, in xmlrpclib, because the assumption has already been made in
> send_content() that request_body is a string (str(len(request_body)).
> However, in httplib's request() method, we now special case this. I
> don't want to complicate the code for what appears to be a rare case.
>
> I chose not to mess with xmlrpclib and make it continue to use the old
> HTTP class in order to minimise the extent of this patch. A simple and
> clear performance patch has in my experience a much greater chance of
> finding its way into the codebase than a more extensive rewrite :)
>
> You may have concerns regarding point 3 above, and I am open to
> suggestions.
>
> Now, what remains is the question of the read buffering for the socket
> fileobject. Do you think that it is feasible to simply change the
> default read buffering for fileobjects to use buffering for readline()
> same as they do for read()? Then we wouldn't need the second half of
> this patch and we could make a separate "socket" performance patch.
>
> Added file: http://bugs.python.org/file12127/xmlrpc.patch
>
> _______________________________________
> Python tracker <report@bugs.python.org>
> <http://bugs.python.org/issue4336>
> _______________________________________
> _______________________________________________
> Python-bugs-list mailing list
> Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/jeremy%40alum.mit.edu
>
>
|
msg76502 - (view) |
Author: Kristján Valur Jónsson (kristjan.jonsson) *  |
Date: 2008-11-27 16:06 |
I like the suggestion of having endheaders() accept any extra data.
I have uploaded a new patch which implements this idea. It simplifies
things a lot.
The issue with the read buffer size is still open. I have sent an
email to python-dev requesting comments.
|
msg76523 - (view) |
Author: Kristján Valur Jónsson (kristjan.jonsson) *  |
Date: 2008-11-28 10:53 |
Guido pointed out the problem with _fileobject.readline() being
followed by socket.recv() if readline uses read buffering.
xmlrpclib.py was attempting to directly use the underlying socket,
although in actual fact it never did, (since HTTPConnectio had closed
it when it returned the response from getresponse()) Never the less, it
is prudent to make sure that we don't attempt this.
There really should be no need to use the socket directly, a buffered
read() call is just as efficient.
|
msg76578 - (view) |
Author: Jeremy Hylton (jhylton)  |
Date: 2008-11-29 01:52 |
I did the simple part of the patch, where the request and headers are
sent at the same time. The applied patch didn't pass the test suite,
and I want to think about the buffering change a bit more. It's
definitely tricky.
Jeremy
On Fri, Nov 28, 2008 at 5:53 AM, Kristján Valur Jónsson
<report@bugs.python.org> wrote:
>
> Kristján Valur Jónsson <kristjan@ccpgames.com> added the comment:
>
> Guido pointed out the problem with _fileobject.readline() being
> followed by socket.recv() if readline uses read buffering.
> xmlrpclib.py was attempting to directly use the underlying socket,
> although in actual fact it never did, (since HTTPConnectio had closed
> it when it returned the response from getresponse()) Never the less, it
> is prudent to make sure that we don't attempt this.
> There really should be no need to use the socket directly, a buffered
> read() call is just as efficient.
>
> Added file: http://bugs.python.org/file12145/xmlrpc.patch
>
> _______________________________________
> Python tracker <report@bugs.python.org>
> <http://bugs.python.org/issue4336>
> _______________________________________
> _______________________________________________
> Python-bugs-list mailing list
> Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/jeremy%40alum.mit.edu
>
>
|
msg76682 - (view) |
Author: Kristján Valur Jónsson (kristjan.jonsson) *  |
Date: 2008-12-01 10:12 |
Really?
Works for me. Here is a simple patch where we leave the read buffering.
passes test_xmlrpclib.py and test_httplib.py
|
msg76697 - (view) |
Author: Jeremy Hylton (jhylton)  |
Date: 2008-12-01 15:26 |
I submitted r67442, which combines the headers and body in a single
send() call. We should look at the buffering issue now, although I
probably won't have any time to check on it until Friday.
|
msg77024 - (view) |
Author: Kristján Valur Jónsson (kristjan.jonsson) *  |
Date: 2008-12-05 15:06 |
I think it would have been better to have endheaders() (and then perhaps
_send_output()) deal with the non-string (i.e. filebuffer) case, so
that endheaders(body) is semantically equivalent to endheaders();
send(body). The version you checked in makes it necessary that the
user of HTTPConnection.endheaders() is aware of the distinction.
|
msg77309 - (view) |
Author: Kristján Valur Jónsson (kristjan.jonsson) *  |
Date: 2008-12-08 15:33 |
Please consider applying this patch. It moves the logic handling of
non-string message_body inside of endheaders(), allowind clients of
HTTPConnection to substitute endheader();send(data) with
endheaders(data) without fear.
|
msg77310 - (view) |
Author: Kristján Valur Jónsson (kristjan.jonsson) *  |
Date: 2008-12-08 15:34 |
In light of my previous patch, here is a an improvement to urllib.py and
logging that uses endheaders(body)
|
msg79407 - (view) |
Author: Kristján Valur Jónsson (kristjan.jonsson) *  |
Date: 2009-01-08 10:49 |
Reopening this, since I am not entirely happy with the implementation
that was checked in. Please see my comments from Dec 5th onwards.
|
msg79502 - (view) |
Author: Kristján Valur Jónsson (kristjan.jonsson) *  |
Date: 2009-01-09 20:27 |
Checked in revision: 68458 and
revision: 68459
|
msg80153 - (view) |
Author: Kristján Valur Jónsson (kristjan.jonsson) *  |
Date: 2009-01-19 10:29 |
note, this has been ported to Py3k in http://svn.python.org/view?
view=rev&rev=68458
|
|
Date |
User |
Action |
Args |
2022-04-11 14:56:41 | admin | set | github: 48586 |
2009-01-19 10:29:43 | kristjan.jonsson | set | keywords:
patch, patch, easy messages:
+ msg80153 |
2009-01-09 20:27:40 | kristjan.jonsson | set | status: open -> closed resolution: accepted messages:
+ msg79502 keywords:
patch, patch, easy |
2009-01-08 10:49:01 | kristjan.jonsson | set | status: pending -> open keywords:
patch, patch, easy messages:
+ msg79407 |
2008-12-08 15:34:59 | kristjan.jonsson | set | keywords:
patch, patch, easy files:
+ htmllibclients.patch messages:
+ msg77310 |
2008-12-08 15:33:56 | kristjan.jonsson | set | keywords:
patch, patch, easy files:
+ httplib.patch messages:
+ msg77309 |
2008-12-05 15:06:39 | kristjan.jonsson | set | keywords:
patch, patch, easy messages:
+ msg77024 |
2008-12-01 15:26:20 | jhylton | set | keywords:
patch, patch, easy messages:
+ msg76697 |
2008-12-01 10:32:02 | hodgestar | set | nosy:
+ hodgestar |
2008-12-01 10:13:03 | kristjan.jonsson | set | keywords:
patch, patch, easy files:
+ xmlrpc_1.patch messages:
+ msg76682 |
2008-11-29 01:52:25 | jhylton | set | messages:
+ msg76578 |
2008-11-28 10:53:26 | kristjan.jonsson | set | keywords:
patch, patch, easy files:
+ xmlrpc.patch messages:
+ msg76523 |
2008-11-27 16:06:56 | kristjan.jonsson | set | keywords:
patch, patch, easy files:
+ xmlrpc.patch messages:
+ msg76502 |
2008-11-26 21:53:58 | jhylton | set | messages:
+ msg76489 |
2008-11-25 11:15:33 | kristjan.jonsson | set | keywords:
patch, patch, easy messages:
+ msg76407 |
2008-11-25 11:13:39 | kristjan.jonsson | set | keywords:
patch, patch, easy files:
+ xmlrpc.patch messages:
+ msg76406 |
2008-11-24 18:02:56 | vstinner | set | keywords:
patch, patch, easy nosy:
+ vstinner |
2008-11-24 17:59:11 | jhylton | set | messages:
+ msg76349 |
2008-11-24 17:47:28 | jhylton | set | keywords:
patch, patch, easy messages:
+ msg76346 |
2008-11-24 17:32:14 | jhylton | set | status: open -> pending assignee: jhylton messages:
+ msg76343 keywords:
patch, patch, easy nosy:
+ jhylton |
2008-11-21 10:52:53 | kristjan.jonsson | set | keywords:
patch, patch, easy messages:
+ msg76174 |
2008-11-17 14:33:40 | kristjan.jonsson | create | |