This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: http.client delayed ack / Nagle algorithm optimisation performs badly for large messages
Type: performance Stage: resolved
Components: Library (Lib) Versions: Python 3.4
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: bennoleslie, jcea, neologix, pitrou, python-dev
Priority: normal Keywords: patch

Created on 2013-01-01 09:36 by bennoleslie, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
http_opt.diff bennoleslie, 2013-01-01 09:36 Proposed fix based on python-dev discussion review
http_opt_v2.diff bennoleslie, 2013-01-01 23:07 review
Messages (9)
msg178728 - (view) Author: Benno Leslie (bennoleslie) * Date: 2013-01-01 09:36
he http.client HTTPConnection._send_output method has an optimization for avoiding bad interactions between delayed-ack and the Nagle algorithm:

http://hg.python.org/cpython/file/f32f67d26035/Lib/http/client.py#l884

Unfortunately this interacts rather poorly if the case where the message_body is a bytes instance and is rather large.

If the message_body is bytes it is appended to the headers, which causes a copy of the data. When message_body is large this duplication of data can cause a significant spike in memory usage.

(In my particular case I was uploading a 200MB file to 30 hosts at the same leading to memory spikes over 6GB.

[There is a short thread discussing this issue on python-dev; Subject: "http.client Nagle/delayed-ack optimization"; Date: Dec 15, 2012]
msg178731 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-01-01 11:53
Thanks for the patch. Perhaps our MSS value should be an upper bound of common values? Apparently for a localhost connection TCP_MAXSEG gives 16384 here (but I don't know if the http.client optimization is important for localhost connections).

Also, it would be nice to add an unit test in Lib/test/test_httplib.py.
msg178733 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2013-01-01 15:18
> Thanks for the patch. Perhaps our MSS value should be an upper bound of common values? Apparently for a localhost connection TCP_MAXSEG gives 16384 here (but I don't know if the http.client optimization is important for localhost connections).

According to http://en.wikipedia.org/wiki/Maximum_transmission_unit,
the MTU (from which the MSS is derived) for a wireless NIC is around
8000, and you can have jumbo frames on Ethernet up to 9000, so a value
of 2K might be too small (since the goal is to avoid sending a payload
less than MSS with its own send() syscall).
So I think a value like 16K should be OK (the downside to using a
large value is the extra copying, but OTOH it will incur less copying
than now).
Note that the ideal fix for this would be scatter-gather I/O with
sendmsg(): both the header and the payload could be sent in the same
segment, without any extra copying. But I'm not sure it's worth it
here.

A note on the comment:
"""
722 # TCP Maximum Segment Size (MSS) is a TCP stack parameter. There
723 # is no simple and efficient platform independent mechanism for
724 # determining the MSS, so instead a reasonable estimate is chosen.
"""

The MSS is not really a TCP stack parameter (i.e. it's not fixed by
the host like socket buffers, etc): it's usually negociated between
the hosts (and with path MTU discovery or IPv6 it depends on the path
MTU). That's why it's hard to guess ahead of time.
msg178781 - (view) Author: Benno Leslie (bennoleslie) * Date: 2013-01-01 23:07
I've updated the patch based on Charles-François and Antoine's feedback.

Primarily this increase the estimated MSS value to 16KiB.
Additionally tests are added and comments improved.

Thanks for the feedback.
msg178852 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2013-01-02 21:14
New changeset 002bf9857278 by Antoine Pitrou in branch 'default':
Issue #16833: In http.client.HTTPConnection, do not concatenate the request headers and body when the payload exceeds 16 KB, since it can consume more memory for no benefit.
http://hg.python.org/cpython/rev/002bf9857278
msg178853 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-01-02 21:15
Patch looks good to me, I've committed it. Thank you!
msg178854 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-01-02 21:16
By the way, could you sign a contributor agreement?
http://www.python.org/psf/contrib/

Thank you.
msg178917 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-01-03 06:57
Jesus, why are you reopening this issue?
msg178976 - (view) Author: Jesús Cea Avión (jcea) * (Python committer) Date: 2013-01-03 19:46
Sorry, firefox is playing with me. As usual :). Just writing about i in python-dev.

Thanks for the heads up.
History
Date User Action Args
2022-04-11 14:57:40adminsetgithub: 61037
2013-01-03 19:46:20jceasetstatus: open -> closed
resolution: fixed
messages: + msg178976

stage: patch review -> resolved
2013-01-03 06:57:41pitrousetmessages: + msg178917
2013-01-03 02:04:30jceasetstatus: closed -> open
nosy: + jcea

resolution: fixed -> (no value)
stage: resolved -> patch review
2013-01-02 21:16:03pitrousetmessages: + msg178854
2013-01-02 21:15:32pitrousetstatus: open -> closed
resolution: fixed
messages: + msg178853

stage: patch review -> resolved
2013-01-02 21:14:45python-devsetnosy: + python-dev
messages: + msg178852
2013-01-02 03:56:50Ramchandra Aptesettitle: httpc.lient delayed ack / Nagle algorithm optimisation performs badly for large messages -> http.client delayed ack / Nagle algorithm optimisation performs badly for large messages
2013-01-01 23:07:00bennolesliesetfiles: + http_opt_v2.diff

messages: + msg178781
2013-01-01 15:18:06neologixsetmessages: + msg178733
2013-01-01 11:53:23pitrousetversions: + Python 3.4, - Python 3.3
nosy: + neologix, pitrou

messages: + msg178731

stage: patch review
2013-01-01 09:36:02bennolesliecreate