Title: Why does http.client.HTTPResponse._safe_read use MAXAMOUNT
Type: performance Stage: resolved
Components: Library (Lib) Versions: Python 3.8
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: bmerry, inada.naoki, martin.panter
Priority: normal Keywords: patch

Created on 2019-02-20 12:55 by bmerry, last changed 2019-04-06 09:06 by inada.naoki. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 12698 merged inada.naoki, 2019-04-05 12:32
Messages (5)
msg336081 - (view) Author: Bruce Merry (bmerry) * Date: 2019-02-20 12:55
While investigating poor HTTP read performance I discovered that reading all the data from a response with a content-length goes via _safe_read, which in turn reads in chunks of at most MAXAMOUNT (1MB) before stitching them together with b"".join. This can really hurt performance for responses larger than MAXAMOUNT, because
(a) the data has to be copied an additional time; and
(b) the join operation doesn't drop the GIL, so this limits multi-threaded scaling.

I'm struggling to see any advantage in this chunking: it doesn't save memory either (in fact it wastes memory, since the chunks and the joined result briefly coexist).

To give an idea of the performance impact, changing MAXAMOUNT to a very large value made a multithreaded test of mine go from 800MB/s to 2.5GB/s (which is limited by the network speed).
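For illustration (a simplified sketch, not the stdlib's exact code; `chunked_read` is a hypothetical name), the chunk-and-join pattern described above looks roughly like this:

```python
import io

MAXAMOUNT = 1 << 20  # 1 MiB, the limit in http.client at the time

def chunked_read(fp, amt):
    # Read at most MAXAMOUNT bytes per call, collecting chunks.
    chunks = []
    while amt > 0:
        chunk = fp.read(min(amt, MAXAMOUNT))
        if not chunk:
            raise EOFError("incomplete read")
        chunks.append(chunk)
        amt -= len(chunk)
    # The join copies the entire payload a second time, and it
    # holds the GIL for the duration of the copy.
    return b"".join(chunks)

payload = b"x" * (3 * MAXAMOUNT)
result = chunked_read(io.BytesIO(payload), len(payload))
```

For a response larger than MAXAMOUNT, the final b"".join is where the extra copy and the GIL contention come from.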
msg336498 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2019-02-25 05:15
The 1 MiB limit was added for Issue 1296004; apparently some platforms were overallocating multiple buffers and running out of memory. I suspect the loop in "_safe_read" was inherited from Python 2, which has different kinds of file objects. The comments suggest it does partial reads.

But the Python 3 code calls "socket.makefile" with "buffering" mode enabled by default. I agree it should be safe to read the total length in one go.
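As a quick illustration (my own usage example, not code from the patch): a buffered file object created by socket.makefile can return the full requested length in a single read() call, with no manual chunking loop:

```python
import socket

a, b = socket.socketpair()
b.sendall(b"x" * 70000)
b.close()  # EOF lets the buffered read complete

f = a.makefile("rb")   # buffering is enabled by default
data = f.read(70000)   # one call returns the whole body
f.close()
a.close()
```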
msg339496 - (view) Author: Inada Naoki (inada.naoki) * (Python committer) Date: 2019-04-05 11:57
Issue 1296004 is too old (a machine with 512 MB of RAM!) and I cannot reproduce it.

But I think it was caused by inefficient realloc() in the CRT.


_fileobject called socket.recv with the remaining size.
Typically, a socket can't return several MBs at once, so this caused:

1. A large string (bytes object) of up to `amt` (several MBs) is allocated. (malloc)
2. recv is called.
3. _PyString_Resize() (realloc) shrinks it to the received size (typically ~128 KiB).
4. amt -= received
5. if amt == 0: exit; else goto 1.

This could stress malloc and realloc in the CRT, causing fragmentation and MemoryError.


By now, almost all of this code has been rewritten.

In the case of _pyio, BufferedReader copies from a bytearray to bytes, so only malloc and free are called. The stress on realloc is reduced.

In the case of the C _io module, it is even more efficient: it allocates a PyBytes once and calls SocketIO.readinto directly. No temporary bytes objects are created.
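A Python-level sketch of the allocate-once-and-fill approach (mirroring the C _io behaviour described above; `read_exact` is a hypothetical helper, not a stdlib function):

```python
import io

def read_exact(fp, amt):
    buf = bytearray(amt)             # single allocation, never realloc'd
    view = memoryview(buf)
    pos = 0
    while pos < amt:
        n = fp.readinto(view[pos:])  # fills the buffer in place
        if not n:
            raise EOFError("incomplete read")
        pos += n
    return bytes(buf)                # one final copy to immutable bytes

data = read_exact(io.BytesIO(b"abc" * 1000), 3000)
```

No temporary bytes objects are created inside the loop, so the allocator sees one malloc up front and one copy at the end.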
msg339498 - (view) Author: Inada Naoki (inada.naoki) * (Python committer) Date: 2019-04-05 12:12
Additionally, _safe_read loops over multiple calls to handle EINTR.
But EINTR is now handled by the socket module itself (PEP 475).

Now the function can be very simple.
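With buffered reads and PEP 475 retries in place, the simplified _safe_read can be sketched as follows (an approximation of the idea, not necessarily the merged code, which raises http.client.IncompleteRead rather than EOFError):

```python
import io

def safe_read(fp, amt):
    # With a buffered file object, read(amt) returns amt bytes unless
    # EOF is hit first; EINTR retries happen inside the socket module
    # (PEP 475), so no chunking or retry loop is needed here.
    data = fp.read(amt)
    if len(data) < amt:
        raise EOFError("got %d of %d expected bytes" % (len(data), amt))
    return data

out = safe_read(io.BytesIO(b"hello world"), 5)
```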
msg339532 - (view) Author: Inada Naoki (inada.naoki) * (Python committer) Date: 2019-04-06 09:06
New changeset d6bf6f2d0c83f0c64ce86e7b9340278627798090 by Inada Naoki in branch 'master':
bpo-36050: optimize (GH-12698)
Date User Action Args
2019-04-06 09:06:31 inada.naoki set status: open -> closed
resolution: fixed
stage: patch review -> resolved
2019-04-06 09:06:21 inada.naoki set messages: + msg339532
2019-04-05 12:32:44 inada.naoki set keywords: + patch
stage: patch review
pull_requests: + pull_request12623
2019-04-05 12:12:46 inada.naoki set messages: + msg339498
2019-04-05 11:57:37 inada.naoki set type: performance
messages: + msg339496
versions: + Python 3.8, - Python 3.7
2019-04-05 10:51:14 inada.naoki set nosy: + inada.naoki
2019-02-25 05:15:49 martin.panter set nosy: + martin.panter
messages: + msg336498
2019-02-20 12:55:43 bmerry create