
Author bmerry
Recipients bmerry
Date 2020-06-17.12:08:15
Message-id <1592395696.11.0.156333341881.issue41002@roundup.psfhosted.org>
Content
I've run into this on 3.8, but the code on Git master doesn't look significantly different so I assume it still applies. I'm happy to work on a PR for this.

When http.client.HTTPResponse.read is called with a specific amount to read, it goes down this code path:
```
if amt is not None:
    # Amount is given, implement using readinto
    b = bytearray(amt)
    n = self.readinto(b)
    return memoryview(b)[:n].tobytes()
```
That's pretty inefficient, because
- `bytearray(amt)` will first zero-fill some memory
- `tobytes()` will make an extra copy of this memory
- if amt is big enough, it'll cause the temporary memory to be allocated from the kernel, which will *also* zero-fill the pages for security.

A better approach would be to call the `read` method of the underlying `fp` directly, which skips both the `bytearray` zero-fill and the `tobytes()` copy.
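A minimal sketch of that idea follows. `SketchResponse` is a hypothetical stand-in, not the real `http.client.HTTPResponse`: it models only an `fp` attribute and a remaining-`length` counter, and it ignores chunked encoding, connection closing and error handling.

```python
import io


class SketchResponse:
    """Hypothetical stand-in for HTTPResponse (assumption, not the real class)."""

    def __init__(self, fp, length):
        self.fp = fp          # underlying buffered file object
        self.length = length  # bytes left in the body, or None if unknown

    def read(self, amt):
        # Never ask for more than the body still holds.
        if self.length is not None and amt > self.length:
            amt = self.length
        # Delegate to the underlying file object: no bytearray(amt)
        # zero-fill and no tobytes() copy on this path.
        data = self.fp.read(amt)
        if self.length is not None:
            self.length -= len(data)
        return data


# The trailing b"NEXT" stands for bytes that belong to a later response.
resp = SketchResponse(io.BytesIO(b"hello world" + b"NEXT"), length=11)
first = resp.read(5)
rest = resp.read(100)
after_end = resp.read(10)
```

The clamp against `length` is what keeps the read from running past the end of the body into whatever follows on the connection.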

I have a micro-benchmark (that I'll attach) showing that, when reading an entire 1GB body, passing the amount explicitly reduces throughput from about 3GB/s to about 1GB/s compared with calling read with no amount.
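For anyone who wants to see the effect without a server, here is a rough sketch (not the attached benchmark) that times the two strategies against an in-memory `io.BytesIO`. The helper names and the 16 MiB payload size are assumptions made for illustration. Because no sockets or HTTP parsing are involved, the absolute figures will not match the ones reported here; it only isolates the cost of the zero-fill and the extra copy.

```python
import io
import statistics
import time


def via_readinto(fp, amt):
    """Same shape as the current code path: zero-filled buffer plus a copy."""
    b = bytearray(amt)
    n = fp.readinto(b)
    return memoryview(b)[:n].tobytes()


def direct(fp, amt):
    """Read straight from the underlying file object."""
    return fp.read(amt)


def throughput_mb_s(read_fn, payload, repeats=5):
    """Return (mean, stdev) throughput in MB/s for read_fn over payload."""
    rates = []
    for _ in range(repeats):
        fp = io.BytesIO(payload)
        start = time.perf_counter()
        data = read_fn(fp, len(payload))
        # Guard against a zero interval on very coarse clocks.
        elapsed = max(time.perf_counter() - start, 1e-9)
        assert len(data) == len(payload)
        rates.append(len(payload) / elapsed / 1e6)
    return statistics.mean(rates), statistics.stdev(rates)


payload = bytes(16 * 1024 * 1024)  # assumed size, small enough to run quickly
results = {
    "readinto": throughput_mb_s(via_readinto, payload),
    "direct": throughput_mb_s(direct, payload),
}
for name, (mean, dev) in results.items():
    print(f"{name}: {mean:.1f} ± {dev:.1f} MB/s")
```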

For some reason the requests library reads the body in 10KB chunks even if the user has requested the entire body, so this change will help there too (although the gains probably won't be as big, because 10KB is really too small to amortise all the accounting overhead).

Output from my benchmark, run against a 1GB file on localhost:

httpclient-read: 3019.0 ± 63.8 MB/s
httpclient-read-length: 1050.3 ± 4.8 MB/s
httpclient-read-raw: 3150.3 ± 5.3 MB/s
socket-read: 3134.4 ± 7.9 MB/s
History
Date                 User    Action  Args
2020-06-17 12:08:16  bmerry  set     recipients: + bmerry
2020-06-17 12:08:16  bmerry  set     messageid: <1592395696.11.0.156333341881.issue41002@roundup.psfhosted.org>
2020-06-17 12:08:15  bmerry  link    issue41002 messages
2020-06-17 12:08:15  bmerry  create