
classification
Title: Use sendfile where possible in httplib
Type: performance
Stage: patch review
Components: Library (Lib)
Versions: Python 3.10
process
Status: open
Resolution:
Dependencies: 17552, 23740
Superseder:
Assigned To:
Nosy List: Alex.Willmer, benjamin.peterson, christian.heimes, eric.araujo, giampaolo.rodola, kasun, martin.panter, orsenthil, rosslagerwall
Priority: normal
Keywords: patch

Created on 2011-12-08 20:47 by benjamin.peterson, last changed 2022-04-11 14:57 by admin.

Files
File name Uploaded Description Edit
httplib-sendfile.patch giampaolo.rodola, 2014-06-11 12:57 review
Messages (13)
msg149052 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2011-12-08 20:47
HTTPConnection.send() should use os.sendfile when possible to avoid copying data into userspace and back.
msg149073 - (view) Author: Giampaolo Rodola' (giampaolo.rodola) * (Python committer) Date: 2011-12-09 04:52
This is not possible for two reasons:

- on most POSIX systems, sendfile() works with mmap-like ("regular") files only, while HTTPConnection.send() accepts any file-like object as long as it provides a read() method

- after read()ing a chunk of data from the file and before send()ing it over the socket, the data can be subject to an intermediate conversion (datablock.encode("iso-8859-1")):
http://hg.python.org/cpython/file/87c6be1e393a/Lib/http/client.py#l839
...whereas sendfile() can only be used to send a binary file "as-is"

I think we can use sendfile() in ftplib.py, though.
I'll open a ticket for that.
msg149075 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2011-12-09 06:08
2011/12/8 Giampaolo Rodola' <report@bugs.python.org>:
>
> Giampaolo Rodola' <g.rodola@gmail.com> added the comment:
>
> This is not possible for two reasons:
>
> - on most POSIX systems, sendfile() works with mmap-like ("regular") files only, while HTTPConnection.send() accepts any file-like object as long as it provides a read() method
>
> - after read()ing a chunk of data from the file and before send()ing it over the socket, the data can be subject to an intermediate conversion (datablock.encode("iso-8859-1")):
> http://hg.python.org/cpython/file/87c6be1e393a/Lib/http/client.py#l839
> ...whereas sendfile() can only be used to send a binary file "as-is"

I presume you could check for a binary mode, though? Also, you can
catch EINVAL on invalid fds.
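
A minimal sketch of the two checks suggested above (binary mode, and falling back when sendfile() reports EINVAL). The helper names is_sendfile_candidate() and send_file() are hypothetical and are not taken from http.client or from any patch on this issue. The sketch assumes a blocking socket, and the mode test is only a heuristic: a wrapper object that exposes fileno() and a binary mode string could still pass it.

```python
import errno
import os
import stat


def is_sendfile_candidate(fileobj):
    """Heuristic: True if fileobj looks like a regular file opened in binary mode."""
    try:
        fd = fileobj.fileno()
    except (AttributeError, OSError):
        # io.UnsupportedOperation subclasses OSError, so BytesIO ends up here.
        return False
    mode = getattr(fileobj, "mode", "")
    if not isinstance(mode, str) or "b" not in mode:
        return False
    return stat.S_ISREG(os.fstat(fd).st_mode)


def send_file(sock, fileobj, blocksize=8192):
    """Send fileobj over a blocking socket; return the number of bytes sent."""
    total = 0
    if hasattr(os, "sendfile") and is_sendfile_candidate(fileobj):
        start = fileobj.tell()
        try:
            while True:
                sent = os.sendfile(sock.fileno(), fileobj.fileno(),
                                   start + total, blocksize)
                if sent == 0:  # reached end of file
                    fileobj.seek(start + total)
                    return total
                total += sent
        except OSError as exc:
            if exc.errno != errno.EINVAL:
                raise
            # sendfile() rejected these descriptors: resume with read()/sendall().
            fileobj.seek(start + total)
    while True:
        chunk = fileobj.read(blocksize)
        if not chunk:
            return total
        if isinstance(chunk, str):
            # Same intermediate conversion the discussion above refers to.
            chunk = chunk.encode("iso-8859-1")
        sock.sendall(chunk)
        total += len(chunk)
```

Objects without a usable file descriptor (for example io.BytesIO) and text-mode files skip the sendfile() branch entirely and go through the read() loop.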
msg149077 - (view) Author: Giampaolo Rodola' (giampaolo.rodola) * (Python committer) Date: 2011-12-09 07:00
ftplib's sendfile support is now tracked as issue13559.
Considerations I made there should apply here as well.
msg149078 - (view) Author: Giampaolo Rodola' (giampaolo.rodola) * (Python committer) Date: 2011-12-09 07:01
Oops! I meant issue13564.
msg220262 - (view) Author: Giampaolo Rodola' (giampaolo.rodola) * (Python committer) Date: 2014-06-11 12:57
Patch in attachment uses the newly added socket.sendfile() method (issue 17552).
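
For readers unfamiliar with socket.sendfile(), the following is a small illustration of the call itself, not an excerpt from the attached patch. The helper names send_file_body() and roundtrip() are hypothetical, and the demo assumes a payload small enough to fit in the socket buffer, since nothing reads from the receiving end until the send has finished.

```python
import socket
import tempfile


def send_file_body(sock, fileobj):
    """Send a binary-mode file over a connected stream socket; return bytes sent."""
    # socket.sendfile() uses os.sendfile() where it can and otherwise
    # falls back to plain send() calls.
    return sock.sendfile(fileobj)


def roundtrip(payload):
    """Write payload to a temporary file, send it over a socket pair, read it back."""
    sender, receiver = socket.socketpair()
    with sender, receiver, tempfile.TemporaryFile() as f:
        f.write(payload)
        f.flush()
        f.seek(0)
        sent = send_file_body(sender, f)
        sender.shutdown(socket.SHUT_WR)  # signal end of data to the receiver
        chunks = []
        while chunk := receiver.recv(4096):
            chunks.append(chunk)
    return sent, b"".join(chunks)
```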
msg250435 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-09-11 02:13
The multiple personalities of HTTPConnection.send() and friends are a bit of a can of worms. I suggest working on Issue 23740 to get an idea of what kinds of file objects are meant to be supported, and what things may work by accident and be used in the real world.

For instance, is it possible to manually set Content-Length and, say, supply a GzipFile reader, or a file object positioned halfway through the file? How does this interact with the socket.sendfile() call?
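
To make the "positioned halfway through the file" case concrete: socket.sendfile() accepts explicit offset and count arguments, so one possible approach is to pass the current position and the remaining length explicitly. This is an assumption about how a caller might handle it, not something the attached patch is known to do, and the helper name send_remaining() is hypothetical. It only applies to real files with a descriptor; a GzipFile reader would still need the read() path.

```python
import os
import socket
import tempfile


def send_remaining(sock, fileobj):
    """Send the bytes from fileobj's current position to EOF; return bytes sent."""
    start = fileobj.tell()
    remaining = os.fstat(fileobj.fileno()).st_size - start
    if remaining <= 0:
        # socket.sendfile() rejects count <= 0, so there is nothing to send.
        return 0
    # `remaining` is the value a caller would use for Content-Length here.
    return sock.sendfile(fileobj, offset=start, count=remaining)


def demo(payload, position):
    """Send payload[position:] over a socket pair and return (sent, received)."""
    sender, receiver = socket.socketpair()
    with sender, receiver, tempfile.TemporaryFile() as f:
        f.write(payload)
        f.flush()
        f.seek(position)
        sent = send_remaining(sender, f)
        sender.shutdown(socket.SHUT_WR)
        chunks = []
        while chunk := receiver.recv(4096):
            chunks.append(chunk)
    return sent, b"".join(chunks)
```

Passing the offset explicitly avoids relying on the default offset of 0, which would otherwise ignore where the caller left the file position.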
msg348634 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-07-29 11:46
This issue is not newcomer friendly; I am removing the "easy" keyword.
msg387699 - (view) Author: Alex Willmer (Alex.Willmer) * Date: 2021-02-26 01:40
I would like to take a stab at this. Giampaolo, would it be okay if I made a pull request updated from your patch? With the appropriate "Co-authored-by: Author Name <email_address>" line.
msg387729 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2021-02-26 16:04
Alex, https://bugs.python.org/issue23740 is identified as a dependency of this issue. We will have to resolve that first and then come back to this. And yes, if you build on someone else's patch, both contributions will be included and appropriately credited.
msg387751 - (view) Author: Alex Willmer (Alex.Willmer) * Date: 2021-02-26 22:37
To check my understanding:

Is the motivation for the dependency closer to

1. using sendfile() will break $X, and we know X
2. there's high probability sendfile() will break something
3. there's unknown probability sendfile() will break something
4. there's low probability sendfile() will break something, but it is still too high
5. any non-trivial change here is too risky, regardless of sendfile()
6. something else?

My guess is 5.
msg387754 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2021-02-26 23:27
Yes, point number 5. We will have to evaluate whether sendfile() side-steps the issues noted in issue23740.
msg388346 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2021-03-09 10:40
sendfile() only works for plain HTTP. For technical reasons it does not work for HTTPS (*). These days the majority of services use HTTPS. Therefore the usefulness of a sendfile() patch is minimal.

(*) It is possible to use sendfile() for TLS connections, but the feature requires a kernel module that provides the kTLS offloading feature, https://www.kernel.org/doc/html/latest/networking/tls-offload.html . In user space it requires OpenSSL 3.0.0 with kTLS support. OpenSSL 3.0.0 is currently under development.
History
Date  User  Action  Args
2022-04-11 14:57:24  admin  set  github: 57768
2021-03-09 10:40:33  christian.heimes  set  nosy: + christian.heimes
    messages: + msg388346
    versions: + Python 3.10, - Python 3.5
2021-03-08 20:10:04  vstinner  set  nosy: - vstinner
2021-02-26 23:27:53  orsenthil  set  messages: + msg387754
2021-02-26 22:37:45  Alex.Willmer  set  messages: + msg387751
2021-02-26 16:04:01  orsenthil  set  messages: + msg387729
2021-02-26 01:40:33  Alex.Willmer  set  nosy: + Alex.Willmer
    messages: + msg387699
2019-07-29 11:46:27  vstinner  set  keywords: - easy
    nosy: + vstinner
    messages: + msg348634
2015-09-11 02:13:11  martin.panter  set  nosy: + martin.panter
    messages: + msg250435
    dependencies: + http.client request and send method have some datatype issues
    stage: needs patch -> patch review
2014-06-11 12:57:54  giampaolo.rodola  set  files: + httplib-sendfile.patch
    type: enhancement -> performance
    messages: + msg220262
    keywords: + patch
2014-04-28 10:08:25  giampaolo.rodola  set  dependencies: + Add a new socket.sendfile() method
    versions: + Python 3.5, - Python 3.3
2011-12-11 17:20:05  kasun  set  nosy: + kasun
2011-12-10 16:33:15  eric.araujo  set  nosy: + eric.araujo
2011-12-10 13:33:00  rosslagerwall  set  nosy: + rosslagerwall
2011-12-09 07:01:52  giampaolo.rodola  set  messages: + msg149078
2011-12-09 07:00:54  giampaolo.rodola  set  messages: + msg149077
2011-12-09 06:08:50  benjamin.peterson  set  messages: + msg149075
2011-12-09 04:52:30  giampaolo.rodola  set  messages: + msg149073
2011-12-08 20:48:32  pitrou  set  nosy: + orsenthil, giampaolo.rodola
2011-12-08 20:47:25  benjamin.peterson  create