classification
Title: Change blocksize in http.client to the value of resource.getpagesize
Type: Stage: resolved
Components: Library (Lib) Versions: Python 3.5
process
Status: closed Resolution: rejected
Dependencies: Superseder: Configurable blocksize in HTTP(S)Connection
View: 31945
Assigned To: Nosy List: Claudiu.Popa, Clay Gerrard, demian.brecht, martin.panter, vstinner
Priority: normal Keywords: patch

Created on 2014-06-17 15:40 by demian.brecht, last changed 2017-11-08 16:17 by berker.peksag. This issue is now closed.

Files
File name Uploaded Description Edit
set_blocksize_to_getpagesize.diff demian.brecht, 2014-06-17 15:40 review
issue21790.patch demian.brecht, 2014-06-20 14:49
Messages (8)
msg220839 - (view) Author: Demian Brecht (demian.brecht) * (Python triager) Date: 2014-06-17 15:40
When sending data, the blocksize is currently hardcoded to 8192. It should likely be set to the value of resource.getpagesize().
msg220887 - (view) Author: PCManticore (Claudiu.Popa) * (Python triager) Date: 2014-06-17 20:47
resource module is available only on Unix.
msg221093 - (view) Author: Demian Brecht (demian.brecht) * (Python triager) Date: 2014-06-20 14:49
Updated to mmap.PAGESIZE, which seems to be available on all systems.
msg221095 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-06-20 15:03
> When sending data, the blocksize is currently hardcoded to 8192.

Yeah, same value for io.DEFAULT_BUFFER_SIZE.

> It should likely be set to the value of resource.getpagesize().

Could you please elaborate? A page size is 4096 bytes on my Linux. So your page double the number of calls to read(), right?

shutil.copyfileobj() uses a buffer of 16 KB by default.

See also the issue #21679 which adds a private copy of the stat().st_blksize attribute in a FileIO object.
msg221120 - (view) Author: Demian Brecht (demian.brecht) * (Python triager) Date: 2014-06-20 19:56
I very well could be missing something here (and, admittedly, my OS knowledge is rusty at best), but for the most part, page sizes are 4096, except for SPARC, which is 8192.

> So your page double the number of calls to read(), right?

No. read() is called until EOF. I'm assuming that 8192 may have been used to accommodate worst case architecture?

I'd have to dig through the C side of things (which I haven't done yet) to see what's going on down at that level, but my assumption is that it's allocating 8192 bytes for each read. Should EOF be reached with <= 1 page filled, it'd result in a wasted page.

Far from the end of the world, but just something I noticed in passing.
msg292688 - (view) Author: Clay Gerrard (Clay Gerrard) Date: 2017-05-01 18:13
I don't think the default is bad/wrong so much as it's an inflexible magic number.

I think blocksize should be parameterized as an attribute on the connection - or at least a class attribute or module level constant so we can monkey patch it without having to create a subclass and redefine the *entire* send method (!?).

Not having a good string to pull at here is causing a non-trivial amount of pain:

https://github.com/sigmavirus24/requests-toolbelt/pull/84/files
https://github.com/sigmavirus24/requests-toolbelt/issues/75
https://webcache.googleusercontent.com/search?q=cache:WPSngjyUzg4J:https://technology.jana.com/2015/12/30/dont-use-python-httplib-or-any-library-that-uses-it-to-transfer-large-files-especially-to-china/+&cd=5&hl=en&ct=clnk&gl=us
https://bugs.launchpad.net/python-swiftclient/+bug/1671621
msg292708 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2017-05-02 01:26
For plain-text (non-SSL) HTTP uploads, perhaps using “socket.sendfile” would help: Issue 13559.

Another idea would be to expose and document the low-level socket (or SSL wrapper) and let the user copy data with whatever chunk size they desire.
msg305612 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2017-11-05 23:28
Issue 31945 proposes adding a “blocksize” parameter to HTTPConnection objects, so I suggest to closing in favour of that one.
History
Date User Action Args
2017-11-08 16:17:02berker.peksagsetstatus: open -> closed
stage: resolved
2017-11-05 23:28:20martin.pantersetsuperseder: Configurable blocksize in HTTP(S)Connection
resolution: rejected
messages: + msg305612
2017-05-02 01:26:19martin.pantersetnosy: + martin.panter
messages: + msg292708
2017-05-01 18:13:15Clay Gerrardsetnosy: + Clay Gerrard
messages: + msg292688
2014-06-20 19:56:49demian.brechtsetmessages: + msg221120
2014-06-20 15:03:11vstinnersetnosy: + vstinner
messages: + msg221095
2014-06-20 14:49:06demian.brechtsetfiles: + issue21790.patch

messages: + msg221093
2014-06-17 20:47:43Claudiu.Popasetnosy: + Claudiu.Popa
messages: + msg220887
2014-06-17 15:40:24demian.brechtcreate