Issue 32475
This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python Developer's Guide.
Created on 2018-01-01 08:31 by kata198, last changed 2022-04-11 14:58 by admin.
Files
File name | Uploaded | Description | Edit
---|---|---|---
Python_3_6_4_getbuffn.patch | kata198, 2018-01-01 08:31 | getbuffn additions to Python 3.6.4 |
Pull Requests
URL | Status | Linked | Edit
---|---|---|---
PR 7947 | open | josnyder, 2018-06-27 03:58 |
Messages (4)
msg309328 - (view) | Author: Tim Savannah (kata198) * | Date: 2018-01-01 08:31
Hello! This is my first time submitting to the Python bug tracker, so please bear with me if I miss or mess something.

A little relevant background: I'm an avid Python developer with many open-source projects. One of the projects I wrote and maintain is called "python-nonblock", which provides pure-Python non-blocking I/O methods. It is available at: https://github.com/kata198/python-nonblock

I'll only include the details relevant to this topic. One of the features python-nonblock provides is the nonblock_read function, which allows you to read from a stream whilst ensuring the operation does not block. It achieves this by basically following this pattern:

1. Call "select" on the stream and see if any data is available. If not, sleep and reiterate.
2. If there is data available, read a single byte from the stream and store it to return at the end.

It supports most streams and sockets which have a real fd backing them (and thus support "select"). There are a couple of reasons you may need to do this, e.g. certain interactive scenarios; I won't go into it too much.

The python-nonblock library also bundles a layer which sits on top of that method, called BackgroundRead. This interface launches a thread into the background, which reads blocks of arbitrary (user-provided) size into a variable on an object. So you could have a processing app which reads blocks of data from a source and processes them in the foreground whilst they continue to load up in the background.

That's all well and good, but now we get to the meat of the issue: for large sources of available data (like a file on disk), while this method of operation is effective, it is SLOW, due to the overhead of a select syscall and a trip through libpython for every single byte. This is especially true on a loaded system, as it makes us a prime candidate for the scheduler to preempt our task and context-switch us off the CPU!

I've been looking into ways to improve this, and seem to have struck gold.
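The two-step loop described above can be sketched roughly like this (a hypothetical simplification on POSIX, not the actual python-nonblock implementation; the helper name `nonblock_read` is a stand-in):

```python
import os
import select

def nonblock_read(fd, timeout=0.0):
    """Read whatever is immediately available from a file descriptor.

    Sketch of the two-step pattern: select() to check whether data is
    ready, then pull it off one byte at a time. This sketch returns as
    soon as select() reports nothing available; a real implementation
    would sleep and reiterate instead.
    """
    chunks = []
    while True:
        # Step 1: ask the kernel whether a read would block.
        ready, _, _ = select.select([fd], [], [], timeout)
        if not ready:
            break
        # Step 2: exactly one byte is guaranteed readable without blocking.
        byte = os.read(fd, 1)
        if not byte:  # EOF
            break
        chunks.append(byte)
    return b"".join(chunks)
```

The per-byte select() call in this loop is exactly the overhead the message complains about.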
On a standard Linux HDD filesystem, the I/O block size is 4096 bytes. So, thanks to readahead, on a non-fragmented file a read call for 1 byte will actually load up to 4096 bytes. libpython holds this extra data, and calls like read1 will return it if available, but it does not expose the count. Thus, libraries like mine can't take advantage of it, which means that for a filesystem I/O read on Linux, 4095 out of 4096 iterations of the two-step loop above are wasted effort.

So I've written an additional function in the C code for BufferedReader, "getbuffn", which returns the number of bytes currently buffered in libpython but not yet returned to the application, and I've modified python-nonblock (in the 4.0branch_getbuffn branch) with simple additions to take advantage of this extra information when available. So if we detect that there are 4095 bytes already read and pending, we know for certain we can grab all 4095 bytes at once without blocking, or even needing a call to select!

With the new pattern, for buffered streams that have getbuffn available, we can:

1. Select to see if data is available; if not, rest and reiterate.
2. Read a single byte off the stream.
3. Check getbuffn, and if it returns > 0, read that many bytes off the stream (guaranteed not to block).

The performance improvements are * MASSIVE * with this change in place. On a 5 MB file read from a VM which is running on an SSD, I average the following:

Loaded system, non-cached I/O:
* One-fell-swoop file.read(): 0.3 seconds
* getbuffn-patched Python with this implementation: 3.1 seconds
* Unpatched Python with this implementation: 41 to 55 seconds (avg min to avg max; ~44 average)

Unloaded system, cached I/O:
* One-fell-swoop file.read(): 0.0017 seconds
* getbuffn-patched Python with this implementation: 0.034 seconds
* Unpatched Python with this implementation: ~45 seconds (not as variable as the loaded system)

That's a 13,235% performance boost on just a 5 MB file, and the gap only grows as the size of the dataset increases. These gains are simply not possible without knowing the amount remaining in the buffer.

So I've attached a simple patch (only additions, no modifications to existing functions) against Python 3.6.4. The version of python-nonblock which supports this enhanced approach when available (with more details at the top of the README) can be found here: https://github.com/kata198/python-nonblock/tree/4.0branch_getbuffn

I've looked at the Python 2.7 code too, and it seems that with minimal effort this same functionality could be provided there as well.

So, I'm an experienced developer who is willing to put in the legwork. Is this something that could be merged upstream? If so, what steps would I need to take to make it so?

Thanks in advance,
Tim Savannah
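Since `getbuffn` never landed in CPython, the three-step pattern can only be approximated today. The sketch below (hypothetical helper name) substitutes `len(stream.peek())` for the proposed `getbuffn()`; unlike `getbuffn`, `peek()` may itself trigger one raw read when the buffer is empty, so it is not a true drop-in replacement:

```python
import io

def read_buffered_burst(stream):
    """Approximate the getbuffn pattern: after reading one byte, drain
    everything the BufferedReader already holds in a single read.

    len(stream.peek()) stands in for the proposed getbuffn(). Unlike
    getbuffn, peek() may issue one raw read if the buffer is empty, so
    real non-blocking code would still need select() around the first
    read (step 1 of the pattern is omitted here).
    """
    first = stream.read(1)        # step 2: read a single byte
    if not first:
        return b""                # EOF
    pending = len(stream.peek())  # step 3: bytes already buffered
    # Reading exactly `pending` bytes is served from the buffer and
    # cannot block on the underlying raw stream.
    return first + stream.read(pending)
```

With a 4096-byte readahead already sitting in the buffer, one call like this replaces up to 4096 select-plus-read iterations of the original loop.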
msg309334 - (view) | Author: Antoine Pitrou (pitrou) * | Date: 2018-01-01 16:54
Why don't you use BufferedReader.peek()?
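For reference, a quick illustration of what `peek()` exposes in CPython's BufferedReader (the documentation makes few guarantees about how much it returns):

```python
import io

# peek() returns currently buffered bytes without advancing the stream
# position, so its length reveals how much is already buffered.
buf = io.BufferedReader(io.BytesIO(b"abcdef"))
assert buf.read(2) == b"ab"    # fills the buffer, consumes two bytes
assert buf.peek() == b"cdef"   # the four bytes still buffered
assert buf.tell() == 2         # position unchanged by peek()
assert buf.read() == b"cdef"   # the peeked data is still readable
```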
msg310035 - (view) | Author: Martin Panter (martin.panter) * | Date: 2018-01-16 05:02
If I remember the implementation of “peek” right, it may do what you want. But the documentation doesn’t guarantee much about its behaviour; see Issue 5811.

Anyway, I agree that a “getbuffn” method (or property) would be nice. (Perhaps with a better name!) But please don’t add it to the abstract APIs like BufferedIOBase. It could break compatibility with third-party implementations, or make the API complicated with little benefit. Just extend the concrete APIs like BufferedReader.

Two other use cases where the “peek” implementation won’t help, but “getbuffn” would:

1. Issue 32561: Decide whether a non-blocking “read” call is possible, or if a background read (e.g. of a regular “disk” file) should be started instead.
2. Get the pending unread data before it is lost by calling “BufferedReader.detach”.
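Use case 2 can be made concrete with a small sketch: once `detach()` is called, whatever the buffer held is unrecoverable through the returned raw stream, so the only chance to save it is before detaching (here via `peek()`, since `getbuffn` does not exist):

```python
import io

raw = io.BytesIO(b"header\nbody bytes")
buf = io.BufferedReader(raw)

assert buf.read(7) == b"header\n"  # parsing step; the body is now buffered
pending = buf.peek()               # rescue the buffered data first...
raw_stream = buf.detach()          # ...because detach() abandons the buffer
assert pending == b"body bytes"
assert raw_stream.read() == b""    # raw position is already past the body
```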
msg320542 - (view) | Author: Josh Snyder (josnyder) * | Date: 2018-06-27 04:01
I've opened PR #7947 with another approach to this issue. In my use case, an HTTP client makes a request and uses buffered I/O to parse the response headers. I would like to hand off the response socket to an extension module for use with raw I/O syscalls. The BufferedReader likely already contains some of the response body, which I would like to access before handing the socket off.
History
Date | User | Action | Args
---|---|---|---
2022-04-11 14:58:56 | admin | set | github: 76656
2018-06-27 04:01:30 | josnyder | set | nosy: + josnyder; messages: + msg320542
2018-06-27 03:58:25 | josnyder | set | stage: patch review; pull_requests: + pull_request7556
2018-01-16 05:10:02 | martin.panter | link | issue32561 dependencies
2018-01-16 05:02:59 | martin.panter | set | nosy: + martin.panter; messages: + msg310035
2018-01-01 16:54:33 | pitrou | set | nosy: + pitrou; messages: + msg309334
2018-01-01 08:31:45 | kata198 | create |