classification
Title: Use bytes + memoryview + resize instead of bytesarray + array in io.RawIOBase.read
Type: performance Stage: patch review
Components: IO Versions: Python 3.8
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: benjamin.peterson, inada.naoki, martin.panter, serhiy.storchaka, stutzbach, tzickel
Priority: normal Keywords: patch

Created on 2018-11-10 17:23 by tzickel, last changed 2019-04-08 13:00 by inada.naoki.

Pull Requests
URL Status Linked Edit
PR 10451 open tzickel, 2018-11-10 17:27
Messages (8)
msg329629 - (view) Author: (tzickel) * Date: 2018-11-10 17:23
There was a TODO in the code about this:

https://github.com/python/cpython/blob/e42b705188271da108de42b55d9344642170aa2b/Modules/_io/iobase.c#L909
msg329630 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-11-10 17:36
What if readinto() save a reference to its argument?
msg329631 - (view) Author: (tzickel) * Date: 2018-11-10 17:43
How is that different from the situation today ? The bytearray passed to readinto() is deleted before the function ends.

This revision simply changes 2 mallocs and a memcpy to 1 malloc and a potential realloc.
msg329632 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-11-10 17:58
When you remove the original reference, the second reference keeps the bytearray object live. When you remove or just resize the bytes object, the memoryview object becomes referring to freed memory.
msg329638 - (view) Author: (tzickel) * Date: 2018-11-10 19:22
I think that if someone tries that this code will raise an exception at the resize part (since the reference will be higher than one), a check can be added and in this case fallback to the previous behaviour, If it's a required check, I can add it.
msg329646 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2018-11-10 22:29
Looks like this is about about making “RawIOBase.read” delegate to “readinto” with a “bytes” object. If so, there’s more discussion in Issue 15903.
msg329668 - (view) Author: (tzickel) * Date: 2018-11-11 07:42
ahh, very interesting discussion. BTW, how is this code different than

https://github.com/python/cpython/blame/50ff02b43145f33f8e28ffbfcc6a9d15c4749a64/Modules/_io/bufferedio.c

which does the same thing exactly ? (i.e. the memoryview can leak there as well).
msg339633 - (view) Author: Inada Naoki (inada.naoki) * (Python committer) Date: 2019-04-08 13:00
Maybe, we need C version of memoryview.release() to invalidate pointer in memoryview object.
History
Date User Action Args
2019-04-08 13:00:44inada.naokisetnosy: + inada.naoki
messages: + msg339633
2018-11-11 07:42:53tzickelsetmessages: + msg329668
2018-11-10 22:29:38martin.pantersetnosy: + martin.panter
messages: + msg329646
2018-11-10 19:22:10tzickelsetmessages: + msg329638
2018-11-10 17:58:37serhiy.storchakasetmessages: + msg329632
2018-11-10 17:43:34tzickelsetmessages: + msg329631
2018-11-10 17:36:38serhiy.storchakasetnosy: + serhiy.storchaka
messages: + msg329630
2018-11-10 17:30:15tzickelsetnosy: + benjamin.peterson, stutzbach
2018-11-10 17:27:23tzickelsetkeywords: + patch
stage: patch review
pull_requests: + pull_request9726
2018-11-10 17:23:52tzickelcreate