Provide access to buffer of asyncio.StreamReader #76233

BruceMerry · 2017-11-16T19:09:48Z

BPO	32052
Nosy	@asvetlov, @1st1, @bmerry
Dependencies	bpo-32251: Add asyncio.BufferedProtocol

^{Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.}

Show more details

GitHub fields:

assignee = None
closed_at = <Date 2019-06-02.11:09:41.308>
created_at = <Date 2017-11-16.19:09:48.313>
labels = ['type-feature', '3.8', 'expert-asyncio']
title = 'Provide access to buffer of asyncio.StreamReader'
updated_at = <Date 2019-06-03.09:18:31.912>
user = 'https://bugs.python.org/BruceMerry'

bugs.python.org fields:

activity = <Date 2019-06-03.09:18:31.912>
actor = 'bmerry'
assignee = 'none'
closed = True
closed_date = <Date 2019-06-02.11:09:41.308>
closer = 'asvetlov'
components = ['asyncio']
creation = <Date 2017-11-16.19:09:48.313>
creator = 'Bruce Merry'
dependencies = ['32251']
files = []
hgrepos = []
issue_num = 32052
keywords = []
message_count = 8.0
messages = ['306397', '308770', '308772', '327613', '327648', '327657', '344273', '344393']
nosy_count = 4.0
nosy_names = ['asvetlov', 'yselivanov', 'Bruce Merry', 'bmerry']
pr_nums = []
priority = 'normal'
resolution = 'fixed'
stage = 'resolved'
status = 'closed'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue32052'
versions = ['Python 3.8']

BruceMerry · 2017-11-16T19:09:48Z

While asyncio.StreamReader.readuntil is an improvement on only having readline, it is still quite limited e.g. you cannot have multiple possible terminators. The real problem is that it's not possible to roll your own without accessing _underscore fields (other than by reading one byte at a time, which I'm guessing would be bad for performance). I'm not sure exactly what a public API to assist would look like, but I think the following would be a good start:

A get_buffer method, that returns (self._buffer, self._eof); the caller must treat the buffer as readonly.
A wait_for_data method to wait for the return value of get_buffer to change (basically like current _wait_for_data)
Access to the _limit attribute.

With that available, I think readuntil or more complex variants of it could be implemented externally using only the public interface (consumption of data from the buffer would be via readexactly rather than by messing with the buffer array directly).

asvetlov · 2017-12-20T18:48:17Z

If the problm is in readuntil() functionality -- let's discuss th function improvement (in separate issue).

Exposing streams internals is antipattern and very bad idea.
I suggest closing the issue.

Yury, what is your opinion?

1st1 · 2017-12-20T18:50:32Z

I'd be more comfortable with the idea of exposing the buffer when we have BufferedProtocol. Let's wait on this one.

1st1 · 2018-10-12T21:56:22Z

So we have BufferedProtocol in 3.7; now we need to re-implement asyncio streams on top of it. But even after doing that I'm not that sure we want to expose the low-level buffer.

asvetlov · 2018-10-13T10:21:57Z

Exposing internal buffer means committing on a new API contract forever.

I feel a need for reacher read*() API but pretty sure that making internal buffer public is a bad idea. With BufferedProtocol it could be even worse: SLAB allocators can spit a buffer into several separate chunks.

str.startswith() supports a tuple of separators, maybe we can do the same for streaming API

BruceMerry · 2018-10-13T16:14:00Z

A sequence of possible terminators would cover my immediate use case and certainly be an improvement.

To facilitate more general use cases without exposing implementation details, would it be practical and maintainable to have a "putback" method that prepends data to the buffer? It might not be fast in all cases (e.g. it might have to make a copy of what's still in the buffer), but possibly BufferedReader could detect the common case (putting back a suffix of what's just been read) and adjust its offsets into its internal buffer (although I'm not at all familiar with BufferedReader, so feel free to tell me I'm talking nonsense).

asvetlov · 2019-06-02T11:09:41Z

Closing.
Brief:

Access to an internal buffer is not an option.
Pushing data back to stream after fetching is not an option too: it kills almost any possible optimization and makes code overcomplicated. aiohttp used to have "unread_data()" API but we deprecated it and going to remove the method entirely. A wrapper around the stream with puback functionality is an option though unless it doesn't touch underlying stream implementation.
Extending a set of reader operations is a good idea, please make a separate issue with a concrete proposal if needed.

bmerry · 2019-06-03T09:18:32Z

Ok, I'll open a separate issue to allow a tuple of possible separators.

BruceMerry mannequin added 3.7 (EOL) end of life topic-asyncio type-feature A feature request or enhancement labels Nov 16, 2017

csabella added 3.8 only security fixes and removed 3.7 (EOL) end of life labels Oct 12, 2018

asvetlov closed this as completed Jun 2, 2019

ezio-melotti transferred this issue from another repository Apr 10, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Provide access to buffer of asyncio.StreamReader #76233

Provide access to buffer of asyncio.StreamReader #76233

BruceMerry mannequin commented Nov 16, 2017

BruceMerry mannequin commented Nov 16, 2017

asvetlov commented Dec 20, 2017

1st1 commented Dec 20, 2017

1st1 commented Oct 12, 2018

asvetlov commented Oct 13, 2018

BruceMerry mannequin commented Oct 13, 2018

asvetlov commented Jun 2, 2019

bmerry mannequin commented Jun 3, 2019

Provide access to buffer of asyncio.StreamReader #76233

Provide access to buffer of asyncio.StreamReader #76233

Comments

BruceMerry mannequin commented Nov 16, 2017

BruceMerry mannequin commented Nov 16, 2017

asvetlov commented Dec 20, 2017

1st1 commented Dec 20, 2017

1st1 commented Oct 12, 2018

asvetlov commented Oct 13, 2018

BruceMerry mannequin commented Oct 13, 2018

asvetlov commented Jun 2, 2019

bmerry mannequin commented Jun 3, 2019