Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Provide access to buffer of asyncio.StreamReader #76233

Closed
BruceMerry mannequin opened this issue Nov 16, 2017 · 8 comments
Closed

Provide access to buffer of asyncio.StreamReader #76233

BruceMerry mannequin opened this issue Nov 16, 2017 · 8 comments
Labels
3.8 only security fixes topic-asyncio type-feature A feature request or enhancement

Comments

@BruceMerry
Copy link
Mannequin

BruceMerry mannequin commented Nov 16, 2017

BPO 32052
Nosy @asvetlov, @1st1, @bmerry
Dependencies
  • bpo-32251: Add asyncio.BufferedProtocol
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2019-06-02.11:09:41.308>
    created_at = <Date 2017-11-16.19:09:48.313>
    labels = ['type-feature', '3.8', 'expert-asyncio']
    title = 'Provide access to buffer of asyncio.StreamReader'
    updated_at = <Date 2019-06-03.09:18:31.912>
    user = 'https://bugs.python.org/BruceMerry'

    bugs.python.org fields:

    activity = <Date 2019-06-03.09:18:31.912>
    actor = 'bmerry'
    assignee = 'none'
    closed = True
    closed_date = <Date 2019-06-02.11:09:41.308>
    closer = 'asvetlov'
    components = ['asyncio']
    creation = <Date 2017-11-16.19:09:48.313>
    creator = 'Bruce Merry'
    dependencies = ['32251']
    files = []
    hgrepos = []
    issue_num = 32052
    keywords = []
    message_count = 8.0
    messages = ['306397', '308770', '308772', '327613', '327648', '327657', '344273', '344393']
    nosy_count = 4.0
    nosy_names = ['asvetlov', 'yselivanov', 'Bruce Merry', 'bmerry']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue32052'
    versions = ['Python 3.8']

    @BruceMerry
    Copy link
    Mannequin Author

    BruceMerry mannequin commented Nov 16, 2017

    While asyncio.StreamReader.readuntil is an improvement on only having readline, it is still quite limited e.g. you cannot have multiple possible terminators. The real problem is that it's not possible to roll your own without accessing _underscore fields (other than by reading one byte at a time, which I'm guessing would be bad for performance). I'm not sure exactly what a public API to assist would look like, but I think the following would be a good start:

    1. A get_buffer method, that returns (self._buffer, self._eof); the caller must treat the buffer as readonly.
    2. A wait_for_data method to wait for the return value of get_buffer to change (basically like current _wait_for_data)
    3. Access to the _limit attribute.

    With that available, I think readuntil or more complex variants of it could be implemented externally using only the public interface (consumption of data from the buffer would be via readexactly rather than by messing with the buffer array directly).

    @BruceMerry BruceMerry mannequin added 3.7 (EOL) end of life topic-asyncio type-feature A feature request or enhancement labels Nov 16, 2017
    @asvetlov
    Copy link
    Contributor

    If the problm is in readuntil() functionality -- let's discuss th function improvement (in separate issue).

    Exposing streams internals is antipattern and very bad idea.
    I suggest closing the issue.

    Yury, what is your opinion?

    @1st1
    Copy link
    Member

    1st1 commented Dec 20, 2017

    I'd be more comfortable with the idea of exposing the buffer when we have BufferedProtocol. Let's wait on this one.

    @csabella csabella added 3.8 only security fixes and removed 3.7 (EOL) end of life labels Oct 12, 2018
    @1st1
    Copy link
    Member

    1st1 commented Oct 12, 2018

    So we have BufferedProtocol in 3.7; now we need to re-implement asyncio streams on top of it. But even after doing that I'm not that sure we want to expose the low-level buffer.

    @asvetlov
    Copy link
    Contributor

    Exposing internal buffer means committing on a new API contract forever.

    I feel a need for reacher read*() API but pretty sure that making internal buffer public is a bad idea. With BufferedProtocol it could be even worse: SLAB allocators can spit a buffer into several separate chunks.

    str.startswith() supports a tuple of separators, maybe we can do the same for streaming API

    @BruceMerry
    Copy link
    Mannequin Author

    BruceMerry mannequin commented Oct 13, 2018

    A sequence of possible terminators would cover my immediate use case and certainly be an improvement.

    To facilitate more general use cases without exposing implementation details, would it be practical and maintainable to have a "putback" method that prepends data to the buffer? It might not be fast in all cases (e.g. it might have to make a copy of what's still in the buffer), but possibly BufferedReader could detect the common case (putting back a suffix of what's just been read) and adjust its offsets into its internal buffer (although I'm not at all familiar with BufferedReader, so feel free to tell me I'm talking nonsense).

    @asvetlov
    Copy link
    Contributor

    asvetlov commented Jun 2, 2019

    Closing.
    Brief:

    1. Access to an internal buffer is not an option.
    2. Pushing data back to stream after fetching is not an option too: it kills almost any possible optimization and makes code overcomplicated. aiohttp used to have "unread_data()" API but we deprecated it and going to remove the method entirely. A wrapper around the stream with puback functionality is an option though unless it doesn't touch underlying stream implementation.
    3. Extending a set of reader operations is a good idea, please make a separate issue with a concrete proposal if needed.

    @asvetlov asvetlov closed this as completed Jun 2, 2019
    @bmerry
    Copy link
    Mannequin

    bmerry mannequin commented Jun 3, 2019

    Ok, I'll open a separate issue to allow a tuple of possible separators.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.8 only security fixes topic-asyncio type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    3 participants