Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

io.BufferedReader:peek() closes underlying file, breaking peek/read expectations #88853

Closed
liquidpele mannequin opened this issue Jul 20, 2021 · 9 comments
Closed

io.BufferedReader:peek() closes underlying file, breaking peek/read expectations #88853

liquidpele mannequin opened this issue Jul 20, 2021 · 9 comments
Labels
3.9 only security fixes topic-IO

Comments

@liquidpele
Copy link
Mannequin

liquidpele mannequin commented Jul 20, 2021

BPO 44687
Nosy @zooba, @miss-islington, @AngstyDuck, @liquidpele
PRs
  • bpo-44687: Fix unintended closing file error when peeking a BufferedReader object #28457
  • [3.10] bpo-44687: Ensure BufferedReader objects with unread buffers can peek even when the underlying file is closed (GH-28457) #28685
  • [3.9] bpo-44687: Ensure BufferedReader objects with unread buffers can peek even when the underlying file is closed (GH-28457) #28686
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2021-10-01.20:46:58.225>
    created_at = <Date 2021-07-20.20:42:34.702>
    labels = ['3.9', 'expert-IO']
    title = 'io.BufferedReader:peek() closes underlying file, breaking peek/read expectations'
    updated_at = <Date 2021-10-01.20:46:58.225>
    user = 'https://github.com/liquidpele'

    bugs.python.org fields:

    activity = <Date 2021-10-01.20:46:58.225>
    actor = 'steve.dower'
    assignee = 'none'
    closed = True
    closed_date = <Date 2021-10-01.20:46:58.225>
    closer = 'steve.dower'
    components = ['IO']
    creation = <Date 2021-07-20.20:42:34.702>
    creator = 'liquidpele'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 44687
    keywords = ['patch']
    message_count = 9.0
    messages = ['397907', '402512', '402581', '402591', '402604', '403028', '403029', '403032', '403034']
    nosy_count = 4.0
    nosy_names = ['steve.dower', 'miss-islington', 'AngstyDuck', 'liquidpele']
    pr_nums = ['28457', '28685', '28686']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue44687'
    versions = ['Python 3.9']

    @liquidpele
    Copy link
    Mannequin Author

    liquidpele mannequin commented Jul 20, 2021

    reecepeg@3c22fb7d6b37 Pulpmill % python3 --version
    Python 3.9.1

    When buffering a small file, calling peek() can read in the entire underlying thing and close it, so then following with a read() throws an error about the file being closed already even though peek should not have that effect.

    Reproducible Steps:

    >>> r = BufferedReader(requests.get("https://google.com", stream=True).raw)
    >>> r.peek(2)[:2]
    b'\x1f\x8b'
    >>> r.peek()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ValueError: peek of closed file
    >>> r.read()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ValueError: read of closed file

    However, in the case of a larger stream it appears to work since the underlying stream isn't closed yet:

    >>> r = BufferedReader(requests.get("https://amazon.com", stream=True).raw)
    >>> r.peek(2)[:2]
    b'\x1f\x8b'
    >>> r.peek(2)[:2]
    b'\x1f\x8b'

    This seems inconsistent at best. Best I can tell, the issue is here and needs to take the current buffer offset into account.
    https://github.com/python/cpython/blob/main/Modules/_io/bufferedio.c#L845

    @liquidpele liquidpele mannequin added 3.9 only security fixes topic-IO labels Jul 20, 2021
    @zooba
    Copy link
    Member

    zooba commented Sep 23, 2021

    Seems like it would be better to not check the file open/closed state on peek/read when there is still buffered data?

    @AngstyDuck
    Copy link
    Mannequin

    AngstyDuck mannequin commented Sep 24, 2021

    Yeap Steve, I was thinking the same thing. I've taken the buffered data into consideration when checking if the file is closed in my fix, feel free to take a look and let me know what you think :)

    @zooba
    Copy link
    Member

    zooba commented Sep 24, 2021

    That change makes me much happier :)

    Do you think we need a specific test added for this case? I'd guess we have plenty of coverage for the changed macro already, but clearly nothing that showed the issue before.

    @AngstyDuck
    Copy link
    Mannequin

    AngstyDuck mannequin commented Sep 25, 2021

    Awesome! To be honest I was very inclined to add test cases for this edge case, but while some existing test cases use native objects like io.BytesIO, the one that our edge case uses is HTTPResponse from urllib3.response.py. Should I still go ahead with writing the test cases in this case? I worry about creating new dependencies for the python repository.

    @zooba
    Copy link
    Member

    zooba commented Oct 1, 2021

    New changeset a450398 by AngstyDuck in branch 'main':
    bpo-44687: Ensure BufferedReader objects with unread buffers can peek even when the underlying file is closed (GH-28457)
    a450398

    @zooba
    Copy link
    Member

    zooba commented Oct 1, 2021

    Thanks for sticking through it! Sorry about the delays, but I think we ended up with a good fix.

    @miss-islington
    Copy link
    Contributor

    New changeset 2221207 by Miss Islington (bot) in branch '3.10':
    bpo-44687: Ensure BufferedReader objects with unread buffers can peek even when the underlying file is closed (GH-28457)
    2221207

    @zooba
    Copy link
    Member

    zooba commented Oct 1, 2021

    New changeset 6035d65 by Miss Islington (bot) in branch '3.9':
    bpo-44687: Ensure BufferedReader objects with unread buffers can peek even when the underlying file is closed (GH-28457)
    6035d65

    @zooba zooba closed this as completed Oct 1, 2021
    @zooba zooba closed this as completed Oct 1, 2021
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.9 only security fixes topic-IO
    Projects
    None yet
    Development

    No branches or pull requests

    2 participants