This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: io.BufferedReader:peek() closes underlying file, breaking peek/read expectations
Type: Stage: resolved
Components: IO Versions: Python 3.9
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: AngstyDuck, liquidpele, miss-islington, steve.dower
Priority: normal Keywords: patch

Created on 2021-07-20 20:42 by liquidpele, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 28457 merged AngstyDuck, 2021-09-19 17:09
PR 28685 merged miss-islington, 2021-10-01 20:11
PR 28686 merged miss-islington, 2021-10-01 20:11
Messages (9)
msg397907 - (view) Author: liquidpele (liquidpele) Date: 2021-07-20 20:42
3c22fb7d6b37">reecepeg@3c22fb7d6b37 Pulpmill % python3 --version
Python 3.9.1


When buffering a small file, calling peek() can read in the entire underlying thing and close it, so then following with a read() throws an error about the file being closed already even though peek should not have that effect. 

Reproducible Steps:

>>> r = BufferedReader(requests.get("https://google.com", stream=True).raw)
>>> r.peek(2)[:2]
b'\x1f\x8b'
>>> r.peek()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: peek of closed file
>>> r.read()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: read of closed file


However, in the case of a larger stream it appears to work since the underlying stream isn't closed yet:

>>> r = BufferedReader(requests.get("https://amazon.com", stream=True).raw)
>>> r.peek(2)[:2]
b'\x1f\x8b'
>>> r.peek(2)[:2]
b'\x1f\x8b'


This seems inconsistent at best. Best I can tell, the issue is here and needs to take the current buffer offset into account. 
https://github.com/python/cpython/blob/main/Modules/_io/bufferedio.c#L845
msg402512 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2021-09-23 17:17
Seems like it would be better to not check the file open/closed state on peek/read when there is still buffered data?
msg402581 - (view) Author: Jun De (AngstyDuck) * Date: 2021-09-24 19:04
Yeap Steve, I was thinking the same thing. I've taken the buffered data into consideration when checking if the file is closed in my fix, feel free to take a look and let me know what you think :)
msg402591 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2021-09-24 22:19
That change makes me much happier :)

Do you think we need a specific test added for this case? I'd guess we have plenty of coverage for the changed macro already, but clearly nothing that showed the issue before.
msg402604 - (view) Author: Jun De (AngstyDuck) * Date: 2021-09-25 04:58
Awesome! To be honest I was very inclined to add test cases for this edge case, but while some existing test cases use native objects like io.BytesIO, the one that our edge case uses is HTTPResponse from `urllib3.response.py`. Should I still go ahead with writing the test cases in this case? I worry about creating new dependencies for the python repository.
msg403028 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2021-10-01 20:11
New changeset a450398933d265011e1e8eae7f771b70f97945fb by AngstyDuck in branch 'main':
bpo-44687: Ensure BufferedReader objects with unread buffers can peek even when the underlying file is closed (GH-28457)
https://github.com/python/cpython/commit/a450398933d265011e1e8eae7f771b70f97945fb
msg403029 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2021-10-01 20:12
Thanks for sticking through it! Sorry about the delays, but I think we ended up with a good fix.
msg403032 - (view) Author: miss-islington (miss-islington) Date: 2021-10-01 20:30
New changeset 2221207f44c2abd34406ee590e6b13afaca53ee8 by Miss Islington (bot) in branch '3.10':
bpo-44687: Ensure BufferedReader objects with unread buffers can peek even when the underlying file is closed (GH-28457)
https://github.com/python/cpython/commit/2221207f44c2abd34406ee590e6b13afaca53ee8
msg403034 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2021-10-01 20:46
New changeset 6035d650a322cec9619b306af2a877f3cead1580 by Miss Islington (bot) in branch '3.9':
bpo-44687: Ensure BufferedReader objects with unread buffers can peek even when the underlying file is closed (GH-28457)
https://github.com/python/cpython/commit/6035d650a322cec9619b306af2a877f3cead1580
History
Date User Action Args
2022-04-11 14:59:47adminsetgithub: 88853
2021-10-01 20:46:58steve.dowersetstatus: open -> closed
resolution: fixed
stage: patch review -> resolved
2021-10-01 20:46:31steve.dowersetmessages: + msg403034
2021-10-01 20:30:23miss-islingtonsetmessages: + msg403032
2021-10-01 20:12:04steve.dowersetmessages: + msg403029
2021-10-01 20:11:21miss-islingtonsetpull_requests: + pull_request27050
2021-10-01 20:11:17miss-islingtonsetnosy: + miss-islington
pull_requests: + pull_request27049
2021-10-01 20:11:11steve.dowersetmessages: + msg403028
2021-09-25 04:58:31AngstyDucksetmessages: + msg402604
2021-09-24 22:19:23steve.dowersetmessages: + msg402591
2021-09-24 19:04:07AngstyDucksetmessages: + msg402581
2021-09-23 17:17:09steve.dowersetnosy: + steve.dower
messages: + msg402512
2021-09-19 17:09:42AngstyDucksetkeywords: + patch
nosy: + AngstyDuck

pull_requests: + pull_request26859
stage: patch review
2021-07-20 20:42:34liquidpelecreate