classification
Title: io.BufferedReader:peek() closes underlying file, breaking peek/read expectations
Type: Stage: patch review
Components: IO Versions: Python 3.9
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: AngstyDuck, liquidpele
Priority: normal Keywords: patch

Created on 2021-07-20 20:42 by liquidpele, last changed 2021-09-19 17:09 by AngstyDuck.

Pull Requests
URL Status Linked Edit
PR 28457 open AngstyDuck, 2021-09-19 17:09
Messages (1)
msg397907 - Author: liquidpele (liquidpele) Date: 2021-07-20 20:42
reecepeg@3c22fb7d6b37 Pulpmill % python3 --version
Python 3.9.1


When buffering a small stream, calling peek() can read in the entire underlying stream and close it, so a following read() raises an error about the file already being closed, even though peek() should not have that effect.

Reproducible Steps:

>>> import requests
>>> from io import BufferedReader
>>> r = BufferedReader(requests.get("https://google.com", stream=True).raw)
>>> r.peek(2)[:2]
b'\x1f\x8b'
>>> r.peek()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: peek of closed file
>>> r.read()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: read of closed file


However, in the case of a larger stream it appears to work since the underlying stream isn't closed yet:

>>> r = BufferedReader(requests.get("https://amazon.com", stream=True).raw)
>>> r.peek(2)[:2]
b'\x1f\x8b'
>>> r.peek(2)[:2]
b'\x1f\x8b'


This seems inconsistent at best. As far as I can tell, the issue is at the line linked below, which needs to take the current buffer offset into account.
https://github.com/python/cpython/blob/main/Modules/_io/bufferedio.c#L845
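
A caller-side workaround is conceivable in the meantime: keep a private lookahead buffer and only consult the raw stream's closed state when more bytes are actually needed. The following is a minimal sketch under that assumption, not a proposed patch; `PeekableReader` and `SelfClosingRaw` are hypothetical names, and the raw object is only assumed to expose `read(n)` and `closed`.

```python
import io


class SelfClosingRaw:
    """Hypothetical stream that marks itself closed once fully read."""

    def __init__(self, data):
        self._inner = io.BytesIO(data)
        self._size = len(data)
        self.closed = False

    def read(self, n=-1):
        if self.closed:
            raise ValueError("read of closed file")
        data = self._inner.read(n)
        if self._inner.tell() >= self._size:
            self.closed = True
        return data


class PeekableReader:
    """Hypothetical wrapper: serves buffered bytes even after raw closes."""

    def __init__(self, raw):
        self._raw = raw
        self._buf = b""

    def _fill(self, n):
        # Touch the raw stream only when the lookahead buffer is too short.
        while len(self._buf) < n and not self._raw.closed:
            chunk = self._raw.read(n - len(self._buf))
            if not chunk:
                break
            self._buf += chunk

    def peek(self, n=1):
        self._fill(n)
        return self._buf[:n]

    def read(self, n=-1):
        if n is None or n < 0:
            data, self._buf = self._buf, b""
            if not self._raw.closed:
                data += self._raw.read()
            return data
        self._fill(n)
        data, self._buf = self._buf[:n], self._buf[n:]
        return data


r = PeekableReader(SelfClosingRaw(b"\x1f\x8b"))
print(r.peek(2))
print(r.peek(2))
print(r.read())
```

The design choice here is that the closed check sits in `_fill()` rather than at the top of peek() and read(), so bytes already buffered remain available after the underlying stream has closed.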
History
Date User Action Args
2021-09-19 17:09:42 AngstyDuck set keywords: + patch
nosy: + AngstyDuck

pull_requests: + pull_request26859
stage: patch review
2021-07-20 20:42:34 liquidpele create