BufferedReader.read1(size) signature incompatible with BufferedIOBase.read1(size=-1) #67403

vadmium · 2015-01-10T01:46:23Z

BPO	23214
Nosy	@pitrou, @vstinner, @benjaminp, @vadmium, @serhiy-storchaka
PRs	[Do Not Merge] Convert `Misc/NEWS` so that it is managed by towncrier #552
Files	read1-arbitrary.patch

^{Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.}

Show more details

GitHub fields:

assignee = None
closed_at = <Date 2016-10-22.00:24:26.409>
created_at = <Date 2015-01-10.01:46:22.694>
labels = ['3.7', 'expert-IO']
title = 'BufferedReader.read1(size) signature incompatible with BufferedIOBase.read1(size=-1)'
updated_at = <Date 2017-03-31.16:36:34.664>
user = 'https://github.com/vadmium'

bugs.python.org fields:

activity = <Date 2017-03-31.16:36:34.664>
actor = 'dstufft'
assignee = 'none'
closed = True
closed_date = <Date 2016-10-22.00:24:26.409>
closer = 'martin.panter'
components = ['IO']
creation = <Date 2015-01-10.01:46:22.694>
creator = 'martin.panter'
dependencies = []
files = ['42213']
hgrepos = []
issue_num = 23214
keywords = ['patch']
message_count = 14.0
messages = ['233794', '233864', '261672', '261685', '261698', '261703', '262019', '279093', '279097', '279100', '279102', '279113', '279163', '279167']
nosy_count = 7.0
nosy_names = ['pitrou', 'vstinner', 'benjamin.peterson', 'stutzbach', 'python-dev', 'martin.panter', 'serhiy.storchaka']
pr_nums = ['552']
priority = 'normal'
resolution = 'fixed'
stage = 'resolved'
status = 'closed'
superseder = None
type = None
url = 'https://bugs.python.org/issue23214'
versions = ['Python 3.7']

The text was updated successfully, but these errors were encountered:

vadmium · 2015-01-10T01:46:21Z

I am trying to make LZMAFile (which implements BufferedIOBase) use a BufferedReader in read mode. However this broke test_lzma.FileTestCase.test_read1_multistream(), which calls read1() with the default size argument. This is because BufferedReader.read1() does not accept size=-1:

>>> stdin = open(0, "rb", closefd=False)
>>> stdin
<_io.BufferedReader name=0>
>>> stdin.read1()  # Parameter is mandatory
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: read1() takes exactly 1 argument (0 given)
>>> stdin.read1(-1)  # Does not accept the BufferedIOBase default
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: read length must be positive
>>> stdin.read1(0)  # Technically not positive
b''

Also, the BufferedIOBase documentation does not say what the size=-1 value means, only that it reads and returns up to -1 bytes.

vadmium · 2015-01-11T23:23:00Z

Looking at the test suite:

read1() of LZMAFile and GzipFile (both implementing BufferedIOBase) are asserted to return a non-zero result until EOF
LZMAFile.read1(0) is asserted to return an empty string
BufferedReader.read1(-1) is asserted to raise ValueError
There are also tests of read1() methods on HTTPResponse and ZipFile.open() objects, but those methods are undocumented

It seems the most consistent way forward would be to:

Define BufferedIOBase.read1(-1) to read and return an arbitrary number of bytes, more than zero unless none are available due to EOF or non-blocking mode. Maybe suggest that it would return the current buffered data or try to read a full buffer of data (with one call) and return it if applicable.
Change the signature to BufferedReader.read1(size=-1) and implement the size=-1 behaviour

vadmium · 2016-03-13T06:02:01Z

Looking at this again, I think a less intrusive way forward would be to:

Document that in 3.6, the required signature is now BufferedIOBase.read1(size). An implementation no longer has to provide a default size, and no longer has to accept negative sizes.
Explicitly document the behaviour of each concrete implementation like GzipFile.read1(-1) etc, if this behaviour is intentional
Fix the BufferedReader error so that “read length must not be negative”

Relaxing the read1() signature would allow wider or easier use of BufferedReader, e.g. to implement HTTPResponse as I suggested in bpo-26499. The advantage would be using existing code that is well tested, used, optimized, etc, rather than a custom BufferedIOBase implementation which for the HTTP case is buggy.

serhiy-storchaka · 2016-03-13T07:44:24Z

Shouldn't read1(-1) be the same as read1(sys.maxsize)? I.e. read from a buffer without limit.

vadmium · 2016-03-13T12:01:16Z

Calling BufferedReader.read1(sys.maxsize) gives me a MemoryError. Making read1(-1) equivalent to read1(sys.maxsize) only makes sense where the return value already has a predetermined size, and only a limited buffer needs to be allocated.

Another interpretation is to return an arbitrary, modest buffer size. This is what I ended up doing with LZMAFile.read1() in bpo-23529: return no more than 8 KiB. It is not equivalent to sys.maxsize because more than 8 KiB is possible if you ask for it. HTTPResponse (for non-chunked responses) is similar, but uses a default of 16 KiB.

pitrou · 2016-03-13T18:38:22Z

Define BufferedIOBase.read1(-1) to read and return an arbitrary number
of bytes, more than zero unless none are available due to EOF or
non-blocking mode. Maybe suggest that it would return the current
buffered data or try to read a full buffer of data (with one call) and
return it if applicable.

This sounds reasonable to me.

vadmium · 2016-03-19T06:32:04Z

Okay here is a patch implementing read1(-1) in BufferedReader and BytesIO (my original proposal). Other changes:

Changed read1(size=-1) → read1([size]), because BufferedReader and BytesIO do not accept keyword arguments (see also bpo-23738)
Defined size=-1 to mean an arbitrary non-zero size
Change BufferedReader.read1() to return a buffer of data
Change BytesIO.read1() to read until EOF
Add tests to complement existing read1(size) tests for BufferedReader (includes BufferedRandom), BufferedRWPair, and BytesIO

python-dev · 2016-10-21T00:17:06Z

New changeset b6886ac88e28 by Martin Panter in branch 'default':
Issue bpo-23214: Implement optional BufferedReader, BytesIO read1() argument
https://hg.python.org/cpython/rev/b6886ac88e28

python-dev · 2016-10-21T02:00:18Z

New changeset d4fce64b1c65 by Martin Panter in branch 'default':
Issue bpo-23214: Remove BufferedReader.read1(-1) workaround
https://hg.python.org/cpython/rev/d4fce64b1c65

vstinner · 2016-10-21T02:28:15Z

.. method:: read1(size=-1)
+ .. method:: read1([size])

I don't understand this change: the default size is documented as -1. Did I miss something?

+ If *size* is −1 (the default)

Is the "−" character deliberate in "−1"? I would expected -1.

vadmium · 2016-10-21T03:17:35Z

The minus sign might have been deliberate at some stage, but I realize it is more accessible etc if it was ASCII -1, so I will change that.

The idea of changing the signature is to avoid people thinking it accepts a keyword argument. See e.g. bpo-25030 for seek(offset, whence=SEEK_SET), bpo-14586 for truncate(size=None).

vstinner · 2016-10-21T08:20:23Z

2016-10-21 5:17 GMT+02:00 Martin Panter <report@bugs.python.org>:

The idea of changing the signature is to avoid people thinking it accepts a keyword argument. See e.g. bpo-25030 for seek(offset, whence=SEEK_SET), bpo-14586 for truncate(size=None).

Ah. Maybe we should modify the code to accept a keyword argument? :-)

Or we might use the strange "read1(\, size=1)" syntax of Argument Clinic.

vadmium · 2016-10-21T21:51:35Z

Modifying the API to accept a keyword argument is not worthwhile IMO. That means modifying modules like http.client and zipfile (which use the wrong keyword name), and user-defined implementations may become incompatible.

The question of documentation is discussed more in bpo-23738. I think one or two people were concerned about using the Arg Clinic slash.

python-dev · 2016-10-21T23:00:52Z

New changeset 2af6d94de492 by Martin Panter in branch 'default':
Issue bpo-23214: Fix formatting of -1
https://hg.python.org/cpython/rev/2af6d94de492

vadmium added the topic-IO label Jan 10, 2015

vadmium added the 3.7 (EOL) end of life label Oct 21, 2016

vadmium closed this as completed Oct 22, 2016

ezio-melotti transferred this issue from another repository Apr 10, 2022

smason mentioned this issue Nov 14, 2023

Support read1() method in HTTPResponse urllib3/urllib3#3186

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

BufferedReader.read1(size) signature incompatible with BufferedIOBase.read1(size=-1) #67403

BufferedReader.read1(size) signature incompatible with BufferedIOBase.read1(size=-1) #67403

vadmium commented Jan 10, 2015

vadmium commented Jan 10, 2015

vadmium commented Jan 11, 2015

vadmium commented Mar 13, 2016

serhiy-storchaka commented Mar 13, 2016

vadmium commented Mar 13, 2016

pitrou commented Mar 13, 2016

vadmium commented Mar 19, 2016

python-dev mannequin commented Oct 21, 2016

python-dev mannequin commented Oct 21, 2016

vstinner commented Oct 21, 2016

vadmium commented Oct 21, 2016

vstinner commented Oct 21, 2016

vadmium commented Oct 21, 2016

python-dev mannequin commented Oct 21, 2016

BufferedReader.read1(size) signature incompatible with BufferedIOBase.read1(size=-1) #67403

BufferedReader.read1(size) signature incompatible with BufferedIOBase.read1(size=-1) #67403

Comments

vadmium commented Jan 10, 2015

vadmium commented Jan 10, 2015

vadmium commented Jan 11, 2015

vadmium commented Mar 13, 2016

serhiy-storchaka commented Mar 13, 2016

vadmium commented Mar 13, 2016

pitrou commented Mar 13, 2016

vadmium commented Mar 19, 2016

python-dev mannequin commented Oct 21, 2016

python-dev mannequin commented Oct 21, 2016

vstinner commented Oct 21, 2016

vadmium commented Oct 21, 2016

vstinner commented Oct 21, 2016

vadmium commented Oct 21, 2016

python-dev mannequin commented Oct 21, 2016