Message 306705 - Python tracker

Author	lemburg
Recipients	lemburg, serhiy.storchaka
Date	2017-11-22.09:20:56
Message-id	<c0b50f25-ad09-6a23-05d7-16c1ecbdaa01@egenix.com>
In-reply-to	<1511336416.94.0.213398074469.issue32110@psf.upfronthosting.co.za>

Content

On 22.11.2017 08:40, Serhiy Storchaka wrote:
> Usually the read() method of a file-like object takes one optional argument which, if specified, limits the amount of data (the number of bytes or characters) returned.
> 
> codecs.StreamReader.read() also has such a parameter, but it is the second parameter. The first parameter limits the number of bytes read for decoding. read(1) can return 70 characters, which will confuse most callers that expect either a single character or an empty string (at the end of the stream).

That's not true. .read(1) will read at most 1 byte from the stream
and decode it. There's no way it will return 70 characters. It will
usually return fewer chars than the number of bytes read.

The reasoning here is the same as for .read() on regular byte
streams in Python 2.x: the first argument size tells the reader how
many bytes to read for decoding, since this is needed to properly
work together with .seek().

The optional second parameter chars was added as a convenience,
since the user may not know how many bytes need to be read in
order to decode a certain number of characters.
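
A minimal sketch of how the two parameters interact, assuming a UTF-8
stream held in an in-memory io.BytesIO object (the sample text and the
variable names are illustrative only, not taken from the patch). With
chars=1 the reader keeps fetching size-byte chunks until one full
character has been decoded:

```python
import codecs
import io

# Four characters, nine bytes in UTF-8 (2 + 2 + 2 + 3).
data = "äöü€".encode("utf-8")
reader = codecs.getreader("utf-8")(io.BytesIO(data))

# size=1: fetch one byte per underlying stream.read() call.
# chars=1: stop as soon as one complete character is available.
# The first byte alone is an incomplete sequence, so the reader
# buffers it and fetches a second byte before returning.
first = reader.read(size=1, chars=1)

# With no arguments, everything that remains is read and decoded.
rest = reader.read()

print(len(data), repr(first), repr(rest))
```

Passing chars explicitly avoids having to know how many bytes each
character occupies in the encoded stream.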

That said, I see in your patch that you want to bind chars
to size. That will work and also protect the user from the
unlikely case where the codec returns more chars than bytes
read.

History
Date	User	Action	Args
2017-11-22 09:20:56	lemburg	set	recipients: + lemburg, serhiy.storchaka
2017-11-22 09:20:56	lemburg	link	issue32110 messages
2017-11-22 09:20:56	lemburg	create