Message 129208 - Python tracker


Author	belopolsky
Recipients	belopolsky, georg.brandl, jcea, lemburg, mark.dickinson, ncoghlan, pitrou
Date	2011-02-23.15:35:24
Message-id	<AANLkTimc0=6QtqnKXxmp5hZqVxj5H2+qMfWN_kdjnru5@mail.gmail.com>
In-reply-to	<1298474556.3710.54.camel@localhost.localdomain>

Content
On Wed, Feb 23, 2011 at 10:22 AM, Antoine Pitrou <report@bugs.python.org> wrote:
..
> Well, a theoretical argument could be made that some codec could return
> a non-empty string when asked to decode an empty bytestring, but I'm not
> sure it has much practical worth :)

I was thinking about that as well.  Note that the opposite is quite
common: for example, any encoding that uses a BOM will turn an empty
unicode string into a non-empty byte string.  I don't think a codec
that decodes b'' into a non-empty string exists, but it would be
reasonable for a codec that requires a BOM or some other metadata to
raise an error on b''.  If we rely on decode(b'') == '', these errors
will go unnoticed.  I am going to prepare an Unpickler_Read() patch
with tests and play with it a little.  It is a good idea to separate
performance optimizations from bug fixes anyway.  If we want to
bypass decode on empty strings, we can do it independently.
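
To make the asymmetry concrete, here is a rough sketch.  The first
part uses only stdlib codecs.  The two functions after it,
strict_bom_decode() and decode_with_shortcut(), are hypothetical names
made up for illustration; they are not existing codecs and not the
pickle code under discussion.  They only show how a shortcut that
assumes decode(b'') == '' could hide an error from a stricter decoder.

```python
import codecs

# Encoding an empty str with a BOM-writing codec yields non-empty bytes.
utf8_sig_empty = "".encode("utf-8-sig")
utf16_empty = "".encode("utf-16")  # BOM is in the platform's byte order

# The stdlib decoders for these encodings accept b"" and return "".
decoded_empty = b"".decode("utf-16")


def strict_bom_decode(data: bytes) -> str:
    """Hypothetical decoder that insists on a UTF-8 BOM (not a stdlib codec)."""
    if not data.startswith(codecs.BOM_UTF8):
        raise ValueError("missing BOM")
    return data[len(codecs.BOM_UTF8):].decode("utf-8")


def decode_with_shortcut(data: bytes) -> str:
    """Hypothetical caller that skips the decoder for empty input."""
    if not data:
        # Assumes decode(b"") == "", so the strict decoder never runs.
        return ""
    return strict_bom_decode(data)
```

With these definitions, strict_bom_decode(b'') raises ValueError, while
decode_with_shortcut(b'') quietly returns ''.  That is the kind of
error that would go unnoticed.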

History
Date	User	Action	Args
2011-02-23 15:35:25	belopolsky	set	recipients: + belopolsky, lemburg, georg.brandl, jcea, mark.dickinson, ncoghlan, pitrou
2011-02-23 15:35:24	belopolsky	link	issue11286 messages
2011-02-23 15:35:24	belopolsky	create