
Author pitrou
Recipients Anthony.Kong, berker.peksag, catalin.iacob, eric.araujo, petri.lehtinen, pitrou, poq, r.david.murray
Date 2012-02-20.02:21:45
Message-id <1329704305.1819.3.camel@localhost.localdomain>
In-reply-to <1329704012.2.0.643319009488.issue13641@psf.upfronthosting.co.za>
Content
> OK, I'm back to being 100% on the side of rejecting both of these
> changes.  ASCII is not unicode, it is bytes.  You can decode it to
> unicode but it is not unicode.  Those transformations operate bytes to
> bytes, not bytes to unicode.

ASCII is just a subset of the unicode character set.
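
For illustration only (an interactive sketch, not part of either patch):
an ASCII-only str maps to exactly one bytes object, so treating it as
base64 input is unambiguous:

    >>> import base64
    >>> 'TWFu'.encode('ascii')
    b'TWFu'
    >>> base64.b64decode('TWFu'.encode('ascii'))
    b'Man'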

> We made the bytes/unicode separation to avoid the problem where you
> have a working program that unexpectedly gets non-ASCII input and
> blows up with a unicode error.

How is blowing up with a unicode error worse than blowing up with a
ValueError? Both indicate wrong input. At worst the code could catch
UnicodeError and re-raise it as ValueError, but I don't see the point.
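
For concreteness, a sketch of what that re-raising could look like,
using a hypothetical wrapper name (b64decode_str) that is not part of
the proposed change:

    import base64

    def b64decode_str(s):
        # Accept an ASCII-only str; report non-ASCII input as ValueError
        # instead of letting the UnicodeEncodeError propagate.
        try:
            data = s.encode('ascii')
        except UnicodeEncodeError as exc:
            raise ValueError('non-ASCII character in base64 input') from exc
        return base64.b64decode(data)

(And UnicodeError already derives from ValueError in CPython, so code
catching ValueError sees either exception anyway.)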

> The programmer should have to explicitly encode to ASCII if they are
> inadvisedly working with it in a string as part of a wire protocol
> (why else would they be using these transforms).

Inadvisedly? There are many situations where base64 data legitimately
ends up inside unicode strings.
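
One such situation, shown only to illustrate the workflow the current
bytes-only API imposes (assuming base64 text arriving through JSON,
which always deserializes to str):

    import base64, json

    payload = json.loads('{"data": "TWFu"}')  # JSON strings come back as str
    encoded = payload['data']                 # 'TWFu', a unicode string
    # Today the caller has to encode explicitly before decoding:
    decoded = base64.b64decode(encoded.encode('ascii'))
    print(decoded)                            # b'Man'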