Author lemburg
Recipients amaury.forgeotdarc, doerwalter, eric.smith, ezio.melotti, flox, lemburg, vstinner
Date 2010-02-25.16:43:15
Message-id <4B86A8A2.40202@egenix.com>
In-reply-to <1267109181.02.0.684765597255.issue7649@psf.upfronthosting.co.za>
Content
Ezio Melotti wrote:
> 
> Ezio Melotti <ezio.melotti@gmail.com> added the comment:
> 
> The latest patch (issue7649v4.diff) checks if the char is ASCII or non-ASCII and then, if the char is ASCII, it converts it directly to Unicode, otherwise it tries to decode it using the default encoding, raising a UnicodeDecodeError if the decoding fails.

Thanks. The patch looks good now... but doesn't apply cleanly anymore,
since your first version has already made it into trunk and the 2.6 branch.
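
For readers following along, the behaviour described in the quoted comment can be modelled roughly as below. This is only a sketch, not the code from issue7649v4.diff: the real change is in the C implementation, the name format_char is hypothetical, and Python 3 bytes stand in here for a one-character 2.x str.

```python
def format_char(byte_char, default_encoding="ascii"):
    """Rough model of '%c' applied to a one-character byte string.

    Hypothetical helper for illustration only; the actual patch
    implements this logic in C.
    """
    if len(byte_char) != 1:
        raise TypeError("%c requires a single character")
    code = byte_char[0]
    if code < 128:
        # ASCII: convert directly, no codec lookup needed.
        return chr(code)
    # Non-ASCII: decode with the default encoding; this raises
    # UnicodeDecodeError if the byte is not valid in that encoding.
    return byte_char.decode(default_encoding)


print(format_char(b"a"))
print(repr(format_char(b"\xe9", "iso-8859-1")))
try:
    format_char(b"\xe9")
except UnicodeDecodeError as exc:
    print("decoding failed:", exc.reason)
```

With the default encoding left at ASCII, the non-ASCII case always fails, which is the situation the existing tests assume.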

> I tested it setting iso-8859-1 and utf-8 as default encoding and the behavior was consistent with "%s", however the tests assume that the default encoding is always ASCII, so they failed (both the tests included in the patch and others in test_unicode). I'm not sure if in this case they should be changed/skipped or not.

I think that's fine. While 2.x does still allow setting the default
encoding to something other than ASCII, we don't support such tricks,
so there's no need to test for them.

> (Also http://docs.python.org/c-api/unicode.html#built-in-codecs says that "Setting encoding to NULL causes the default encoding to be used which is ASCII.", but this is not always true. If you think it should be fixed I'll do it in a separate commit.)

The last part of that sentence ("which is ASCII") should be removed.
History
Date User Action Args
2010-02-25 16:43:17  lemburg  set  recipients: + lemburg, doerwalter, amaury.forgeotdarc, vstinner, eric.smith, ezio.melotti, flox
2010-02-25 16:43:15  lemburg  link  issue7649 messages
2010-02-25 16:43:15  lemburg  create