Author ezio.melotti
Recipients amaury.forgeotdarc, doerwalter, eric.smith, ezio.melotti, flox, lemburg, vstinner
Date 2010-02-25.14:46:17
Message-id <1267109181.02.0.684765597255.issue7649@psf.upfronthosting.co.za>
In-reply-to
Content
The latest patch (issue7649v4.diff) checks whether the char is ASCII or non-ASCII. If the char is ASCII, it is converted directly to Unicode; otherwise the patch tries to decode it using the default encoding and raises a UnicodeDecodeError if the decoding fails.
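
The logic described above can be modelled roughly in Python. This is only a sketch of the behaviour, not the C code from the patch; the function name `percent_c_to_unicode` and the `default_encoding` parameter are made up for illustration, and the default encoding is passed in explicitly so the different settings can be compared.

```python
def percent_c_to_unicode(ch: bytes, default_encoding: str = "ascii") -> str:
    """Hypothetical model of how a single byte char is turned into Unicode."""
    if len(ch) != 1:
        raise TypeError("%c requires a single char")
    if ch[0] < 0x80:
        # ASCII: convert directly, without going through a codec.
        return chr(ch[0])
    # Non-ASCII: decode with the default encoding.
    # A UnicodeDecodeError propagates to the caller if decoding fails.
    return ch.decode(default_encoding)
```

With this model, `percent_c_to_unicode(b"\xe9", "iso-8859-1")` succeeds, while the same byte with `"ascii"` or `"utf-8"` raises a UnicodeDecodeError.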

I tested it with iso-8859-1 and utf-8 set as the default encoding, and the behavior was consistent with "%s". However, the tests assume that the default encoding is always ASCII, so they failed (both the tests included in the patch and others in test_unicode). I'm not sure whether in this case they should be changed/skipped or not.
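
If skipping turns out to be the preferred option, one way to do it could look like the sketch below. This is not taken from the patch or from test_unicode; the class and test names are hypothetical, and the test body only stands in for the real "%c" checks.

```python
import io
import sys
import unittest

# Assumption: tests that rely on ASCII decoding failures are only
# meaningful when the default encoding really is ASCII.
DEFAULT_IS_ASCII = sys.getdefaultencoding() == "ascii"


class PercentCTest(unittest.TestCase):  # hypothetical test case
    @unittest.skipUnless(DEFAULT_IS_ASCII, "requires ASCII default encoding")
    def test_non_ascii_char_is_rejected(self):
        # Stand-in for the real check: a non-ASCII byte cannot be decoded
        # when the default encoding is ASCII.
        with self.assertRaises(UnicodeDecodeError):
            b"\xe9".decode(sys.getdefaultencoding())


def run_suite():
    """Run the hypothetical test case and return the TestResult."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(PercentCTest)
    runner = unittest.TextTestRunner(stream=io.StringIO(), verbosity=0)
    return runner.run(suite)
```

The test is reported as skipped rather than failed whenever the default encoding is something other than ASCII.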

(Also, http://docs.python.org/c-api/unicode.html#built-in-codecs says that "Setting encoding to NULL causes the default encoding to be used which is ASCII.", but this is not always true. If you think the docs should be fixed, I'll do it in a separate commit.)
History
Date                 User          Action  Args
2010-02-25 14:46:21  ezio.melotti  set     recipients: + ezio.melotti, lemburg, doerwalter, amaury.forgeotdarc, vstinner, eric.smith, flox
2010-02-25 14:46:21  ezio.melotti  set     messageid: <1267109181.02.0.684765597255.issue7649@psf.upfronthosting.co.za>
2010-02-25 14:46:19  ezio.melotti  link    issue7649 messages
2010-02-25 14:46:18  ezio.melotti  create