
Author ronaldoussoren
Recipients ezio.melotti, piro, ronaldoussoren, vstinner
Date 2010-07-23.14:17:21
Message-id <1279894644.49.0.481918980196.issue9167@psf.upfronthosting.co.za>
In-reply-to
Content
Daniele: which version of OSX do you use? And if you use OSX 10.5 or 10.6, which is your system language according to System Preferences (the topmost entry in the list in the "Language and Text" preference pane, whose icon looks a little like a UN flag)?

I can only reproduce this by explicitly setting LANG=C before running the test on OSX 10.6 (with English as the main language).

This may be very hard to fix. What happens is that subprocess.Popen encodes the argument array using the filesystem encoding (which on OSX is always UTF-8). The argv decoder in the child process then decodes the arguments using the encoding specified in LANG, which on your system is different from UTF-8. This results in a string in which each byte of the UTF-8 encoding of the snowman is represented as a single character. Those characters are then encoded as UTF-8 by the test, and that results in the error you're seeing.

That is, the output looks like the output of this code:

>>> snowman = '\u2603'
>>> snowman.encode('utf-8').decode('latin1').encode('utf-8')
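
The round trip described above can be sketched as a small standalone script. This is only an illustration: the helper name `mojibake` is hypothetical, and latin-1 is used as the "wrong" encoding simply because the example above uses it; the encoding actually selected by LANG on a given system may be different.

```python
def mojibake(text, wrong_encoding="latin-1"):
    """Encode text as UTF-8, decode the bytes with the wrong encoding,
    then encode the resulting characters as UTF-8 again."""
    raw = text.encode("utf-8")               # what the parent process passes in argv
    misdecoded = raw.decode(wrong_encoding)  # assumed behaviour of the argv decoder
    return misdecoded, misdecoded.encode("utf-8")


snowman = "\u2603"
misdecoded, reencoded = mojibake(snowman)

print(len(misdecoded))  # 3: one character per byte of the UTF-8 form
print(reencoded)        # b'\xc3\xa2\xc2\x98\xc2\x83'

# With latin-1 the damage is reversible, because every byte maps to a character.
recovered = reencoded.decode("utf-8").encode("latin-1").decode("utf-8")
print(recovered == snowman)  # True
```

The last step works only because latin-1 maps all 256 byte values to characters; with a stricter encoding the wrong decode step could fail outright or lose information.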