Author mark.dickinson
Recipients MrJean1, amaury.forgeotdarc, loewis, mark.dickinson
Date 2008-11-22.18:41:18
It looks like your conjectures are right in both cases.

I tried adding a few lines to Modules/python.c to print out the argv 
entries as byte strings, before they're passed to mbstowcs.  Results
on OS X 10.5:

> 1. Somebody runs " ภาษาไทย" in a window. Most likely,
> the terminal encoding is applied, which we should assume to be UTF-8
> (although it might be different on some systems).

Yes, it appears that the terminal encoding is applied, if I'm reading 
the results right.  Trying

./python.exe é

with the terminal character encoding set to "Unicode (UTF-8)", Python 
receives the third argument as bytes([195, 169]).  With the terminal 
encoding set to "Western (ISO Latin 1)" instead, Python receives

> 2. Somebody creates a file japanese_コンテンツ in the finder, then uses
> shell completion to pass this to a Python script. Here I expect that
> UTF-8 is used even if the terminal's encoding is not UTF-8.

Yes.  Python seems to receive the same string regardless of terminal 
encoding.  (With the terminal encoding set to latin1, the tab-completed 
filename looks like garbage within Terminal, of course.)
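
For anyone who wants to check the byte values above without patching python.c, here is a minimal Python 3 sketch. It only models the encoding step (text to bytes under a given terminal encoding); it does not reproduce what Terminal or the shell actually do, and the variable names are made up for illustration. It assumes the é typed at the prompt is the precomposed character U+00E9.

```python
# Illustrative sketch only: models "terminal encoding applied to argv".
ch = "\u00e9"  # é, assumed to be the precomposed code point U+00E9

utf8_bytes = ch.encode("utf-8")      # terminal set to "Unicode (UTF-8)"
latin1_bytes = ch.encode("latin-1")  # terminal set to "Western (ISO Latin 1)"

print(list(utf8_bytes))    # [195, 169]
print(list(latin1_bytes))  # [233]

# Case 2: a filename whose bytes are UTF-8, displayed by a terminal
# that interprets those bytes as Latin-1.
name = "japanese_コンテンツ"
raw = name.encode("utf-8")       # the bytes the program would receive
garbled = raw.decode("latin-1")  # what a Latin-1 display would make of them

print(repr(garbled))
# The display is wrong, but the underlying bytes are unchanged.
assert garbled.encode("latin-1") == raw
```

The last assertion is the point of case 2: the terminal setting changes how the completed filename is rendered, not the bytes that are handed to the program.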

Date                 User            Action  Args
2008-11-22 18:41:21  mark.dickinson  set     recipients: + mark.dickinson, loewis, amaury.forgeotdarc, MrJean1
2008-11-22 18:41:21  mark.dickinson  set     messageid: <>
2008-11-22 18:41:20  mark.dickinson  link    issue4388 messages
2008-11-22 18:41:18  mark.dickinson  create