
Author lemburg
Recipients Arfrever, amaury.forgeotdarc, brett.cannon, lemburg, loewis, pitrou, vstinner
Date 2010-09-29.17:39:02
Message-id <4CA379B3.8010300@egenix.com>
In-reply-to <201009291417.59634.victor.stinner@haypocalc.com>
Content
STINNER Victor wrote:
> 
> STINNER Victor <victor.stinner@haypocalc.com> added the comment:
> 
> On Wednesday 29 September 2010 13:45:15, you wrote:
>> Marc-Andre Lemburg <mal@egenix.com> added the comment:
>>
>> STINNER Victor wrote:
>>> STINNER Victor <victor.stinner@haypocalc.com> added the comment:
>>>
>>> Forget my previous message, I forgot important points.
>>>
>>>> So the only reasons why you have to go through
>>>> all those hoops are to
>>>>
>>>> * allow the complete set of Python-supported encoding
>>>>   names for PYTHONFSENCODING
>>>>
>>>> * make sure that Py_FileSystemDefaultEncoding is set
>>>>   to the actual name of the codec as used by the system
>>>
>>> Not only. As I wrote in my first message (msg114191), there are two
>>> other good reasons to keep the current code but redecode filenames:
>>>
>>>  * Encoding aliases: the locale encoding is not always written as the
>>>    official Python encoding name, e.g. utf8 vs UTF-8, iso8859-1 vs
>>>    latin_1, etc. We have to be able to load Lib/encodings/aliases.py
>>>    to get the Python codec.
>>>
>>>  * Codecs implemented in Python: only the ascii, latin1, utf8 and mbcs
>>>    codecs are builtin. All other encodings are implemented in Python. If
>>>    your filesystem encoding is ShiftJIS, you have to load
>>>    Lib/encodings/shift_jis.py to load the codec.
>>>
>>> For these two reasons, we have to import Python modules before being
>>> able to set the filesystem encoding. So we have to redecode filenames
>>> after setting the filesystem encoding.
>>
>> No, that's not needed! Please see my earlier message: you can still
>> do all this at a later time during startup and double-check that
>> the encoding is indeed valid.
> 
> I don't understand how. E.g. if you set Py_FileSystemDefaultEncoding to
> "cp1252" before loading the first module, importing a module will have to load
> the codec. Loading the codec requires importing a module. But how can you open
> the cp1252 module when you are unable to encode paths to the filesystem
> encoding (because the cp1252 codec is not available yet)?

Ah, sorry, I forgot about that important circular reference :-)

You're right: there's no way to guarantee that file and path
decoding will work without first setting the file system encoding
to one of the builtin codec names (latin-1 would be a good choice).
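
As a rough illustration of why latin-1 is a convenient bootstrap choice: it
maps every byte to exactly one code point, so a name decoded with it early on
can be re-encoded losslessly and decoded again once the real codec can be
imported. The sketch below only models that idea in pure Python. The names
BOOTSTRAP_ENCODING, bootstrap_decode and redecode are made up for this example
and are not part of CPython's startup code.

```python
import codecs

# Assumption for this sketch: latin-1 is the temporary filesystem encoding.
BOOTSTRAP_ENCODING = "latin-1"


def bootstrap_decode(raw):
    """Decode raw filename bytes before the real codec is available.

    latin-1 never fails here, because every byte value 0-255 is valid.
    """
    return raw.decode(BOOTSTRAP_ENCODING)


def redecode(name, real_encoding):
    """Redecode a bootstrap-decoded name once the real codec can be loaded."""
    # Raises LookupError if the codec (builtin or Python module) is missing.
    codecs.lookup(real_encoding)
    # Encoding back to latin-1 restores the original bytes exactly.
    return name.encode(BOOTSTRAP_ENCODING).decode(real_encoding)


raw_name = "caf\u00e9.py".encode("cp1252")   # bytes as stored on disk
early_name = bootstrap_decode(raw_name)      # usable before cp1252 is loaded
final_name = redecode(early_name, "cp1252")  # correct once cp1252 is loaded
```

The same round trip works for a multi-byte codec such as shift_jis; the early
name may look wrong, but no bytes are lost.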

The other option would be to import everything using relative
paths (since Python itself only uses ASCII path names to the modules)
until the codec is loaded, and then turn these relative paths into
absolute ones once the codec has been loaded successfully.
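
A minimal sketch of that second option, assuming the installation prefix is
only available as bytes and the module paths are kept relative to it until the
codec is ready. The function name absolutize and the example paths are
hypothetical and only serve the illustration.

```python
import os


def absolutize(relative_paths, prefix_bytes, encoding):
    """Join ASCII-only relative module paths onto the decoded prefix.

    Call this only after the codec for `encoding` has been loaded.
    """
    prefix = prefix_bytes.decode(encoding)
    return [os.path.join(prefix, rel) for rel in relative_paths]


# Relative paths are pure ASCII, so they need no filesystem codec at all.
pending = ["encodings/__init__.py", "encodings/cp1252.py"]

# Hypothetical install prefix containing a non-ASCII character.
prefix_bytes = "/opt/caf\u00e9/lib".encode("cp1252")

absolute = absolutize(pending, prefix_bytes, "cp1252")
```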

A third option is the one you mentioned earlier on: we simply
don't allow Python to be installed on paths that are not
decodable using one of the builtin codecs.
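
A check along those lines could look roughly like the sketch below. It is an
assumption-laden model, not an existing CPython check: the helper name
first_builtin_codec is invented here, mbcs is left out because it only exists
on Windows, and latin-1 is left out because it accepts every byte sequence and
would make the test meaningless.

```python
# Assumption: only the strict builtin codecs are useful for this check.
STRICT_BUILTIN_CODECS = ("ascii", "utf-8")


def first_builtin_codec(path_bytes):
    """Return the first strict builtin codec that decodes the path, or None."""
    for encoding in STRICT_BUILTIN_CODECS:
        try:
            path_bytes.decode(encoding)
        except UnicodeDecodeError:
            continue
        return encoding
    return None


ascii_result = first_builtin_codec(b"/usr/lib/python")
utf8_result = first_builtin_codec("/opt/caf\u00e9".encode("utf-8"))
rejected = first_builtin_codec("/opt/caf\u00e9".encode("cp1252"))
```

A path for which the helper returns None would be one where installation is
refused under this option.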

>> If the decoding fails during startup due to a wrong encoding of
>> file or path names, ...
> 
> It is not the problem described in my previous message. How do you load
> non-builtin codecs?
> 
> Can you write a patch implementing your ideas? I tried to write such a patch
> (set Py_FileSystemDefaultEncoding before loading the first module), but it
> doesn't work for different reasons (all described in this issue). Maybe I
> misunderstood your proposal.

No, I wasn't thinking of the situation where you want to use a
codec that requires a Python module.
History
Date User Action Args
2010-09-29 17:39:05 lemburg set recipients: + lemburg, loewis, brett.cannon, amaury.forgeotdarc, pitrou, vstinner, Arfrever
2010-09-29 17:39:03 lemburg link issue9630 messages
2010-09-29 17:39:03 lemburg create