Python3: use ASCII for the file system encoding on initfsencoding() failure #52971
Comments
I introduced initfsencoding() in bpo-8610 to ensure that Py_FileSystemEncoding is never NULL. In the discussion, Marc Lemburg noticed that falling back to UTF-8 on an nl_langinfo(CODESET) error is a bad idea: ASCII is better (I agree). We cannot fall back to ASCII yet because two other problems have to be fixed first:
The attached patch is a partial fix for this issue.
PyUnicode_AsEncodedString() contains a special path for the file system encoding. I don't think that it is still needed, but I don't know how to check that. => read msg105810
Version 2:
I tried the patch on my import_unicode branch and it doesn't work if the locale encoding is not ASCII (just as the current code doesn't work if the locale encoding is not UTF-8, bpo-8611). If Py_FileSystemUnicodeEncoding is NULL, PyUnicode_EncodeFSDefault() should use wcstombs() and PyUnicode_DecodeFSDefault() should use mbstowcs(). They may reuse _Py_wchar2char() and _Py_char2wchar(). "ascii" should be used in initfsencoding().
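For readers following along from the Python side, here is a minimal sketch of the encode/decode pair discussed above. It assumes a modern Python 3 where `os.fsencode()` and `os.fsdecode()` are available (they postdate this discussion) and uses them only as stand-ins for the C-level PyUnicode_EncodeFSDefault() and PyUnicode_DecodeFSDefault(); the `roundtrip` helper is a hypothetical name introduced for illustration.

```python
import os
import sys


def roundtrip(name: str) -> str:
    """Encode a file name to bytes, then decode it back.

    Both steps use the interpreter's file system encoding, so the
    result equals the input whenever that encoding can represent it.
    """
    raw = os.fsencode(name)   # str -> bytes (the "encode FS default" direction)
    return os.fsdecode(raw)   # bytes -> str (the "decode FS default" direction)


# The encoding actually in use depends on the platform and locale.
print("file system encoding:", sys.getfilesystemencoding())
print("round trip:", roundtrip("data.txt"))
```

A pure-ASCII name such as `data.txt` round-trips under any ASCII-compatible file system encoding, which is why ASCII was considered the safe fallback.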
initfsencoding() now raises a fatal error if get_codeset() fails. Using an encoding different from the locale encoding when get_codeset() fails only leads to mojibake and encoding issues, so it's not a good idea. Closing this issue as invalid.
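The mojibake concern can be reproduced in a few lines of Python. This is only an illustrative sketch: the file name, the choice of UTF-8 and Latin-1 as the "real" and "wrong" encodings, and the `decode_with` helper are assumptions made for the example, not part of the patch.

```python
def decode_with(raw: bytes, encoding: str) -> str:
    """Decode file name bytes with an explicitly chosen encoding."""
    return raw.decode(encoding)


name = "café.txt"
raw = name.encode("utf-8")      # bytes as a UTF-8 locale would store them

# Decoding with the matching encoding restores the original name.
correct = decode_with(raw, "utf-8")

# Decoding with a different encoding silently produces mojibake.
garbled = decode_with(raw, "latin-1")

# A strict ASCII decode refuses the non-ASCII bytes instead of guessing.
try:
    decode_with(raw, "ascii")
    ascii_rejected = False
except UnicodeDecodeError:
    ascii_rejected = True

print(correct, garbled, ascii_rejected)
```

The mismatch case does not raise an error, so the wrong name can propagate unnoticed; failing early avoids that.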
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.