Python3: use ASCII for the file system encoding on initfsencoding() failure #52971

vstinner · 2010-05-15T12:39:06Z

BPO	8725
Nosy	@malemburg, @loewis, @pitrou, @vstinner
Dependencies	bpo-8611: Python3 doesn't support locale different than utf8 and an non-ASCII path (POSIX) bpo-8715: Create PyUnicode_EncodeFSDefault() function
Files	fsencoding_ascii-2.patch

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

assignee = None
closed_at = <Date 2010-10-19.23:55:25.879>
created_at = <Date 2010-05-15.12:39:05.650>
labels = ['interpreter-core', 'invalid', 'expert-unicode']
title = 'Python3: use ASCII for the file system encoding on initfsencoding() failure'
updated_at = <Date 2010-10-19.23:55:25.878>
user = 'https://github.com/vstinner'

bugs.python.org fields:

activity = <Date 2010-10-19.23:55:25.878>
actor = 'vstinner'
assignee = 'none'
closed = True
closed_date = <Date 2010-10-19.23:55:25.879>
closer = 'vstinner'
components = ['Interpreter Core', 'Unicode']
creation = <Date 2010-05-15.12:39:05.650>
creator = 'vstinner'
dependencies = ['8611', '8715']
files = ['17357']
hgrepos = []
issue_num = 8725
keywords = ['patch']
message_count = 5.0
messages = ['105804', '105820', '105842', '111758', '119180']
nosy_count = 5.0
nosy_names = ['lemburg', 'loewis', 'pitrou', 'vstinner', 'Arfrever']
pr_nums = []
priority = 'normal'
resolution = 'not a bug'
stage = None
status = 'closed'
superseder = None
type = None
url = 'https://bugs.python.org/issue8725'
versions = ['Python 3.2']

vstinner · 2010-05-15T12:39:04Z

I introduced initfsencoding() in bpo-8610 to ensure that Py_FileSystemEncoding is never NULL. In the discussion, Marc Lemburg noticed that falling back to UTF-8 when nl_langinfo(CODESET) fails is a bad idea: ASCII is better (I agree).

We cannot fall back to ASCII yet because there are two other problems that have to be fixed before that:

Python3 doesn't support surrogates in module filenames: see bpo-8611
If Py_FileSystemEncoding is NULL, the encoding functions fall back to UTF-8 (PyUnicode_GetDefaultEncoding()). bpo-8715 proposes a new PyUnicode_EncodeFSDefault() function to fix this problem.

Attached patch is a partial fix for this issue.
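The fallback proposed above can be sketched in Python rather than in the C of initfsencoding(). This is only an illustration of the idea, not the attached patch: `choose_fs_encoding` and its `get_codeset` hook are hypothetical names, and the hook stands in for nl_langinfo(CODESET).

```python
import codecs
import locale


def choose_fs_encoding(get_codeset=None, fallback="ascii"):
    """Return a normalized codec name, or `fallback` if the codeset is unusable.

    `get_codeset` is a hypothetical hook standing in for nl_langinfo(CODESET).
    """
    if get_codeset is None:
        # locale.nl_langinfo() is not available on every platform,
        # in which case the AttributeError below triggers the fallback.
        def get_codeset():
            return locale.nl_langinfo(locale.CODESET)
    try:
        name = get_codeset()
        # codecs.lookup() raises LookupError for unknown or empty names.
        return codecs.lookup(name).name
    except (AttributeError, LookupError, TypeError, ValueError):
        return fallback


if __name__ == "__main__":
    print(choose_fs_encoding())                           # depends on the locale
    print(choose_fs_encoding(lambda: "no-such-codeset"))  # takes the fallback path
```

Injecting the codeset lookup as a parameter makes the failure path testable without changing the process locale.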

vstinner · 2010-05-15T16:34:20Z

PyUnicode_AsEncodedString() contains a special path for the file system encoding. I don't think that it is still needed, but I don't know how to check that. => read msg105810

vstinner · 2010-05-16T01:13:05Z

Version 2:

bpo-8715 has been committed: patch PyUnicode_EncodeFSDefault()
fix the documentation according to the changes

vstinner · 2010-07-28T01:29:23Z

I tried the patch on my import_unicode branch and it doesn't work if the locale encoding is not ASCII (as the current code doesn't work if the locale encoding is not UTF-8, bpo-8611).

If Py_FileSystemUnicodeEncoding is NULL: PyUnicode_EncodeFSDefault() should use wcstombs() and PyUnicode_DecodeFSDefault() should use mbstowcs(). They may reuse _Py_wchar2char() and _Py_char2wchar().

"ascii" should be used in initfsencoding().
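A rough Python analogue of the behaviour suggested in this comment: when no file system encoding is set, convert through the locale encoding instead of a hard-coded codec. The helper names below are hypothetical and are not the C API functions. Using the `surrogateescape` error handler (PEP 383) is an assumption of this sketch, not something stated in the comment.

```python
import locale


def _effective_encoding(fs_encoding):
    # None models "file system encoding not set": use the locale encoding.
    return fs_encoding or locale.getpreferredencoding(False)


def encode_fs_default(text, fs_encoding=None):
    """Hypothetical helper: str -> bytes for file names."""
    return text.encode(_effective_encoding(fs_encoding), "surrogateescape")


def decode_fs_default(data, fs_encoding=None):
    """Hypothetical helper: bytes -> str for file names."""
    return data.decode(_effective_encoding(fs_encoding), "surrogateescape")


if __name__ == "__main__":
    raw = b"caf\xe9.py"  # illustrative Latin-1 file name, not from the issue
    # With an ASCII file system encoding, byte 0xE9 becomes the lone surrogate U+DCE9.
    name = decode_fs_default(raw, "ascii")
    print(ascii(name))
    # Encoding with the same codec and handler restores the original bytes.
    print(encode_fs_default(name, "ascii") == raw)
```

The round trip only works when encoding and decoding use the same codec. A name decoded under one encoding and encoded under another either fails or produces different bytes.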

vstinner · 2010-10-19T23:55:26Z

initfsencoding() now raises a fatal error when get_codeset() fails. Using an encoding different from the locale encoding when get_codeset() fails only leads to mojibake and encoding issues; it's not a good idea. Closing this issue as invalid.
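A small illustration of the mojibake mentioned above, as a sketch: the helper name `reinterpret` and the sample string are made up for this example. It encodes text with one codec and decodes the bytes with another, which is what happens when the file system encoding and the locale encoding disagree.

```python
def reinterpret(text, written_as, read_as):
    """Encode `text` with one codec and decode the bytes with another."""
    return text.encode(written_as).decode(read_as, "replace")


if __name__ == "__main__":
    # Same codec on both sides: the name survives.
    print(reinterpret("caf\u00e9", "utf-8", "utf-8"))
    # Written as UTF-8, read back as Latin-1: the two bytes of U+00E9
    # are decoded as two separate characters.
    print(reinterpret("caf\u00e9", "utf-8", "latin-1"))
```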

vstinner added the interpreter-core (Objects, Python, Grammar, and Parser dirs) and topic-unicode labels May 15, 2010

vstinner closed this as completed Oct 19, 2010

vstinner added the invalid label Oct 19, 2010

ezio-melotti transferred this issue from another repository Apr 10, 2022
