
Author vstinner
Recipients Neui, SilentGhost, eryksun, ezio.melotti, jberg, ncoghlan, vstinner
Date 2021-03-13.13:19:29
Message-id <1615641569.4.0.36167470291.issue35883@roundup.psfhosted.org>
Content
> This shouldn't be an issue in 3.7, at least not with the default UTF-8 mode configuration. With this mode, Py_DecodeLocale calls _Py_DecodeUTF8Ex using the surrogateescape error handler [1].

Right, explicitly enabling the Python UTF-8 Mode works around the issue:
https://docs.python.org/dev/library/os.html#python-utf-8-mode


$ python3.10 -c 'import sys; print(ascii(sys.argv))' $'\U7fffbeba'
Fatal Python error: init_interp_main: failed to update the Python config
Python runtime state: core initialized
ValueError: character U+7fffbeba is not in range [U+0000; U+10ffff]

Current thread 0x00007effa1891740 (most recent call first):
<no Python frame>


$ python3.10 -X utf8 -c 'import sys; print(ascii(sys.argv))' $'\U7fffbeba'
['-c', '\udcfd\udcbf\udcbf\udcbb\udcba\udcba']
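
The escaped output above can be reproduced in pure Python. This is only a minimal sketch, not the code path Python takes at startup. It assumes the argument reached Python as the six bytes implied by the lone surrogates in that output (each \udcXX stands for the raw byte 0xXX), and the variable names are illustrative:

```python
# The six lone surrogates printed in UTF-8 Mode, copied from the output above.
escaped = "\udcfd\udcbf\udcbf\udcbb\udcba\udcba"

# surrogateescape maps each U+DCxx surrogate back to the raw byte 0xxx,
# which recovers the bytes assumed to have been passed on the command line.
raw = escaped.encode("utf-8", errors="surrogateescape")
print(raw)

# Decoding those bytes as UTF-8 with surrogateescape never fails: every
# undecodable byte becomes a lone surrogate, as in the -X utf8 run.
decoded = raw.decode("utf-8", errors="surrogateescape")
print(ascii(decoded))

# The code point itself is outside the Unicode range, so it cannot be
# represented as a str character at all.
try:
    chr(0x7FFFBEBA)
except ValueError as exc:
    print("ValueError:", exc)
```

The round trip is lossless, so code that receives such an argument can still recover the original bytes with os.fsencode() or an explicit encode using surrogateescape.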