Author vstinner
Recipients Neui, SilentGhost, eryksun, ezio.melotti, jberg, ncoghlan, vstinner
Date 2021-03-13.13:19:29
> This shouldn't be an issue in 3.7, at least not with the default UTF-8 mode configuration. With this mode, Py_DecodeLocale calls _Py_DecodeUTF8Ex using the surrogateescape error handler [1].

Right, explicitly enabling the Python UTF-8 Mode works around the issue:

$ python3.10 -c 'import sys; print(ascii(sys.argv))' $'\U7fffbeba'
Fatal Python error: init_interp_main: failed to update the Python config
Python runtime state: core initialized
ValueError: character U+7fffbeba is not in range [U+0000; U+10ffff]

Current thread 0x00007effa1891740 (most recent call first):
<no Python frame>

$ python3.10 -X utf8 -c 'import sys; print(ascii(sys.argv))' $'\U7fffbeba'
['-c', '\udcfd\udcbf\udcbf\udcbb\udcba\udcba']
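
The escaped result can be reproduced without the command line. This is only a sketch: it assumes the shell passed the six bytes FD BF BF BB BA BA, which is inferred from the surrogates U+DCFD ... U+DCBA in the output above rather than taken from the interpreter's startup code. The names raw, decoded, strict_ok and roundtrip are illustrative.

```python
# Bytes inferred from the escaped output above: surrogateescape maps each
# undecodable byte 0xNN to the lone surrogate U+DCNN.
raw = b"\xfd\xbf\xbf\xbb\xba\xba"

# Strict UTF-8 decoding rejects the sequence (0xFD is not a valid start byte).
try:
    raw.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# With surrogateescape, every invalid byte is kept as a lone surrogate.
decoded = raw.decode("utf-8", "surrogateescape")
print(strict_ok)       # False
print(ascii(decoded))  # '\udcfd\udcbf\udcbf\udcbb\udcba\udcba'

# Encoding with the same error handler restores the original bytes.
roundtrip = decoded.encode("utf-8", "surrogateescape")
print(roundtrip == raw)  # True
```

Since the mapping is reversible, the original argument bytes are not lost and can be recovered from sys.argv with the same error handler.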

Date                 User      Action  Args
2021-03-13 13:19:29  vstinner  set     recipients: + vstinner, ncoghlan, ezio.melotti, SilentGhost, eryksun, Neui, jberg
2021-03-13 13:19:29  vstinner  set     messageid: <>
2021-03-13 13:19:29  vstinner  link    issue35883 messages
2021-03-13 13:19:29  vstinner  create