Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
GitHub fields:
assignee = None
closed_at = <Date 2009-05-29.16:22:46.178>
created_at = <Date 2009-05-24.18:03:31.100>
labels = ['type-bug']
title = 'Encoded surrogate characters on command line not escaped in sys.argv'
updated_at = <Date 2009-05-29.16:22:46.142>
user = 'https://bugs.python.org/baikie'
The mbstowcs and mbrtowc functions, which are used for the initial
conversion of command-line arguments on Unix, can return lone or
paired surrogates (e.g. \udcff for \xed\xb3\xbf in non-strict
UTF-8), and these surrogates are currently placed into sys.argv
unescaped. This creates various problems such as strings that
cannot be re-encoded into bytes and strings that could represent
more than one byte sequence. The examples below were produced by
the script argtest.py in a UTF-8 locale on Linux:
$ ./python argtest.py $'\xed\xa0\x80'
'\ud800'
Traceback (most recent call last):
  File "argtest.py", line 6, in <module>
    print(repr(sys.argv[1].encode(sys.getfilesystemencoding(), "surrogateescape")))
UnicodeEncodeError: 'utf-8' codec can't encode character '\ud800' in
position 0: surrogates not allowed
$ ./python argtest.py $'\xed\xb0\x80'
'\udc00'
Traceback (most recent call last):
  File "argtest.py", line 6, in <module>
    print(repr(sys.argv[1].encode(sys.getfilesystemencoding(), "surrogateescape")))
UnicodeEncodeError: 'utf-8' codec can't encode character '\udc00' in
position 0: surrogates not allowed
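The failures above can be approximated without a command line. The sketch below is only an illustration: it assumes that Python's "surrogatepass" handler is a reasonable stand-in for a non-strict mbstowcs, and the helper name can_reencode is hypothetical.

```python
def can_reencode(raw: bytes) -> bool:
    """Return True if the leniently decoded argument encodes back to bytes."""
    # Assumption: "surrogatepass" imitates a non-strict mbstowcs(), which
    # accepts UTF-8-encoded surrogates and yields them as code points.
    arg = raw.decode("utf-8", "surrogatepass")
    try:
        # surrogateescape only handles U+DC80..U+DCFF; any other
        # surrogate makes the encoder raise UnicodeEncodeError.
        arg.encode("utf-8", "surrogateescape")
    except UnicodeEncodeError:
        return False
    return True


print(can_reencode(b"\xed\xa0\x80"))  # decodes to '\ud800'
print(can_reencode(b"\xed\xb0\x80"))  # decodes to '\udc00'
print(can_reencode(b"plain"))
```

The first two inputs are the byte sequences from the shell examples; both decode to a surrogate outside the range that surrogateescape can turn back into a byte.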
Aliasing between non-decodable bytes and encoded lone surrogates:
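As a hedged illustration of this aliasing (not the reporter's original output), the Python sketch below again uses "surrogatepass" as an assumed stand-in for the non-strict mbstowcs. It compares the string produced for the undecodable byte \xff with the string produced for the encoded lone surrogate \xed\xb3\xbf mentioned earlier.

```python
# A byte that is not valid UTF-8, escaped the usual way.
from_raw_byte = b"\xff".decode("utf-8", "surrogateescape")

# An encoded lone surrogate, decoded leniently (assumption: this is
# what a non-strict mbstowcs() would hand back).
from_encoded = b"\xed\xb3\xbf".decode("utf-8", "surrogatepass")

# Both inputs end up as the same one-character string, so the original
# byte sequence can no longer be recovered unambiguously.
same_string = from_raw_byte == from_encoded

# Re-encoding always picks the single-byte interpretation.
round_trip = from_encoded.encode("utf-8", "surrogateescape")

print(same_string, ascii(from_encoded), round_trip)
```

Two different byte sequences map to '\udcff', and encoding that string with surrogateescape yields only b'\xff'.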
Attached is a patch to fix these problems by replacing any
decoded characters in the range 0xd800...0xdfff with the
surrogateescape encodings of their source bytes.
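The idea behind the patch can be sketched in Python. This is not the attached patch, which works on the C-level conversion; it is a minimal model under stated assumptions: "surrogatepass" stands in for the non-strict mbstowcs, the function name escape_decoded_surrogates is hypothetical, and input containing otherwise undecodable bytes (such as \xff) is not handled.

```python
def escape_decoded_surrogates(raw: bytes) -> str:
    """Decode leniently, then replace decoded surrogates by byte escapes."""
    # Assumption: "surrogatepass" imitates a non-strict mbstowcs().
    decoded = raw.decode("utf-8", "surrogatepass")
    parts = []
    for ch in decoded:
        if 0xD800 <= ord(ch) <= 0xDFFF:
            # Recover the three source bytes of the encoded surrogate and
            # escape each one as U+DC00 + byte (all bytes are >= 0x80).
            source = ch.encode("utf-8", "surrogatepass")
            parts.append("".join(chr(0xDC00 + b) for b in source))
        else:
            parts.append(ch)
    return "".join(parts)


arg = escape_decoded_surrogates(b"\xed\xa0\x80")
print(ascii(arg))
print(arg.encode("utf-8", "surrogateescape"))
```

With this replacement the argument round-trips: every surrogate left in the string lies in U+DC80..U+DCFF and stands for exactly one source byte, so encoding with surrogateescape restores the original bytes.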