Message 334732 - Python tracker

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python Developer's Guide.

Author	eryksun
Recipients	Neui, SilentGhost, eryksun, ncoghlan
Date	2019-02-01.23:49:22
Message-id	<1549064962.38.0.0212201261109.issue35883@roundup.psfhosted.org>
In-reply-to

Content
On Unix, Python 3.6 decodes the char * command-line arguments via mbstowcs. On Linux, I see the following misbehavior of mbstowcs when decoding an invalid UTF-8 sequence (the obsolete 6-byte form, which here encodes a value far above U+10FFFF):

    >>> mbstowcs = ctypes.CDLL(None, use_errno=True).mbstowcs
    >>> arg = bytes(x + 128 for x in [1 + 124, 63, 63, 59, 58, 58])
    >>> mbstowcs(None, arg, 0)
    1
    >>> buf = (ctypes.c_int * 2)()
    >>> mbstowcs(buf, arg, 2)
    1
    >>> hex(buf[0])
    '0x7fffbeba'
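
To see where 0x7fffbeba comes from, here is a minimal sketch that decodes the same bytes by hand using the obsolete 6-byte UTF-8 layout (lead byte 1111110x followed by five 10xxxxxx continuation bytes). The helper name decode_legacy_6byte is made up for illustration. This is not how glibc implements mbstowcs, only arithmetic that reproduces the value shown above.

```python
# Same bytes as in the session above: FD BF BF BB BA BA.
arg = bytes(x + 128 for x in [1 + 124, 63, 63, 59, 58, 58])


def decode_legacy_6byte(seq):
    """Decode one sequence in the obsolete 6-byte UTF-8 form.

    Hypothetical helper for illustration only; RFC 3629 limits UTF-8
    to 4 bytes and code points up to U+10FFFF.
    """
    if len(seq) != 6 or seq[0] & 0xFE != 0xFC:
        raise ValueError("expected a 6-byte sequence with lead byte 1111110x")
    value = seq[0] & 0x01  # the lead byte carries 1 payload bit
    for b in seq[1:]:
        if b & 0xC0 != 0x80:
            raise ValueError("expected a 10xxxxxx continuation byte")
        value = (value << 6) | (b & 0x3F)  # each continuation adds 6 bits
    return value


value = decode_legacy_6byte(arg)
print(arg.hex(), hex(value))
print(value > 0x10FFFF)  # far outside the Unicode code space
```

The result matches buf[0] above, which suggests that mbstowcs accepted the whole sequence as a single wide character instead of rejecting it.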

This shouldn't be an issue in 3.7, at least not with the default UTF-8 mode configuration. With this mode, Py_DecodeLocale calls _Py_DecodeUTF8Ex using the surrogateescape error handler [1].

[1]: https://github.com/python/cpython/blob/v3.7.2/Python/fileutils.c#L456
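
For comparison, a rough illustration of what the surrogateescape error handler does with these bytes. It uses Python's own UTF-8 codec from pure Python rather than calling _Py_DecodeUTF8Ex, so treat it as an approximation of the 3.7 startup path, not a trace of it.

```python
arg = bytes([0xFD, 0xBF, 0xBF, 0xBB, 0xBA, 0xBA])

# A strict UTF-8 decode rejects the sequence outright.
try:
    arg.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# With surrogateescape, each undecodable byte 0xNN becomes the lone
# surrogate U+DCNN, so nothing outside the Unicode range is produced.
escaped = arg.decode("utf-8", "surrogateescape")
codes = [hex(ord(c)) for c in escaped]

# Encoding with the same handler restores the original bytes.
roundtrip = escaped.encode("utf-8", "surrogateescape")

print(strict_ok, codes, roundtrip == arg)
```

Each of the six bytes is escaped individually, and the original argument can be recovered byte for byte, e.g. via os.fsencode when the filesystem encoding is UTF-8.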

History
Date	User	Action	Args
2019-02-01 23:49:23	eryksun	set	recipients: + eryksun, ncoghlan, SilentGhost, Neui
2019-02-01 23:49:22	eryksun	set	messageid: <1549064962.38.0.0212201261109.issue35883@roundup.psfhosted.org>
2019-02-01 23:49:22	eryksun	link	issue35883 messages
2019-02-01 23:49:22	eryksun	create