Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reject embedded null characters in wchar* strings #57826

Closed
vstinner opened this issue Dec 17, 2011 · 22 comments
Closed

Reject embedded null characters in wchar* strings #57826

vstinner opened this issue Dec 17, 2011 · 22 comments
Assignees
Labels
3.7 (EOL) end of life stdlib Python modules in the Lib dir topic-unicode type-security A security issue

Comments

@vstinner
Copy link
Member

BPO 13617
Nosy @vstinner, @ezio-melotti, @benhoyt, @serhiy-storchaka
PRs
  • [security] bpo-13617: Reject embedded null characters in wchar* strings. #2302
  • [3.6] bpo-13617: Reject embedded null characters in wchar* strings. (GH-2302) #2462
  • [3.5] bpo-13617: Reject embedded null characters in wchar* strings. (GH-2302) #2463
  • Fix compiler warnings introduced in bpo-13617. #2464
  • Dependencies
  • bpo-30708: Ensure that the result of PyUnicode_AsWideCharString() doesn't contain null characters if size is not returned
  • Files
  • embedded_nul.patch
  • doc_unicode.patch: added warnings to concerned functions
  • doc_unicode-2.patch
  • embedded_nul-2.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/serhiy-storchaka'
    closed_at = <Date 2017-06-28.08:22:41.245>
    created_at = <Date 2011-12-17.03:53:41.721>
    labels = ['type-security', '3.7', 'library', 'expert-unicode']
    title = 'Reject embedded null characters in wchar* strings'
    updated_at = <Date 2017-06-28.09:24:05.176>
    user = 'https://github.com/vstinner'

    bugs.python.org fields:

    activity = <Date 2017-06-28.09:24:05.176>
    actor = 'vstinner'
    assignee = 'serhiy.storchaka'
    closed = True
    closed_date = <Date 2017-06-28.08:22:41.245>
    closer = 'serhiy.storchaka'
    components = ['Library (Lib)', 'Unicode']
    creation = <Date 2011-12-17.03:53:41.721>
    creator = 'vstinner'
    dependencies = ['30708']
    files = ['23984', '24034', '24037', '24041']
    hgrepos = []
    issue_num = 13617
    keywords = ['patch']
    message_count = 22.0
    messages = ['149656', '149657', '149783', '149787', '149791', '149794', '149802', '173147', '221844', '226414', '236699', '296241', '296332', '296461', '296512', '297132', '297155', '297156', '297158', '297161', '297162', '297165']
    nosy_count = 6.0
    nosy_names = ['vstinner', 'ezio.melotti', 'benhoyt', 'python-dev', 'arnaudc', 'serhiy.storchaka']
    pr_nums = ['2302', '2462', '2463', '2464']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'security'
    url = 'https://bugs.python.org/issue13617'
    versions = ['Python 3.5', 'Python 3.6', 'Python 3.7']

    @vstinner
    Copy link
    Member Author

    The curses module (only since Python 3.3), locale.strcoll(), locale.strxfrm(), time.strftime() and imp.NullImporter() (only on Windows) accept embedded null characters, whereas they convert the Unicode string to a wide character (wchar_t*) string.

    The problem is that the null character truncates the string. Example:

    >>> locale.strxfrm('a')
    'a'
    >>> locale.strxfrm('a\0b')
    'a'

    Attached patch fixes these functions. I wrote the patch for Python 3.3.

    @vstinner vstinner added stdlib Python modules in the Lib dir topic-unicode labels Dec 17, 2011
    @vstinner
    Copy link
    Member Author

    PyUnicode_AsWideCharString() documentation should also warn about this issue.

    @arnaudc
    Copy link
    Mannequin

    arnaudc mannequin commented Dec 18, 2011

    Here is a patch for the documentation.

    I added warnings for, PyUnicode_AsWideChar*, PyUnicode_EncodeFSDefault and PyUnicode_AsUnicode*, since they're all concerned by this issue.

    @arnaudc
    Copy link
    Mannequin

    arnaudc mannequin commented Dec 18, 2011

    I removed the hints "using wcslen on the result of PyUnicode_AsWideChar*", since the resulting wchar_t strings may not be null-terminated

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Dec 18, 2011

    New changeset fa5c8cf29963 by Victor Stinner in branch '3.2':
    Issue bpo-13617: Document that the result of the conversion of a Unicode object to
    http://hg.python.org/cpython/rev/fa5c8cf29963

    New changeset f30ac7729f2b by Victor Stinner in branch 'default':
    Issue bpo-13617: Document that the result of the conversion of a Unicode object to
    http://hg.python.org/cpython/rev/f30ac7729f2b

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Dec 18, 2011

    New changeset 1c4d9534263e by Victor Stinner in branch '2.7':
    Issue bpo-13617: Document that the result PyUnicode_AsUnicode() and
    http://hg.python.org/cpython/rev/1c4d9534263e

    @vstinner
    Copy link
    Member Author

    embedded_nul-2.patch: a more complete patch check also null byte in functions calling PyUnicode_EncodeFSDefault().

    @serhiy-storchaka
    Copy link
    Member

    I added some comments in Rietveld.

    I see other instances of the use of non-checked PyUnicode_AsWideCharString() and PyUnicode_AsUnicode().

    @BreamoreBoy
    Copy link
    Mannequin

    BreamoreBoy mannequin commented Jun 29, 2014

    @victor can you pick this up again please.

    @serhiy-storchaka
    Copy link
    Member

    Could you please answer my comments Victor?

    @benhoyt
    Copy link
    Mannequin

    benhoyt mannequin commented Feb 26, 2015

    Note that this (or a very similar issue) also affects os.listdir() on Windows: os.listdir(bytes_path_with_nul) raises ValueError as expected, but os.listdir(unicode_path_with_nul) does not. Test case:

    >>> import os
    >>> os.mkdir('foo')
    >>> os.listdir(b'foo\x00zzz')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ValueError: listdir: embedded null character in path
    >>> os.listdir('foo\x00zzz')
    []

    However, this is not the case on Linux, as there both calls raise an appropriate ValueError.

    This needs to be fixed in posixmodule.c's path_converter() function.

    I'm in the middle of implementing PEP-471 (os.scandir), so don't want to create a proper patch right now, but the fix is to add these lines in posixmodule.c path_converter() after the if (length > 32767) {...} block:

            if ((size_t)length != wcslen(wide)) {
                FORMAT_EXCEPTION(PyExc_ValueError, "embedded null character in %s");
                Py_DECREF(unicode);
                return 0;
            }

    We should also add test to test_os.py like the following:

        def test_listdir_nul_in_path(self):
            self.assertRaises(ValueError, os.listdir, 'y\x00z')
            self.assertRaises(ValueError, os.listdir, b'y\x00z')

    @serhiy-storchaka
    Copy link
    Member

    Could you update your patch Victor?

    @serhiy-storchaka serhiy-storchaka added 3.7 (EOL) end of life type-bug An unexpected behavior, bug, or error labels Jun 17, 2017
    @vstinner
    Copy link
    Member Author

    Sorry, I lost track of this issue. Feel free to update and complete my patch :-)

    @serhiy-storchaka serhiy-storchaka self-assigned this Jun 20, 2017
    @serhiy-storchaka
    Copy link
    Member

    PR 2302 doesn't fix all issues with PyUnicode_AsWideCharString(). bpo-30708 should fix them.

    @serhiy-storchaka
    Copy link
    Member

    While working on this issue I found a way to inject environment variables for a subprocess on Windows. Reclassified this issue as a security issue. PR 2302 fixes this. May be there are other security vulnerabilities fixed by it.

    @serhiy-storchaka serhiy-storchaka added type-security A security issue and removed type-bug An unexpected behavior, bug, or error labels Jun 21, 2017
    @vstinner
    Copy link
    Member Author

    Wow, it's nice to see activity on this issue that I opened 6 years ago :-)

    Sorry Serhiy, I don't have the bandwidth right now to review your change :-( In lack of review, I suggest you to just push it.

    @serhiy-storchaka
    Copy link
    Member

    New changeset f7eae0a by Serhiy Storchaka in branch 'master':
    [security] bpo-13617: Reject embedded null characters in wchar* strings. (bpo-2302)
    f7eae0a

    @serhiy-storchaka
    Copy link
    Member

    New changeset 0834905 by Serhiy Storchaka in branch '3.6':
    [3.6] bpo-13617: Reject embedded null characters in wchar* strings. (GH-2302) (bpo-2462)
    0834905

    @serhiy-storchaka
    Copy link
    Member

    New changeset ccdc09e by Serhiy Storchaka in branch 'master':
    Fix compiler warnings on Windows introduced in bpo-13617. (bpo-2464)
    ccdc09e

    @serhiy-storchaka
    Copy link
    Member

    New changeset 54ba940 by Serhiy Storchaka in branch '3.5':
    [3.5] bpo-13617: Reject embedded null characters in wchar* strings. (GH-2302) (bpo-2463)
    54ba940

    @serhiy-storchaka
    Copy link
    Member

    Backporting this to 2.7 requires too much work taking to account that PyArg_Parse and other argument parsing functions don't check for null characters in 2.7. The most serious security issue is fixed in bpo-30730, other cases unlikely can be used for attacks.

    @vstinner
    Copy link
    Member Author

    Thank you very much Serhiy of taking care of this bug!

    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 (EOL) end of life stdlib Python modules in the Lib dir topic-unicode type-security A security issue
    Projects
    None yet
    Development

    No branches or pull requests

    2 participants