_PyUnicode_Fini should invalidate ucnhash_capi capsule pointer #91338

Closed
tiran opened this issue Mar 31, 2022 · 4 comments
Labels
3.10 only security fixes 3.11 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) type-crash A hard crash of the interpreter, possibly with a core dump

Comments

@tiran
Member

tiran commented Mar 31, 2022

BPO 47182
Nosy @vstinner, @tiran, @ambv, @pablogsal, @miss-islington, @neonene
PRs
  • bpo-47182: Fix crash by named unicode characters after interpreter reinitialization #32212
  • [3.10] bpo-47182: Fix crash by named unicode characters after interpreter reinitialization (GH-32212) (GH-32216) #32216
  • [3.9] bpo-47182: Fix crash by named unicode characters after interpreter reinitialization (GH-32212) #32217
  • gh-91338: Ensure test_ucnhash_capi_reset doesn't pass PYTHONHOME to _testembed.exe #32313
Files
  • ucnbug.c
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2022-03-31.13:34:15.298>
    labels = ['interpreter-core', '3.10', 'type-crash', '3.11']
    title = '_PyUnicode_Fini should invalidate ucnhash_capi capsule pointer'
    updated_at = <Date 2022-04-04.19:17:34.058>
    user = 'https://github.com/tiran'

    bugs.python.org fields:

    activity = <Date 2022-04-04.19:17:34.058>
    actor = 'neonene'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Interpreter Core']
    creation = <Date 2022-03-31.13:34:15.298>
    creator = 'christian.heimes'
    dependencies = []
    files = ['50708']
    hgrepos = []
    issue_num = 47182
    keywords = ['patch']
    message_count = 3.0
    messages = ['416432', '416440', '416474']
    nosy_count = 6.0
    nosy_names = ['vstinner', 'christian.heimes', 'lukasz.langa', 'pablogsal', 'miss-islington', 'neonene']
    pr_nums = ['32212', '32216', '32217', '32313']
    priority = 'high'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'crash'
    url = 'https://bugs.python.org/issue47182'
    versions = ['Python 3.10', 'Python 3.11']

    @tiran
    Member Author

    tiran commented Mar 31, 2022

    unicodeobject.c has a static pointer to the C API capsule for Unicode character names:

    static _PyUnicode_Name_CAPI *ucnhash_capi = NULL;

    The capsule is initialized on demand when the parser encounters a named Unicode escape such as "\N{digit nine}". Once the capsule pointer ucnhash_capi has been initialized, it is never reset to NULL, not even by a full interpreter shutdown.

    A shutdown of the main interpreter with Py_Finalize() renders the pointer invalid without clearing it. If the interpreter is then re-initialized, the stale pointer causes a segfault. The problem was first discovered by Trey Hunner in ethanhs/python-wasm#69:

    python.js:219 Uncaught RuntimeError: null function or function signature mismatch
    at _PyUnicode_DecodeUnicodeEscapeInternal (unicodeobject.c:6493:25)
    at decode_unicode_with_escapes (string_parser.c:121:13)
    at _PyPegen_parsestr (string_parser.c:273:1)
    at strings_rule (action_helpers.c:901:20)
    at atom_rule (parser.c:14293:27)
    at primary_rule (parser.c:13916:17)
    at await_primary_rule (parser.c:13666:17)
    at factor_rule (parser.c:13542:29)
    at term_rule (parser.c:13330:17)
    at sum_rule (parser.c:13044:17)
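
The failure mode described above (a lazily initialized static pointer that survives teardown of the thing it points to) can be illustrated without CPython at all. The following is a minimal sketch, not the actual unicodeobject.c code: `name_capi`, `runtime_init`, `runtime_fini`, `decode_named`, and `cached_capi` are hypothetical names invented for this example, with `cached_capi` playing the role of `ucnhash_capi`. Instead of calling through the stale table (which is what crashes in the real interpreter), the sketch detects the stale state and returns -2.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the capsule's function table. */
typedef struct {
    int (*getcode)(const char *name);  /* NULL once the table is torn down */
} name_capi;

static int getcode_impl(const char *name) {
    return strcmp(name, "digit nine") == 0 ? '9' : -1;
}

/* Two storage slots so that consecutive "interpreters" use different tables. */
static name_capi slots[2];
static unsigned generation = 0;
static name_capi *live_table = NULL;   /* owned by the current "interpreter" */
static name_capi *cached_capi = NULL;  /* plays the role of ucnhash_capi */

void runtime_init(void) {
    live_table = &slots[generation++ % 2];
    live_table->getcode = getcode_impl;
}

/* reset_cache == 0 models the bug; reset_cache != 0 models resetting
 * the cached pointer during finalization. */
void runtime_fini(int reset_cache) {
    assert(live_table != NULL);
    live_table->getcode = NULL;        /* the table is no longer usable */
    live_table = NULL;
    if (reset_cache) {
        cached_capi = NULL;            /* force a fresh lookup next time */
    }
}

/* Returns the code point, -1 for an unknown name, or -2 if the cached
 * table is stale (where the real interpreter would crash). */
int decode_named(const char *name) {
    if (cached_capi == NULL) {
        cached_capi = live_table;      /* initialize on demand */
    }
    if (cached_capi == NULL || cached_capi->getcode == NULL) {
        return -2;
    }
    return cached_capi->getcode(name);
}
```

Calling `runtime_init()`, `decode_named("digit nine")`, `runtime_fini(0)`, `runtime_init()` and then `decode_named("digit nine")` again hits the stale table on the second lookup. Passing a nonzero argument to `runtime_fini` clears the cache, so the next lookup after re-initialization picks up the new table.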

    I can reproduce the issue with pure C code:

    $ gcc -Xlinker -export-dynamic -g -IInclude/ -I. -o ucnbug ucnbug.c libpython3.11.a -lm -ldl
    $ gdb ucnbug
    (gdb) run
    0
    9
    Done

    1

    Program received signal SIGSEGV, Segmentation fault.
    0x0000000000000000 in ?? ()
    (gdb) bt
    #0 0x0000000000000000 in ?? ()
    #1 0x00000000005729a8 in _PyUnicode_DecodeUnicodeEscapeInternal (s=<optimized out>, s@entry=0x7fffea53b6d0 "\\N{digit nine}", size=<optimized out>, errors=errors@entry=0x0,
    consumed=consumed@entry=0x0, first_invalid_escape=first_invalid_escape@entry=0x7fffffffc748) at Objects/unicodeobject.c:6490
    #2 0x0000000000644fe3 in decode_unicode_with_escapes (parser=parser@entry=0x7fffea5e45d0, s=0x7fffea53b6d0 "\\N{digit nine}", s@entry=0x7fffea6af1d1 "\\N{digit nine}'", len=<optimized out>,
    len@entry=14, t=t@entry=0x7fffea606910) at Parser/string_parser.c:118
    #3 0x0000000000645675 in _PyPegen_parsestr (p=p@entry=0x7fffea5e45d0, bytesmode=bytesmode@entry=0x7fffffffc838, rawmode=rawmode@entry=0x7fffffffc83c, result=result@entry=0x7fffffffc848,
    fstr=fstr@entry=0x7fffffffc850, fstrlen=fstrlen@entry=0x7fffffffc858, t=0x7fffea606910) at Parser/string_parser.c:269
    #4 0x0000000000644163 in _PyPegen_concatenate_strings (p=p@entry=0x7fffea5e45d0, strings=strings@entry=0x94e310) at Parser/action_helpers.c:896
    #5 0x00000000004791e6 in strings_rule (p=p@entry=0x7fffea5e45d0) at Parser/parser.c:15463
    #6 0x000000000047c498 in atom_rule (p=p@entry=0x7fffea5e45d0) at Parser/parser.c:14274
    #7 0x000000000047e159 in primary_raw (p=0x7fffea5e45d0) at Parser/parser.c:13908
    #8 primary_rule (p=p@entry=0x7fffea5e45d0) at Parser/parser.c:13706

    @tiran tiran added 3.7 (EOL) end of life 3.8 only security fixes 3.9 only security fixes 3.10 only security fixes 3.11 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) type-crash A hard crash of the interpreter, possibly with a core dump labels Mar 31, 2022
    @miss-islington
    Contributor

    New changeset 44e9150 by Christian Heimes in branch 'main':
    bpo-47182: Fix crash by named unicode characters after interpreter reinitialization (GH-32212)

    @tiran tiran removed 3.7 (EOL) end of life 3.8 only security fixes 3.9 only security fixes labels Mar 31, 2022
    @tiran
    Member Author

    tiran commented Apr 1, 2022

    New changeset 55d5c96 by Christian Heimes in branch '3.10':
    [3.10] bpo-47182: Fix crash by named unicode characters after interpreter reinitialization (GH-32212) (GH-32216)

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @kumaraditya303
    Contributor

    Fixed by #32212
