Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

elementtree calls PyDict_GetItem without owning a reference to the dict #72133

Closed
tehybel mannequin opened this issue Sep 2, 2016 · 9 comments
Closed

elementtree calls PyDict_GetItem without owning a reference to the dict #72133

tehybel mannequin opened this issue Sep 2, 2016 · 9 comments
Labels
3.9 only security fixes 3.10 only security fixes 3.11 only security fixes extension-modules C modules in the Modules dir

Comments

@tehybel
Copy link
Mannequin

tehybel mannequin commented Sep 2, 2016

BPO 27946
Nosy @serhiy-storchaka, @tim-mitchell, @corona10, @miss-islington, @iritkatriel
PRs
  • bpo-27946: Fix possible crash in ElementTree.Element #29915
  • [3.10] bpo-27946: Fix possible crash in ElementTree.Element (GH-29915) #29919
  • [3.9] bpo-27946: Fix possible crash in ElementTree.Element (GH-29915) #29920
  • Files
  • 0001-issue27946-Hold-reference-to-dict-in-PyDict_GetItem.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2021-12-05.19:07:13.242>
    created_at = <Date 2016-09-02.21:48:01.169>
    labels = ['extension-modules', '3.9', '3.10', '3.11']
    title = 'elementtree calls PyDict_GetItem without owning a reference to the dict'
    updated_at = <Date 2021-12-05.19:07:13.241>
    user = 'https://bugs.python.org/tehybel'

    bugs.python.org fields:

    activity = <Date 2021-12-05.19:07:13.241>
    actor = 'serhiy.storchaka'
    assignee = 'none'
    closed = True
    closed_date = <Date 2021-12-05.19:07:13.242>
    closer = 'serhiy.storchaka'
    components = ['Extension Modules']
    creation = <Date 2016-09-02.21:48:01.169>
    creator = 'tehybel'
    dependencies = []
    files = ['44611']
    hgrepos = []
    issue_num = 27946
    keywords = ['patch']
    message_count = 8.0
    messages = ['274276', '276128', '276140', '407553', '407582', '407681', '407705', '407730', '407731']
    nosy_count = 6.0
    nosy_names = ['serhiy.storchaka', 'tehybel', 'Tim Mitchell', 'corona10', 'miss-islington', 'iritkatriel']
    pr_nums = ['29915', '29919', '29920']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue27946'
    versions = ['Python 3.9', 'Python 3.10', 'Python 3.11']

    @tehybel
    Copy link
    Mannequin Author

    tehybel mannequin commented Sep 2, 2016

    I would like to describe an issue in the _elementtree module, and then
    propose a fix which would prevent this type of bug everywhere in the
    codebase.

    The issue exists in _elementtree_Element_get_impl in
    /Modules/_elementtree.c. Here is the code:

        static PyObject *
        _elementtree_Element_get_impl(ElementObject *self, PyObject *key,
                                      PyObject *default_value)
        {
            PyObject* value;
    
            if (...)
                value = default_value;
            else {
                value = PyDict_GetItem(self->extra->attrib, key);
                ...
            }
            ...
        }

    We look up "key" in the dictionary, that is, in self->extra->attrib. This is
    done with the call to PyDict_GetItem(self->extra->attrib, key).

    We need to hash the "key" object in order to find it in the dictionary. This
    sometimes requires calling the key's __hash__ function, i.e., it might call on
    to python code.

    What happens if the python code gets the dictionary (self->extra->attrib) freed?
    Then PyDict_GetItem will use it after it has been freed.

    I've attached a proof-of-concept script which causes a use-after-free through
    _elementtree_Element_get_impl due to this issue.

    We could fix this by calling Py_INCREF on self->extra->attrib before calling
    PyDict_GetItem. But grepping for "PyDict_GetItem.*\->" shows that there are many
    places in the codebase where this situation occurs. Some of these may not have
    much impact, but some of them likely will.

    All these bugs could be fixed at once by changing PyDict_GetItem to call
    Py_INCREF on the dictionary before it hashes the key.

    Here's a PoC for the _elementtree module.

    --- begin script ---

    import _elementtree as et
    
    state = {
        "tag": "tag",
        "_children": [],
        "attrib": "attr",
        "tail": "tail",
        "text": "text",
    }
    
    class X:
        def __hash__(self):
            e.__setstate__(state) # this frees e->extra->attrib
            return 13
    
    e = et.Element("elem", {1: 2})
    e.get(X())

    --- end script ---

    Running it (64-bits Python 3.5.2, --with-pydebug) causes a use-after-free with
    control of the program counter:

    (gdb) r ./poc13.py
    Starting program: /home/xx/Python-3.5.2/python ./poc13.py

    Program received signal SIGSEGV, Segmentation fault.
    0x00000000004939af in PyDict_GetItem (op=0x7ffff6d5b1a8, key=0x7ffff6d528e0) at Objects/dictobject.c:1079
    1079 ep = (mp->ma_keys->dk_lookup)(mp, key, hash, &value_addr);
    (gdb) p mp->ma_keys->dk_lookup
    $8 = (dict_lookup_func) 0x7b7b7b7b7b7b7b7b

    @tehybel tehybel mannequin added the extension-modules C modules in the Modules dir label Sep 2, 2016
    @tim-mitchell
    Copy link
    Mannequin

    tim-mitchell mannequin commented Sep 12, 2016

    Verified problem. Added patch to keep reference to dict in PyDict_GetItem.

    @tim-mitchell
    Copy link
    Mannequin

    tim-mitchell mannequin commented Sep 13, 2016

    Added test to patch

    @iritkatriel
    Copy link
    Member

    This is a bug in elementtree - the caller should own a reference to the dict during the entire PyDict_GetItem call.

    We won't add the refcount change you propose because it's not free and this is a hot function.

    The test in your patch doesn't fail for me (I tried on version 3.11 on a Mac). Are you still seeing the problem on version >= 3.9, or perhaps it was fixed in eaementtree by now?

    @iritkatriel iritkatriel changed the title issues in elementtree and elsewhere due to PyDict_GetItem elementtree calls PyDict_GetItem without owning a reference to the dict Dec 2, 2021
    @serhiy-storchaka
    Copy link
    Member

    Concur with Irit. We should patch the caller of PyDict_GetItem, not PyDict_GetItem.

    @serhiy-storchaka
    Copy link
    Member

    The issue is no longer reproduced by the original test because of the cache for dict key tables. But it is not gone, and can be reproduced with modified test.

    There may be many similar bugs in the Python core end extensions. Adding incref/decref in PyDict_GetItem and similar C API functions could fix many of them, but perhaps not all, and it would hit performance. I suppose modt of uses of PyDict_GetItem are safe.

    @serhiy-storchaka serhiy-storchaka added 3.9 only security fixes 3.10 only security fixes 3.11 only security fixes labels Dec 4, 2021
    @serhiy-storchaka
    Copy link
    Member

    New changeset d15cdb2 by Serhiy Storchaka in branch 'main':
    bpo-27946: Fix possible crash in ElementTree.Element (GH-29915)
    d15cdb2

    @miss-islington
    Copy link
    Contributor

    New changeset beb8342 by Miss Islington (bot) in branch '3.10':
    bpo-27946: Fix possible crash in ElementTree.Element (GH-29915)
    beb8342

    @miss-islington
    Copy link
    Contributor

    New changeset 52a9a71 by Miss Islington (bot) in branch '3.9':
    bpo-27946: Fix possible crash in ElementTree.Element (GH-29915)
    52a9a71

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.9 only security fixes 3.10 only security fixes 3.11 only security fixes extension-modules C modules in the Modules dir
    Projects
    None yet
    Development

    No branches or pull requests

    3 participants