Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use PyDict_GetItemWithError() instead of PyDict_GetItem() #79640

Closed
serhiy-storchaka opened this issue Dec 11, 2018 · 17 comments
Closed

Use PyDict_GetItemWithError() instead of PyDict_GetItem() #79640

serhiy-storchaka opened this issue Dec 11, 2018 · 17 comments
Labels
3.8 only security fixes extension-modules C modules in the Modules dir interpreter-core (Objects, Python, Grammar, and Parser dirs)

Comments

@serhiy-storchaka
Copy link
Member

BPO 35459
Nosy @rhettinger, @vstinner, @methane, @ericsnowcurrently, @serhiy-storchaka, @MojoVampire, @pablogsal, @rmonat
PRs
  • bpo-35459: Use PyDict_GetItemWithError() instead of PyDict_GetItem() #11112
  • Add a What's New entry for bpo-35459. #12706
  • bpo-42093: Cleanup _PyDict_GetItemHint() #24582
  • Files
  • pydict-getitem-compare.txt
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2019-04-10.07:31:08.468>
    created_at = <Date 2018-12-11.10:54:17.556>
    labels = ['extension-modules', 'interpreter-core', '3.8']
    title = 'Use PyDict_GetItemWithError() instead of PyDict_GetItem()'
    updated_at = <Date 2021-02-21.11:07:21.269>
    user = 'https://github.com/serhiy-storchaka'

    bugs.python.org fields:

    activity = <Date 2021-02-21.11:07:21.269>
    actor = 'vstinner'
    assignee = 'none'
    closed = True
    closed_date = <Date 2019-04-10.07:31:08.468>
    closer = 'serhiy.storchaka'
    components = ['Extension Modules', 'Interpreter Core']
    creation = <Date 2018-12-11.10:54:17.556>
    creator = 'serhiy.storchaka'
    dependencies = []
    files = ['48137']
    hgrepos = []
    issue_num = 35459
    keywords = ['patch']
    message_count = 17.0
    messages = ['331604', '331605', '331621', '331744', '335541', '335651', '336536', '336563', '336647', '336686', '339490', '339616', '339830', '355667', '359480', '359489', '359490']
    nosy_count = 8.0
    nosy_names = ['rhettinger', 'vstinner', 'methane', 'eric.snow', 'serhiy.storchaka', 'josh.r', 'pablogsal', 'raphaelm']
    pr_nums = ['11112', '12706', '24582']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue35459'
    versions = ['Python 3.8']

    @serhiy-storchaka
    Copy link
    Member Author

    There is an issue with using PyDict_GetItem(). Since it silences all exceptions, it can return incorrect result when an exception like MemoryError or KeyboardInterrupt was raised in the user __hash__() and __eq__(). In addition PyDict_GetItemString() and _PyDict_GetItemId() swallow a MemoryError raised when fail to allocate a temporary string object.

    In addition, PyDict_GetItemWithError() is a tiny bit faster than PyDict_GetItem(), because it avoids checking the exception state in successful case.

    The proposed PR replaces most calls of PyDict_GetItem(), PyDict_GetItemString() and _PyDict_GetItemId() with calls of PyDict_GetItemWithError(), _PyDict_GetItemStringWithError() and _PyDict_GetItemIdWithError().

    @serhiy-storchaka serhiy-storchaka added 3.8 only security fixes extension-modules C modules in the Modules dir interpreter-core (Objects, Python, Grammar, and Parser dirs) labels Dec 11, 2018
    @serhiy-storchaka serhiy-storchaka changed the title Use PyDict_GetItemWithError() with PyDict_GetItem() Use PyDict_GetItemWithError() instead of PyDict_GetItem() Dec 11, 2018
    @vstinner
    Copy link
    Member

    My previous attempt: bpo-20615.

    @serhiy-storchaka
    Copy link
    Member Author

    Opened bpo-35461 for documenting flaws of PyDict_GetItem().

    @serhiy-storchaka
    Copy link
    Member Author

    Most of changes are straightforward. Just replaced PyDict_GetItem*() with PyDict_GetItem*WithError() and added the check for PyErr_Occurred(). PyDict_GetItemString() with constant argument was replaced with _PyDict_GetItemIdWithError() for performance.

    Some code was left unchanged. This was mostly in files where errors are very and error checking is not performed or errors are silenced in any case (Python/compile.c, Python/symtable.c, Objects/structseq.c, etc). These cases needed separate issues.

    The most non-trivial change is in Objects/typeobject.c. The check for duplicated descriptors (in add_methods(), add_members() and add_getset()) was moved after creating the descriptor object. This improves performance by avoiding to create a temporary string objects. Duplicate descriptor names is a very uncommon case -- there were only two cases in the stdlib (besides tests), and one of them already is fixed (PR 11053).

    @serhiy-storchaka
    Copy link
    Member Author

    Eric requested to run the benchmark suite. Here are results. I do not know how to interpret them. Likely all differences are random.

    @ericsnowcurrently
    Copy link
    Member

    Thanks, Serhiy. While the benchmark suite is our best tool available for measuring performance, I'm not sure what slowdown is significant in those results.

    @victor, any thoughts?

    @serhiy-storchaka
    Copy link
    Member Author

    New changeset a24107b by Serhiy Storchaka in branch 'master':
    bpo-35459: Use PyDict_GetItemWithError() instead of PyDict_GetItem(). (GH-11112)
    a24107b

    @vstinner
    Copy link
    Member

    It seems like the change introduced a regression: bpo-36110.

    @vstinner
    Copy link
    Member

    It seems like the change introduced a regression: bpo-36110.

    While the test has been fixed, IMHO we need to better document this subtle behavior change since it can be surprising.

    The test failed because PyObject_GetAttr() no longer ignores exceptions when getting the attribute from the type dictionary. It's a significant change compared to Python 3.7.

    Right now, the change has not even a NEWS entry, whereas it's a backward incompatible change!

    Don't get my wrong: the change is correct, we don't want to ignore arbitrary exceptions, it's just a matter of documentation.

    @vstinner vstinner reopened this Feb 26, 2019
    @MojoVampire
    Copy link
    Mannequin

    MojoVampire mannequin commented Feb 26, 2019

    bpo-36110 was closed as a duplicate; the superseder is bpo-36109 (which has been fixed). The change should still be documented, just in case anyone gets bitten by it.

    @methane
    Copy link
    Member

    methane commented Apr 5, 2019

    Serhiy, can this issue be closed?

    @serhiy-storchaka
    Copy link
    Member Author

    New changeset 7a0630c by Serhiy Storchaka in branch 'master':
    Add a What's New entry for bpo-35459. (GH-12706)
    7a0630c

    @serhiy-storchaka
    Copy link
    Member Author

    There are few occurrences of PyDict_GetItem(), PyDict_GetItemString() and _PyDict_GetItemId() in cases where they are unlikely failed. These cases will be considered in separate issues.

    @rmonat
    Copy link
    Mannequin

    rmonat mannequin commented Oct 29, 2019

    I stumbled upon this issue while reading Python 3.8 and this made me curious.

    I've tried writing some Python code to reproduce this bug, but I'm unable to -- I should be missing something. Is there a simple snippet showing the issue?

    Also, the changelog states that "The CPython interpreter can swallow exceptions in some circumstances". Are there other documented cases of those circumstances?

    @rmonat
    Copy link
    Mannequin

    rmonat mannequin commented Jan 6, 2020

    Any pointer would also be welcome, and a piece of code showing the bug would help me a lot. Thank you.

    @methane
    Copy link
    Member

    methane commented Jan 7, 2020

    I've tried writing some Python code to reproduce this bug, but I'm unable to -- I should be missing something. Is there a simple snippet showing the issue?

    Note that this is a bug from long ago. Why this bug had lived long is it can not happen in regular cases. So it is difficult to reproduce.

    See PR 11112. _csv module is changed to use PyDict_GetItemWithError.
    Let's try it on Python 3.7.

    Python 3.7.6 (default, Dec 30 2019, 19:38:28)
    [Clang 11.0.0 (clang-1100.0.33.16)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> class S(str):
    ...   def __hash__(self):
    ...     raise MemoryError
    ...
    >>> import _csv
    >>> _csv.Dialect(S("excel"))
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    _csv.Error: unknown dialect

    You can see the MemoryError is suppressed. Let's try it on Python 3.8.

    $ python3
    Python 3.8.1 (default, Jan  6 2020, 16:02:33)
     (snip)
    >>> _csv.Dialect(S("excel"))
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<stdin>", line 3, in __hash__
    MemoryError

    You can see the MemoryError is not suppressed.

    @rmonat
    Copy link
    Mannequin

    rmonat mannequin commented Jan 7, 2020

    Thank you very much!

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.8 only security fixes extension-modules C modules in the Modules dir interpreter-core (Objects, Python, Grammar, and Parser dirs)
    Projects
    None yet
    Development

    No branches or pull requests

    4 participants