Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

_PyDict_GetItem_KnownHash ignores DKIX_ERROR return #72310

Closed
zhangyangyu opened this issue Sep 13, 2016 · 22 comments
Closed

_PyDict_GetItem_KnownHash ignores DKIX_ERROR return #72310

zhangyangyu opened this issue Sep 13, 2016 · 22 comments
Assignees
Labels
3.7 (EOL) end of life interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error

Comments

@zhangyangyu
Copy link
Member

BPO 28123
Nosy @rhettinger, @vstinner, @methane, @serhiy-storchaka, @zhangyangyu
Files
  • issue28123.patch
  • issue28123_v2.patch
  • issue28123_v3.patch
  • issue28123_v4.patch
  • issue28123_v5.patch
  • issue28123_v6.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/serhiy-storchaka'
    closed_at = <Date 2016-11-06.11:20:24.647>
    created_at = <Date 2016-09-13.09:51:18.873>
    labels = ['interpreter-core', 'type-bug', '3.7']
    title = '_PyDict_GetItem_KnownHash ignores DKIX_ERROR return'
    updated_at = <Date 2016-11-06.13:15:14.772>
    user = 'https://github.com/zhangyangyu'

    bugs.python.org fields:

    activity = <Date 2016-11-06.13:15:14.772>
    actor = 'vstinner'
    assignee = 'serhiy.storchaka'
    closed = True
    closed_date = <Date 2016-11-06.11:20:24.647>
    closer = 'serhiy.storchaka'
    components = ['Interpreter Core']
    creation = <Date 2016-09-13.09:51:18.873>
    creator = 'xiang.zhang'
    dependencies = []
    files = ['44625', '44655', '44713', '45291', '45299', '45336']
    hgrepos = []
    issue_num = 28123
    keywords = ['patch']
    message_count = 22.0
    messages = ['276230', '276232', '276233', '276237', '276239', '276242', '276243', '276246', '276403', '276533', '276681', '276706', '276708', '276796', '279768', '279780', '279985', '279987', '279991', '279996', '280135', '280138']
    nosy_count = 6.0
    nosy_names = ['rhettinger', 'vstinner', 'methane', 'python-dev', 'serhiy.storchaka', 'xiang.zhang']
    pr_nums = []
    priority = 'high'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue28123'
    versions = ['Python 3.6', 'Python 3.7']

    @zhangyangyu
    Copy link
    Member Author

    _PyDict_GetItem_KnownHash should handle dk_lookup return value the same as PyDict_GetItem.

    BTW, it seems PyDict_GetItem can call _PyDict_GetItem_KnownHash to remove duplicate code, if you like, maybe another issue?

    diff -r 6acd2b575a3c Objects/dictobject.c
    --- a/Objects/dictobject.c	Tue Sep 13 07:56:45 2016 +0300
    +++ b/Objects/dictobject.c	Tue Sep 13 17:46:08 2016 +0800
    @@ -1370,12 +1370,12 @@
             ix = (mp->ma_keys->dk_lookup)(mp, key, hash, &value_addr, NULL);
             /* ignore errors */
             PyErr_Restore(err_type, err_value, err_tb);
    -        if (ix == DKIX_EMPTY)
    +        if (ix < 0)
                 return NULL;
         }
         else {
             ix = (mp->ma_keys->dk_lookup)(mp, key, hash, &value_addr, NULL);
    -        if (ix == DKIX_EMPTY) {
    +        if (ix < 0) {
                 PyErr_Clear();
                 return NULL;

    @zhangyangyu zhangyangyu added 3.7 (EOL) end of life interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error labels Sep 13, 2016
    @vstinner
    Copy link
    Member

    Please create a patch file and attach it to the issue, so we can review it more easily.

    @zhangyangyu
    Copy link
    Member Author

    Hmm, I thought this is trivial so I didn't. Now upload the file patch ;).

    @vstinner
    Copy link
    Member

    I understand the the code doesn't handle correctly lookup failures. Such failure is easy to trigger in pure Python using a custom __eq__() method for example. Can you please write an unit test for it?

    @zhangyangyu
    Copy link
    Member Author

    _PyDict_GetItem_KnownHash is not invoked by any other dict methods. So to achieve it in pure Python level, we have to rely on others modules and objects such as OrderedDict, lru_cache. Is it a good idea to rely on those?

    @vstinner
    Copy link
    Member

    _PyDict_GetItem_KnownHash is not invoked by any other dict methods.

    Oh, I missed that. In this case, I suggest you to expose the function at Python level using the _testcapi module. And then use _testcapi._PyDict_GetItem_KnownHash() in test_dict.

    It would be nice to have unit tests on _PyDict_GetItem_KnownHash(), this function starts to become important in Python.

    @zhangyangyu
    Copy link
    Member Author

    How about let PyDict_GetItem call it? Just like the relationship of delitem and delitem_knownhash. You can see they share most codes. If we do that, it seems we can easily write a test(or there has already been a test) for it.

    @vstinner
    Copy link
    Member

    Xiang Zhang added the comment:

    How about let PyDict_GetItem call it?

    PyDict_GetItem() is like the most important function in term of
    performance for Python. If you want to touch it, you must benchmark
    it.

    I would prefer to keep it as it is.

    Just like the relationship of delitem and delitem_knownhash.

    delitem is less important in term of performance, so I decided to
    accept Naoki's change to merge duplicated code.

    @zhangyangyu
    Copy link
    Member Author

    Update the patch with unittest.

    @vstinner
    Copy link
    Member

    If __eq__() raise an exception, _PyDict_GetItem_KnownHash() currently returns NULL and pass through the exception. To me, it looks like the correct behaviour.

    With your patch, it looks like the _PyDict_GetItem_KnownHash() clears the __eq__() exception: return NULL with no exception set, as if the key is simply missing.

    PyDict_GetItem() ignores *almost* all exceptions, but for bad reasons:

    /* Note that, for historical reasons, PyDict_GetItem() suppresses all errors

    • that may occur (originally dicts supported only string keys, and exceptions
    • weren't possible). (...) */

    I would prefer to not ignore __eq__ exceptions, but pass them through.

    To be clear: this is a behaviour change compared to Python 3.5 which works as PyDict_GetItem(), ignore *all* exceptions:

            ep = (mp->ma_keys->dk_lookup)(mp, key, hash, &value_addr);
            if (ep == NULL) {
                PyErr_Clear();
                return NULL;
            }

    I consider that it's ok to change _PyDict_GetItem_KnownHash() behaviour because this function is private and only used 4 times in 250k lines of C code.

    Would you be interested to write a different patch to pass through the exception?

    Note: It seems like _count_elements() (Modules/_collectionsmodule.c) doesn't handle correctly such error.

    @zhangyangyu
    Copy link
    Member Author

    Yes, ignoring exceptions is due to historical reasons. Although it's used rarely but I am still afraid changing it may break knowledge of devs that are already familiar with dict APIs. And what's more is there already exists two versions of getitem, one version with no exceptions and one version propagates exceptions (witherror), maybe we can also introduce a _PyDict_GetItem_KnownHashWithError?

    @vstinner
    Copy link
    Member

    Xiang Zhang: "I am still afraid changing it may break knowledge of
    devs that are already familiar with dict APIs"

    Come on, the function is only called 4 times in the whole code base,
    and the function is private. Private means that we are free to break
    its behaviour anytime, it's part of the contract.

    @zhangyangyu
    Copy link
    Member Author

    Hah, Okay. I'll make the corresponding change then.

    @zhangyangyu
    Copy link
    Member Author

    Victor, v3 now applies your suggestion.

    @serhiy-storchaka
    Copy link
    Member

    Actually ignoring exceptions in _PyDict_GetItem_KnownHash causes a subtle difference between Python and C implementations. Making _PyDict_GetItem_KnownHash not ignoring exceptions looks right thing.

    But dict_merge expects that _PyDict_GetItem_KnownHash don't raise an exception.

    @zhangyangyu
    Copy link
    Member Author

    dict_merge was altered after the patch. I make it ignore explicitly the error now, to not affect former behaviour.

    Serhiy, I apply your suggestion to use _PyLong_AsByteArray for Py_hash_t, but I am not familiar with the API. It needs a review.

    @serhiy-storchaka
    Copy link
    Member

    If _PyDict_GetItem_KnownHash() returns an error, it is very likely that following insertdict() with same key will return an error. I would prefer to return an error right after _PyDict_GetItem_KnownHash() is failed. This would look more straightforward.

    @zhangyangyu
    Copy link
    Member Author

    If _PyDict_GetItem_KnownHash() returns an error, it is very likely that following insertdict() with same key will return an error.

    Make sense.

    @serhiy-storchaka
    Copy link
    Member

    LGTM. I'll commit the patch soon if there are no comments from other core developers.

    @methane
    Copy link
    Member

    methane commented Nov 3, 2016

    LGTM.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Nov 6, 2016

    New changeset d06a6b0fd992 by Serhiy Storchaka in branch '3.6':
    Issue bpo-28123: _PyDict_GetItem_KnownHash() now can raise an exception as
    https://hg.python.org/cpython/rev/d06a6b0fd992

    New changeset 805467de22fc by Serhiy Storchaka in branch 'default':
    Issue bpo-28123: _PyDict_GetItem_KnownHash() now can raise an exception as
    https://hg.python.org/cpython/rev/805467de22fc

    @vstinner
    Copy link
    Member

    vstinner commented Nov 6, 2016

    Sorry, I didn't have time to review this issue :-(

    Thanks Xiang and Serhiy for fixing the issue! I prefer the new API (don't
    ignore the error).

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 (EOL) end of life interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    4 participants