classification
Title: Use PyDict_GetItemWithError() instead of PyDict_GetItem()
Type: Stage: resolved
Components: Extension Modules, Interpreter Core Versions: Python 3.8
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: eric.snow, josh.r, pablogsal, rhettinger, serhiy.storchaka, vstinner
Priority: normal Keywords: patch

Created on 2018-12-11 10:54 by serhiy.storchaka, last changed 2019-02-26 15:51 by josh.r.

Files
File name Uploaded Description Edit
pydict-getitem-compare.txt serhiy.storchaka, 2019-02-14 16:37
Pull Requests
URL Status Linked Edit
PR 11112 merged serhiy.storchaka, 2018-12-11 10:56
Messages (10)
msg331604 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-12-11 10:54
There is an issue with using PyDict_GetItem(). Since it silences all exceptions, it can return incorrect result when an exception like MemoryError or KeyboardInterrupt was raised in the user __hash__() and __eq__(). In addition PyDict_GetItemString() and _PyDict_GetItemId() swallow a MemoryError raised when fail to allocate a temporary string object.

In addition, PyDict_GetItemWithError() is a tiny bit faster than PyDict_GetItem(), because it avoids checking the exception state in successful case.

The proposed PR replaces most calls of PyDict_GetItem(), PyDict_GetItemString() and _PyDict_GetItemId() with calls of PyDict_GetItemWithError(), _PyDict_GetItemStringWithError() and _PyDict_GetItemIdWithError().
msg331605 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-12-11 10:56
My previous attempt: bpo-20615.
msg331621 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-12-11 13:31
Opened issue35461 for documenting flaws of PyDict_GetItem().
msg331744 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-12-13 07:13
Most of changes are straightforward. Just replaced PyDict_GetItem*() with PyDict_GetItem*WithError() and added the check for PyErr_Occurred(). PyDict_GetItemString() with constant argument was replaced with _PyDict_GetItemIdWithError() for performance.

Some code was left unchanged. This was mostly in files where errors are very and error checking is not performed or errors are silenced in any case (Python/compile.c, Python/symtable.c, Objects/structseq.c, etc). These cases needed separate issues.

The most non-trivial change is in Objects/typeobject.c. The check for duplicated descriptors (in add_methods(), add_members() and add_getset()) was moved after creating the descriptor object. This improves performance by avoiding to create a temporary string objects. Duplicate descriptor names is a very uncommon case -- there were only two cases in the stdlib (besides tests), and one of them already is fixed (PR 11053).
msg335541 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2019-02-14 16:37
Eric requested to run the benchmark suite. Here are results. I do not know how to interpret them. Likely all differences are random.
msg335651 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2019-02-15 22:20
Thanks, Serhiy.  While the benchmark suite is our best tool available for measuring performance, I'm not sure what slowdown is significant in those results.

@Victor, any thoughts?
msg336536 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2019-02-25 15:59
New changeset a24107b04c1277e3c1105f98aff5bfa3a98b33a0 by Serhiy Storchaka in branch 'master':
bpo-35459: Use PyDict_GetItemWithError() instead of PyDict_GetItem(). (GH-11112)
https://github.com/python/cpython/commit/a24107b04c1277e3c1105f98aff5bfa3a98b33a0
msg336563 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-02-25 21:47
It seems like the change introduced a regression: bpo-36110.
msg336647 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-02-26 11:21
> It seems like the change introduced a regression: bpo-36110.

While the test has been fixed, IMHO we need to better document this subtle behavior change since it can be surprising.

The test failed because PyObject_GetAttr() no longer ignores exceptions when getting the attribute from the type dictionary. It's a significant change compared to Python 3.7.

Right now, the change has not even a NEWS entry, whereas it's a backward incompatible change!

Don't get my wrong: the change is correct, we don't want to ignore arbitrary exceptions, it's just a matter of documentation.
msg336686 - (view) Author: Josh Rosenberg (josh.r) * (Python triager) Date: 2019-02-26 15:51
#36110 was closed as a duplicate; the superseder is #36109 (which has been fixed). The change should still be documented, just in case anyone gets bitten by it.
History
Date User Action Args
2019-02-26 15:51:26josh.rsetnosy: + josh.r
messages: + msg336686
2019-02-26 11:21:07vstinnersetstatus: closed -> open

nosy: + pablogsal
messages: + msg336647

resolution: fixed ->
2019-02-25 21:47:47vstinnersetmessages: + msg336563
2019-02-25 16:19:28serhiy.storchakasetstatus: open -> closed
resolution: fixed
stage: patch review -> resolved
2019-02-25 15:59:49serhiy.storchakasetmessages: + msg336536
2019-02-15 22:20:57eric.snowsetmessages: + msg335651
2019-02-14 16:37:04serhiy.storchakasetfiles: + pydict-getitem-compare.txt

messages: + msg335541
2018-12-13 07:39:36serhiy.storchakasetnosy: + rhettinger, eric.snow
2018-12-13 07:13:15serhiy.storchakasetmessages: + msg331744
2018-12-11 13:31:26serhiy.storchakasetmessages: + msg331621
2018-12-11 10:56:49serhiy.storchakasetkeywords: + patch
stage: patch review
pull_requests: + pull_request10341
2018-12-11 10:56:41vstinnersetmessages: + msg331605
2018-12-11 10:55:38serhiy.storchakasettitle: Use PyDict_GetItemWithError() with PyDict_GetItem() -> Use PyDict_GetItemWithError() instead of PyDict_GetItem()
2018-12-11 10:54:17serhiy.storchakacreate