classification
Title: Use PyDict_GetItemWithError() instead of PyDict_GetItem()
Type: Stage: resolved
Components: Extension Modules, Interpreter Core Versions: Python 3.8
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: eric.snow, inada.naoki, josh.r, pablogsal, raphaelm, rhettinger, serhiy.storchaka, vstinner
Priority: normal Keywords: patch

Created on 2018-12-11 10:54 by serhiy.storchaka, last changed 2019-10-29 18:39 by raphaelm. This issue is now closed.

Files
File name Uploaded Description Edit
pydict-getitem-compare.txt serhiy.storchaka, 2019-02-14 16:37
Pull Requests
URL Status Linked Edit
PR 11112 merged serhiy.storchaka, 2018-12-11 10:56
PR 12706 merged serhiy.storchaka, 2019-04-06 08:45
Messages (14)
msg331604 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-12-11 10:54
There is an issue with using PyDict_GetItem(). Since it silences all exceptions, it can return incorrect result when an exception like MemoryError or KeyboardInterrupt was raised in the user __hash__() and __eq__(). In addition PyDict_GetItemString() and _PyDict_GetItemId() swallow a MemoryError raised when fail to allocate a temporary string object.

In addition, PyDict_GetItemWithError() is a tiny bit faster than PyDict_GetItem(), because it avoids checking the exception state in successful case.

The proposed PR replaces most calls of PyDict_GetItem(), PyDict_GetItemString() and _PyDict_GetItemId() with calls of PyDict_GetItemWithError(), _PyDict_GetItemStringWithError() and _PyDict_GetItemIdWithError().
msg331605 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-12-11 10:56
My previous attempt: bpo-20615.
msg331621 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-12-11 13:31
Opened issue35461 for documenting flaws of PyDict_GetItem().
msg331744 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-12-13 07:13
Most of changes are straightforward. Just replaced PyDict_GetItem*() with PyDict_GetItem*WithError() and added the check for PyErr_Occurred(). PyDict_GetItemString() with constant argument was replaced with _PyDict_GetItemIdWithError() for performance.

Some code was left unchanged. This was mostly in files where errors are very and error checking is not performed or errors are silenced in any case (Python/compile.c, Python/symtable.c, Objects/structseq.c, etc). These cases needed separate issues.

The most non-trivial change is in Objects/typeobject.c. The check for duplicated descriptors (in add_methods(), add_members() and add_getset()) was moved after creating the descriptor object. This improves performance by avoiding to create a temporary string objects. Duplicate descriptor names is a very uncommon case -- there were only two cases in the stdlib (besides tests), and one of them already is fixed (PR 11053).
msg335541 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2019-02-14 16:37
Eric requested to run the benchmark suite. Here are results. I do not know how to interpret them. Likely all differences are random.
msg335651 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2019-02-15 22:20
Thanks, Serhiy.  While the benchmark suite is our best tool available for measuring performance, I'm not sure what slowdown is significant in those results.

@Victor, any thoughts?
msg336536 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2019-02-25 15:59
New changeset a24107b04c1277e3c1105f98aff5bfa3a98b33a0 by Serhiy Storchaka in branch 'master':
bpo-35459: Use PyDict_GetItemWithError() instead of PyDict_GetItem(). (GH-11112)
https://github.com/python/cpython/commit/a24107b04c1277e3c1105f98aff5bfa3a98b33a0
msg336563 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-02-25 21:47
It seems like the change introduced a regression: bpo-36110.
msg336647 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-02-26 11:21
> It seems like the change introduced a regression: bpo-36110.

While the test has been fixed, IMHO we need to better document this subtle behavior change since it can be surprising.

The test failed because PyObject_GetAttr() no longer ignores exceptions when getting the attribute from the type dictionary. It's a significant change compared to Python 3.7.

Right now, the change has not even a NEWS entry, whereas it's a backward incompatible change!

Don't get my wrong: the change is correct, we don't want to ignore arbitrary exceptions, it's just a matter of documentation.
msg336686 - (view) Author: Josh Rosenberg (josh.r) * (Python triager) Date: 2019-02-26 15:51
#36110 was closed as a duplicate; the superseder is #36109 (which has been fixed). The change should still be documented, just in case anyone gets bitten by it.
msg339490 - (view) Author: Inada Naoki (inada.naoki) * (Python committer) Date: 2019-04-05 10:13
Serhiy, can this issue be closed?
msg339616 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2019-04-08 11:34
New changeset 7a0630c530121725136526a88c49589b54da6492 by Serhiy Storchaka in branch 'master':
Add a What's New entry for bpo-35459. (GH-12706)
https://github.com/python/cpython/commit/7a0630c530121725136526a88c49589b54da6492
msg339830 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2019-04-10 07:31
There are few occurrences of PyDict_GetItem(), PyDict_GetItemString() and _PyDict_GetItemId() in cases where they are unlikely failed. These cases will be considered in separate issues.
msg355667 - (view) Author: Raphaël M (raphaelm) Date: 2019-10-29 18:39
I stumbled upon this issue while reading Python 3.8 and this made me curious.

I've tried writing some Python code to reproduce this bug, but I'm unable to -- I should be missing something. Is there a simple snippet showing the issue? 


Also, the changelog states that "The CPython interpreter can swallow exceptions in some circumstances". Are there other documented cases of those circumstances?
History
Date User Action Args
2019-10-29 18:39:15raphaelmsetnosy: + raphaelm
messages: + msg355667
2019-04-10 07:31:08serhiy.storchakasetstatus: open -> closed
resolution: fixed
messages: + msg339830

stage: patch review -> resolved
2019-04-08 11:34:08serhiy.storchakasetmessages: + msg339616
2019-04-06 08:45:28serhiy.storchakasetstage: resolved -> patch review
pull_requests: + pull_request12630
2019-04-05 10:13:10inada.naokisetnosy: + inada.naoki
messages: + msg339490
2019-02-26 15:51:26josh.rsetnosy: + josh.r
messages: + msg336686
2019-02-26 11:21:07vstinnersetstatus: closed -> open

nosy: + pablogsal
messages: + msg336647

resolution: fixed -> (no value)
2019-02-25 21:47:47vstinnersetmessages: + msg336563
2019-02-25 16:19:28serhiy.storchakasetstatus: open -> closed
resolution: fixed
stage: patch review -> resolved
2019-02-25 15:59:49serhiy.storchakasetmessages: + msg336536
2019-02-15 22:20:57eric.snowsetmessages: + msg335651
2019-02-14 16:37:04serhiy.storchakasetfiles: + pydict-getitem-compare.txt

messages: + msg335541
2018-12-13 07:39:36serhiy.storchakasetnosy: + rhettinger, eric.snow
2018-12-13 07:13:15serhiy.storchakasetmessages: + msg331744
2018-12-11 13:31:26serhiy.storchakasetmessages: + msg331621
2018-12-11 10:56:49serhiy.storchakasetkeywords: + patch
stage: patch review
pull_requests: + pull_request10341
2018-12-11 10:56:41vstinnersetmessages: + msg331605
2018-12-11 10:55:38serhiy.storchakasettitle: Use PyDict_GetItemWithError() with PyDict_GetItem() -> Use PyDict_GetItemWithError() instead of PyDict_GetItem()
2018-12-11 10:54:17serhiy.storchakacreate