classification
Title: Use PyDict_GetItemWithError() instead of PyDict_GetItem()
Type: Stage: resolved
Components: Extension Modules, Interpreter Core Versions: Python 3.8
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: eric.snow, josh.r, methane, pablogsal, raphaelm, rhettinger, serhiy.storchaka, vstinner
Priority: normal Keywords: patch

Created on 2018-12-11 10:54 by serhiy.storchaka, last changed 2020-01-07 06:42 by raphaelm. This issue is now closed.

Files
File name Uploaded Description Edit
pydict-getitem-compare.txt serhiy.storchaka, 2019-02-14 16:37
Pull Requests
URL Status Linked Edit
PR 11112 merged serhiy.storchaka, 2018-12-11 10:56
PR 12706 merged serhiy.storchaka, 2019-04-06 08:45
Messages (17)
msg331604 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-12-11 10:54
There is an issue with using PyDict_GetItem(). Since it silences all exceptions, it can return incorrect result when an exception like MemoryError or KeyboardInterrupt was raised in the user __hash__() and __eq__(). In addition PyDict_GetItemString() and _PyDict_GetItemId() swallow a MemoryError raised when fail to allocate a temporary string object.

In addition, PyDict_GetItemWithError() is a tiny bit faster than PyDict_GetItem(), because it avoids checking the exception state in successful case.

The proposed PR replaces most calls of PyDict_GetItem(), PyDict_GetItemString() and _PyDict_GetItemId() with calls of PyDict_GetItemWithError(), _PyDict_GetItemStringWithError() and _PyDict_GetItemIdWithError().
msg331605 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-12-11 10:56
My previous attempt: bpo-20615.
msg331621 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-12-11 13:31
Opened issue35461 for documenting flaws of PyDict_GetItem().
msg331744 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-12-13 07:13
Most of changes are straightforward. Just replaced PyDict_GetItem*() with PyDict_GetItem*WithError() and added the check for PyErr_Occurred(). PyDict_GetItemString() with constant argument was replaced with _PyDict_GetItemIdWithError() for performance.

Some code was left unchanged. This was mostly in files where errors are very and error checking is not performed or errors are silenced in any case (Python/compile.c, Python/symtable.c, Objects/structseq.c, etc). These cases needed separate issues.

The most non-trivial change is in Objects/typeobject.c. The check for duplicated descriptors (in add_methods(), add_members() and add_getset()) was moved after creating the descriptor object. This improves performance by avoiding to create a temporary string objects. Duplicate descriptor names is a very uncommon case -- there were only two cases in the stdlib (besides tests), and one of them already is fixed (PR 11053).
msg335541 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2019-02-14 16:37
Eric requested to run the benchmark suite. Here are results. I do not know how to interpret them. Likely all differences are random.
msg335651 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2019-02-15 22:20
Thanks, Serhiy.  While the benchmark suite is our best tool available for measuring performance, I'm not sure what slowdown is significant in those results.

@Victor, any thoughts?
msg336536 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2019-02-25 15:59
New changeset a24107b04c1277e3c1105f98aff5bfa3a98b33a0 by Serhiy Storchaka in branch 'master':
bpo-35459: Use PyDict_GetItemWithError() instead of PyDict_GetItem(). (GH-11112)
https://github.com/python/cpython/commit/a24107b04c1277e3c1105f98aff5bfa3a98b33a0
msg336563 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-02-25 21:47
It seems like the change introduced a regression: bpo-36110.
msg336647 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-02-26 11:21
> It seems like the change introduced a regression: bpo-36110.

While the test has been fixed, IMHO we need to better document this subtle behavior change since it can be surprising.

The test failed because PyObject_GetAttr() no longer ignores exceptions when getting the attribute from the type dictionary. It's a significant change compared to Python 3.7.

Right now, the change has not even a NEWS entry, whereas it's a backward incompatible change!

Don't get my wrong: the change is correct, we don't want to ignore arbitrary exceptions, it's just a matter of documentation.
msg336686 - (view) Author: Josh Rosenberg (josh.r) * (Python triager) Date: 2019-02-26 15:51
#36110 was closed as a duplicate; the superseder is #36109 (which has been fixed). The change should still be documented, just in case anyone gets bitten by it.
msg339490 - (view) Author: Inada Naoki (methane) * (Python committer) Date: 2019-04-05 10:13
Serhiy, can this issue be closed?
msg339616 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2019-04-08 11:34
New changeset 7a0630c530121725136526a88c49589b54da6492 by Serhiy Storchaka in branch 'master':
Add a What's New entry for bpo-35459. (GH-12706)
https://github.com/python/cpython/commit/7a0630c530121725136526a88c49589b54da6492
msg339830 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2019-04-10 07:31
There are few occurrences of PyDict_GetItem(), PyDict_GetItemString() and _PyDict_GetItemId() in cases where they are unlikely failed. These cases will be considered in separate issues.
msg355667 - (view) Author: Raphaël M (raphaelm) Date: 2019-10-29 18:39
I stumbled upon this issue while reading Python 3.8 and this made me curious.

I've tried writing some Python code to reproduce this bug, but I'm unable to -- I should be missing something. Is there a simple snippet showing the issue? 


Also, the changelog states that "The CPython interpreter can swallow exceptions in some circumstances". Are there other documented cases of those circumstances?
msg359480 - (view) Author: Raphaël M (raphaelm) Date: 2020-01-06 22:58
Any pointer would also be welcome, and a piece of code showing the bug would help me a lot. Thank you.
msg359489 - (view) Author: Inada Naoki (methane) * (Python committer) Date: 2020-01-07 05:38
> I've tried writing some Python code to reproduce this bug, but I'm unable to -- I should be missing something. Is there a simple snippet showing the issue? 


Note that this is a bug from long ago.  Why this bug had lived long is it can not happen in regular cases.  So it is difficult to reproduce.

See PR 11112.  _csv module is changed to use PyDict_GetItemWithError.
Let's try it on Python 3.7.

Python 3.7.6 (default, Dec 30 2019, 19:38:28)
[Clang 11.0.0 (clang-1100.0.33.16)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> class S(str):
...   def __hash__(self):
...     raise MemoryError
...
>>> import _csv
>>> _csv.Dialect(S("excel"))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
_csv.Error: unknown dialect

You can see the MemoryError is suppressed.  Let's try it on Python 3.8.

$ python3
Python 3.8.1 (default, Jan  6 2020, 16:02:33)
 (snip)
>>> _csv.Dialect(S("excel"))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in __hash__
MemoryError

You can see the MemoryError is not suppressed.
msg359490 - (view) Author: Raphaël M (raphaelm) Date: 2020-01-07 06:42
Thank you very much!
History
Date User Action Args
2020-01-07 06:42:07raphaelmsetmessages: + msg359490
2020-01-07 05:38:42methanesetmessages: + msg359489
2020-01-06 22:58:38raphaelmsetmessages: + msg359480
2019-10-29 18:39:15raphaelmsetnosy: + raphaelm
messages: + msg355667
2019-04-10 07:31:08serhiy.storchakasetstatus: open -> closed
resolution: fixed
messages: + msg339830

stage: patch review -> resolved
2019-04-08 11:34:08serhiy.storchakasetmessages: + msg339616
2019-04-06 08:45:28serhiy.storchakasetstage: resolved -> patch review
pull_requests: + pull_request12630
2019-04-05 10:13:10methanesetnosy: + methane
messages: + msg339490
2019-02-26 15:51:26josh.rsetnosy: + josh.r
messages: + msg336686
2019-02-26 11:21:07vstinnersetstatus: closed -> open

nosy: + pablogsal
messages: + msg336647

resolution: fixed -> (no value)
2019-02-25 21:47:47vstinnersetmessages: + msg336563
2019-02-25 16:19:28serhiy.storchakasetstatus: open -> closed
resolution: fixed
stage: patch review -> resolved
2019-02-25 15:59:49serhiy.storchakasetmessages: + msg336536
2019-02-15 22:20:57eric.snowsetmessages: + msg335651
2019-02-14 16:37:04serhiy.storchakasetfiles: + pydict-getitem-compare.txt

messages: + msg335541
2018-12-13 07:39:36serhiy.storchakasetnosy: + rhettinger, eric.snow
2018-12-13 07:13:15serhiy.storchakasetmessages: + msg331744
2018-12-11 13:31:26serhiy.storchakasetmessages: + msg331621
2018-12-11 10:56:49serhiy.storchakasetkeywords: + patch
stage: patch review
pull_requests: + pull_request10341
2018-12-11 10:56:41vstinnersetmessages: + msg331605
2018-12-11 10:55:38serhiy.storchakasettitle: Use PyDict_GetItemWithError() with PyDict_GetItem() -> Use PyDict_GetItemWithError() instead of PyDict_GetItem()
2018-12-11 10:54:17serhiy.storchakacreate