
classification
Title: Python 3.9 regression: Literal dict with > 65535 items are one item shorter
Type: behavior
Stage: resolved
Components: Interpreter Core
Versions: Python 3.10, Python 3.9

process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: Mark.Shannon, batuhanosmantaskaya, eric.smith, hroncok, lukasz.langa, miss-islington, mrabarnett, pablogsal, serhiy.storchaka, xtreak, zbysz
Priority: release blocker
Keywords: 3.9regression, patch

Created on 2020-08-12 15:42 by hroncok, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Pull Requests
URL Status Linked
PR 21850 merged pablogsal, 2020-08-12 18:33
PR 21853 closed miss-islington, 2020-08-13 08:49
PR 22105 closed miss-islington, 2020-09-04 22:34
PR 22107 merged miss-islington, 2020-09-04 22:38
Messages (9)
msg375255 - Author: Miro Hrončok (hroncok) Date: 2020-08-12 15:42
Consider this reproducer.py:

import sys
LEN = int(sys.argv[1])

with open('big_dict.py', 'w') as f:
    print('INTS = {', file=f)
    for i in range(LEN):
        print(f'    {i}: None,', file=f)
    print('}', file=f)


import big_dict
assert len(big_dict.INTS) == LEN, len(big_dict.INTS)



And run it with any number > 65535:

$ python3.9 reproducer.py 65536
Traceback (most recent call last):
  File "/tmp/reproducer.py", line 12, in <module>
    assert len(big_dict.INTS) == LEN, len(big_dict.INTS)
AssertionError: 65535


This does not happen on Python 3.8. The failure also occurs with PYTHONOLDPARSER=1, so it is not specific to the new parser.
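
For readers who would rather not write big_dict.py to disk, the same check can be sketched in memory. This is a minimal variant of the reproducer above, not part of the original report: the helper names build_dict_literal and literal_length are hypothetical, and it assumes compile() goes through the same dict-literal code path as importing a module.

```python
def build_dict_literal(n):
    """Return source text for a dict literal with keys 0..n-1, all mapped to None."""
    lines = ["INTS = {"]
    lines.extend(f"    {i}: None," for i in range(n))
    lines.append("}")
    return "\n".join(lines)


def literal_length(n):
    """Compile and execute the generated literal, then report the dict's length."""
    namespace = {}
    code = compile(build_dict_literal(n), "<big_dict>", "exec")
    exec(code, namespace)
    return len(namespace["INTS"])


# On an unaffected interpreter the two numbers match; on an affected
# build the second number should be one less than the first.
requested = 65536
print(requested, literal_length(requested))
```

On an interpreter with the regression, the second number printed should be one less than the first; on 3.8 or a fixed build they should match.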
msg375256 - Author: Zbyszek Jędrzejewski-Szmek (zbysz) Date: 2020-08-12 15:45
Also reproduces with today's git.
msg375257 - Author: Miro Hrončok (hroncok) Date: 2020-08-12 15:50
It appears that the 65535 key is missing regardless of the LEN value.
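
One way to confirm which key is dropped is to compare the loaded dict against the expected key range. The helper below is only an illustrative sketch: missing_keys is a hypothetical name, and sample is a hand-built stand-in for an affected big_dict.INTS rather than data from a real run.

```python
def missing_keys(mapping, expected_len):
    """Return the sorted integer keys in range(expected_len) absent from mapping."""
    return sorted(set(range(expected_len)) - set(mapping))


# Hypothetical stand-in for an affected big_dict.INTS: every key except 65535.
sample = {i: None for i in range(70000) if i != 65535}
print(missing_keys(sample, 70000))  # prints [65535]
```

With the real reproducer, the call would be missing_keys(big_dict.INTS, LEN).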
msg375259 - Author: Zbyszek Jędrzejewski-Szmek (zbysz) Date: 2020-08-12 16:08
Bisect says 8a4cd700a7426341c2074a2b580306d2d60ec839 is the first bad commit. Considering that 0xFFFF appears a few times in that patch, that seems plausible ;)
msg375275 - Author: Matthew Barnett (mrabarnett) (Python triager) Date: 2020-08-12 18:07
I think what's happening is that 'compiler_dict' (Python/compile.c) checks whether 'elements' has reached a maximum (0xFFFF) before incrementing rather than after, and then resets 'elements' to 0 when it should reset it to 1. As a result, the element being processed when the limit is reached (index 65535) isn't counted.
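
The off-by-one described in this message can be modelled in a few lines of Python. This is a simplified sketch of the counting logic only, under the assumption that items are flushed in index ranges once 'elements' hits the limit; it is not the C code from Python/compile.c, and simulate_chunks, covered, and reset_to are hypothetical names. Resetting to 1 follows the suggestion in this message and is not necessarily how the merged patch fixes it.

```python
LIMIT = 0xFFFF  # the maximum mentioned in the message above


def simulate_chunks(n, reset_to, limit=LIMIT):
    """Return the (start, stop) index ranges flushed for a literal of n items.

    reset_to=0 models the described bug; reset_to=1 models the suggested fix.
    """
    flushed = []
    elements = 0
    for i in range(n):
        if elements == limit:
            # Flush the items counted so far; item i itself is not in this range.
            flushed.append((i - elements, i))
            # With reset_to=0, item i is never counted and is lost.
            # With reset_to=1, item i becomes the first item of the next chunk.
            elements = reset_to
        else:
            elements += 1
    if elements:
        # Flush whatever remains after the loop.
        flushed.append((n - elements, n))
    return flushed


def covered(ranges):
    """Return the set of item indices included in the flushed ranges."""
    return {i for start, stop in ranges for i in range(start, stop)}


n = 65540
expected = set(range(n))
print(sorted(expected - covered(simulate_chunks(n, reset_to=0))))  # prints [65535]
print(sorted(expected - covered(simulate_chunks(n, reset_to=1))))  # prints []
```

In this model the lost index is 65535 for any n above the limit, which is consistent with the observation that the 65535 key is missing regardless of LEN.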
msg375295 - Author: Mark Shannon (Mark.Shannon) (Python committer) Date: 2020-08-13 08:48
New changeset c51db0ea40ddabaf5f771ea633b37fcf4c90a495 by Pablo Galindo in branch 'master':
bpo-41531: Fix compilation of dict literals with more than 0xFFFF elements (GH-21850)
https://github.com/python/cpython/commit/c51db0ea40ddabaf5f771ea633b37fcf4c90a495
msg375296 - Author: Mark Shannon (Mark.Shannon) (Python committer) Date: 2020-08-13 09:00
@hroncok,

How did you discover this issue?

I'd like to clean up the code for creating dictionary literals and it might be helpful to know where such huge dictionary literals exist.
I'm guessing that they are used as lookup tables for things like Unicode code-point tables, and that they would only include constants.
msg375297 - Author: Karthikeyan Singaravelan (xtreak) (Python committer) Date: 2020-08-13 09:03
@hroncok said on Twitter it was reported at https://github.com/Storyyeller/enjarify/issues/17
msg376417 - Author: Pablo Galindo Salgado (pablogsal) (Python committer) Date: 2020-09-04 23:38
New changeset d64d78be20ced6ac9de58e91e69eaba184e36e9b by Miss Islington (bot) in branch '3.9':
bpo-41531: Fix compilation of dict literals with more than 0xFFFF elements (GH-21850) (GH-22107)
https://github.com/python/cpython/commit/d64d78be20ced6ac9de58e91e69eaba184e36e9b
History
Date | User | Action | Args
2022-04-11 14:59:34 | admin | set | nosy: + lukasz.langa; github: 85703
2020-09-04 23:39:20 | pablogsal | set | status: open -> closed; resolution: fixed; stage: patch review -> resolved
2020-09-04 23:38:58 | pablogsal | set | messages: + msg376417
2020-09-04 22:38:51 | miss-islington | set | pull_requests: + pull_request21192
2020-09-04 22:34:39 | miss-islington | set | pull_requests: + pull_request21190
2020-08-13 09:03:38 | xtreak | set | messages: + msg375297
2020-08-13 09:00:13 | Mark.Shannon | set | messages: + msg375296
2020-08-13 08:49:09 | miss-islington | set | nosy: + miss-islington; pull_requests: + pull_request20980
2020-08-13 08:48:53 | Mark.Shannon | set | messages: + msg375295
2020-08-12 18:34:18 | batuhanosmantaskaya | set | nosy: + batuhanosmantaskaya
2020-08-12 18:33:26 | pablogsal | set | keywords: + patch; nosy: + pablogsal; pull_requests: + pull_request20978; stage: patch review
2020-08-12 18:07:00 | mrabarnett | set | nosy: + mrabarnett; messages: + msg375275
2020-08-12 17:22:00 | serhiy.storchaka | set | nosy: + serhiy.storchaka; components: + Interpreter Core
2020-08-12 16:52:27 | eric.smith | set | nosy: + eric.smith
2020-08-12 16:22:24 | vstinner | set | keywords: + 3.9regression
2020-08-12 16:22:18 | vstinner | set | priority: normal -> release blocker
2020-08-12 16:17:22 | hroncok | set | versions: + Python 3.10
2020-08-12 16:09:09 | ammar2 | set | nosy: + Mark.Shannon
2020-08-12 16:08:28 | zbysz | set | messages: + msg375259
2020-08-12 15:56:21 | xtreak | set | nosy: + xtreak
2020-08-12 15:50:19 | hroncok | set | messages: + msg375257
2020-08-12 15:45:10 | zbysz | set | nosy: + zbysz; messages: + msg375256
2020-08-12 15:42:56 | hroncok | create |