classification
Title: Enum creation non-linear in the number of values
Type: performance Stage: backport needed
Components: Versions: Python 3.11, Python 3.10
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: ethan.furman Nosy List: Carl.Friedrich.Bolz, arigo, barry, eli.bendersky, ethan.furman, mattip, olliemath
Priority: normal Keywords: patch

Created on 2021-10-08 23:52 by olliemath, last changed 2021-10-15 16:07 by ethan.furman.

Files
File name Uploaded Description Edit
huge.py olliemath, 2021-10-08 23:52 A single very large enum
quadratic-fix-2.diff mattip, 2021-10-10 14:56
Pull Requests
URL Status Linked Edit
PR 28907 merged Carl.Friedrich.Bolz, 2021-10-12 18:47
Messages (5)
msg403512 - (view) Author: Oliver Margetts (olliemath) Date: 2021-10-08 23:52
Creating large enums takes a significant amount of time. Moreover this appears to be nonlinear in the number of entries in the enum. Locally, importing a single python file and taking this to the extreme:

 1000 entries - 0.058s
10000 entries - 4.327s

This is partially addressed by https://bugs.python.org/issue38659 and I can confirm that using `@_simple_enum` does not have this problem. But it seems like that API is only intended for internal use and the 'happy path' for user-defined enums is still not good.

Note that it is not simply parsing the file / creating the instances, it is to do with the cardinality. Creating 100 enums with 100 entries each is far faster than a single 10000 entry enum.
msg403585 - (view) Author: Armin Rigo (arigo) * (Python committer) Date: 2021-10-10 13:12
The timing is clearly quadratic:

number of attributes     time
1500                     0.24s
3000                     0.94s
6000                     3.74s
12000                    15.57s

Pressing Ctrl-C in the middle of the execution of the largest examples points directly to the cause: when we consider the next attribute, we loop over all previous ones at enum.py:238.
msg403592 - (view) Author: mattip (mattip) * Date: 2021-10-10 14:56
Over at PyPy arigo authored and we applied this fix to the as-yet unreleased pypy3.8. Note that it cannot be applied directly to CPython as sets are not ordered. Does the fix break anything? Tests in Lib/test/test_enum.py passed after applying it.
msg403751 - (view) Author: Carl Friedrich Bolz-Tereick (Carl.Friedrich.Bolz) * Date: 2021-10-12 18:48
I fixed the reliance of set being insertion ordered in pypy and opened a pull request.
msg403947 - (view) Author: Ethan Furman (ethan.furman) * (Python committer) Date: 2021-10-14 20:59
New changeset b2af211e229df941d9b404f69547a264115156b5 by Carl Friedrich Bolz-Tereick in branch 'main':
bpo-45417: [Enum] fix quadratic behavior during creation (GH-28907)
https://github.com/python/cpython/commit/b2af211e229df941d9b404f69547a264115156b5
History
Date User Action Args
2021-10-15 16:07:46ethan.furmansetassignee: ethan.furman
stage: patch review -> backport needed
versions: - Python 3.9
2021-10-14 20:59:55ethan.furmansetmessages: + msg403947
2021-10-12 18:48:35Carl.Friedrich.Bolzsetmessages: + msg403751
2021-10-12 18:47:05Carl.Friedrich.Bolzsetnosy: + Carl.Friedrich.Bolz

pull_requests: + pull_request27198
stage: patch review
2021-10-10 15:57:11kjsetnosy: + barry, eli.bendersky, ethan.furman

versions: + Python 3.9, - Python 3.7
2021-10-10 14:56:55mattipsetfiles: + quadratic-fix-2.diff

nosy: + mattip
messages: + msg403592

keywords: + patch
2021-10-10 13:12:47arigosetnosy: + arigo
messages: + msg403585
2021-10-08 23:52:33olliemathcreate