classification
Title: enum classes cause slow startup time
Type: performance Stage: resolved
Components: Library (Lib) Versions: Python 3.10
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: ethan.furman Nosy List: ammar2, arigo, ethan.furman, gregory.p.smith, pablogsal, vstinner
Priority: release blocker Keywords: patch

Created on 2019-11-01 01:12 by gregory.p.smith, last changed 2021-10-12 19:31 by ethan.furman. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 25285 merged ethan.furman, 2021-04-08 19:44
PR 25476 merged ethan.furman, 2021-04-20 01:07
PR 25497 merged ethan.furman, 2021-04-21 13:59
PR 25516 merged ammar2, 2021-04-22 03:04
PR 25560 closed pablogsal, 2021-04-23 17:07
PR 25566 merged ethan.furman, 2021-04-23 22:08
Messages (23)
msg355777 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2019-11-01 01:12
Creating an enum subclass (ie: defining an enum) is slow.  This dramatically impacts startup time of Python programs that import a bunch of potentially needed constant definitions at startup before any proper code executes.

How slow?  So slow that a module defining a ~300 enums takes nearly 100ms just to import from its pyc file.

Example code: https://github.com/googleads/google-ads-python/blob/96fd08bb62435f1930df4871033ba8689333b67f/google/ads/google_ads/v2/services/enums.py

We've known this, we should do something about it.  (Even if it means implementing the guts of the magic enum machinery in C.)  ie, it came up in https://bugs.python.org/issue28637 as a stdlib startup time regression and is likely to come up in similar contexts elsewhere.
msg355815 - (view) Author: Ethan Furman (ethan.furman) * (Python committer) Date: 2019-11-01 13:22
I was just looking at this problem, and creating a bare-bones, no safety belts version for use in the stdlib (no patch yet) which decreases Enum creation from 14x slower to only 6x slower.  (Comparing to a class with simple attributes.)

Not sure if that's enough improvement, though.

If it needs to be even faster, a C version of that simplified Enum shouldn't be too hard.  Anyone that uses the _simple_enum, though, should have a test that uses the full Enum and compares the two to make sure nothing got lost in translation.
msg391420 - (view) Author: Ethan Furman (ethan.furman) * (Python committer) Date: 2021-04-20 01:05
New changeset dbac8f40e81eb0a29dc833e6409a1abf47467da6 by Ethan Furman in branch 'master':
bpo-38659: [Enum] add _simple_enum decorator (GH-25285)
https://github.com/python/cpython/commit/dbac8f40e81eb0a29dc833e6409a1abf47467da6
msg391421 - (view) Author: Ethan Furman (ethan.furman) * (Python committer) Date: 2021-04-20 02:12
New changeset 503cdc7c124cebbd777008bdf7bd9aa666b25f07 by Ethan Furman in branch 'master':
Revert "bpo-38659: [Enum] add _simple_enum decorator (GH-25285)" (GH-25476)
https://github.com/python/cpython/commit/503cdc7c124cebbd777008bdf7bd9aa666b25f07
msg391532 - (view) Author: Ethan Furman (ethan.furman) * (Python committer) Date: 2021-04-21 17:20
New changeset a02cb474f9c097c83cd444a47e9fb5f99b4aaf45 by Ethan Furman in branch 'master':
bpo-38659: [Enum] add _simple_enum decorator (GH-25497)
https://github.com/python/cpython/commit/a02cb474f9c097c83cd444a47e9fb5f99b4aaf45
msg391565 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-04-22 00:08
Commit a02cb474f9c097c83cd444a47e9fb5f99b4aaf45 fails to build in the refleak buildbots:

https://buildbot.python.org/all/#/builders/75/builds/2/steps/5/logs/stdio

Example failure:

❯ ./python  -m test test_enum -R :
0:00:00 load avg: 1.81 Run tests sequentially
0:00:00 load avg: 1.81 [1/1] test_enum
beginning 9 repetitions
123456789
.test test_enum failed -- Traceback (most recent call last):
  File "/home/pablogsal/github/python/master/Lib/test/test_enum.py", line 3700, in test_convert_repr_and_str
    self.assertEqual(format(test_type.CONVERT_STRING_TEST_NAME_A), '5')
AssertionError: 'CONVERT_STRING_TEST_NAME_A' != '5'
- CONVERT_STRING_TEST_NAME_A
+ 5


test_enum failed

== Tests result: FAILURE ==

1 test failed:
    test_enum

Total duration: 586 ms
Tests result: FAILURE
msg391567 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-04-22 00:11
Can someone take a look?

As per the buildbot policy (https://discuss.python.org/t/policy-to-revert-commits-on-buildbot-failure/404) we may need to revert it if is not fixed in 24 because of the risk of masking errors.
msg391568 - (view) Author: Ammar Askar (ammar2) * (Python committer) Date: 2021-04-22 02:44
Looks like this is the issue described in the comment here: https://github.com/python/cpython/blob/master/Lib/test/test_enum.py#L3691-L3692

On the first run you have the correct

  ('CONVERT_STRING_TEST_NAME_A', 5)

but later it turns into

  ('CONVERT_STRING_TEST_NAME_A', test.test_enum.CONVERT_STRING_TEST_NAME_A)

causing double-conversions of the enum elements. This causes the format(x) test to fail. You can re-create the same issue outside of the refleak by adding a simple:

    def test_convert_repr_and_str_again(self):
        self.test_convert_repr_and_str()

method.
msg391569 - (view) Author: Ethan Furman (ethan.furman) * (Python committer) Date: 2021-04-22 03:18
`format` issue resolved, but not the refleak issue.
msg391570 - (view) Author: Ethan Furman (ethan.furman) * (Python committer) Date: 2021-04-22 03:23
New changeset 37b173c5231b7b202859af97c17bc3d04b1e4c75 by Ammar Askar in branch 'master':
bpo-38659: Properly re-intialize module variables in test_enum (GH-25516)
https://github.com/python/cpython/commit/37b173c5231b7b202859af97c17bc3d04b1e4c75
msg391571 - (view) Author: Ethan Furman (ethan.furman) * (Python committer) Date: 2021-04-22 04:06
Actually, I think that fixed the refleak issue as well.

Thanks, Ammar!
msg391604 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-04-22 15:30
def setUp(self):
        # Reset the module-level test variables to their original integer
        # values, otherwise the already created enum values get converted
        # instead.

Why not doing that in a tearDown() method instead? What if you run explicitly a single test method?
msg391621 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-04-22 18:44
Thanks Ammar for the fix. Unfortunately there is still some failures related to this on test_socker:

======================================================================
ERROR: test_msgflag_enum (test.test_socket.GeneralModuleTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/buildbot/buildarea/3.x.cstratak-RHEL7-ppc64le.refleak/build/Lib/test/test_socket.py", line 1969, in test_msgflag_enum
    enum._test_simple_enum(CheckedMsgFlag, socket.MsgFlag)
  File "/home/buildbot/buildarea/3.x.cstratak-RHEL7-ppc64le.refleak/build/Lib/enum.py", line 1664, in _test_simple_enum
    raise TypeError('enum mismatch:\n   %s' % '\n   '.join(failed))
TypeError: enum mismatch:
   'MSG_TRUNC' member mismatch:
      extra key '_inverted_' in simple enum member 'MSG_TRUNC'
   'MSG_CTRUNC' member mismatch:
      extra key '_inverted_' in simple enum member 'MSG_CTRUNC'
----------------------------------------------------------------------
msg391678 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-04-23 10:57
Gentle ping
msg391679 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-04-23 11:13
Commenting out the enum tests reveals that test_socket has additional problems:

𓋹 ./python.exe -m test test_socket  -R 3:3
0:00:00 load avg: 2.32 Run tests sequentially
0:00:00 load avg: 2.32 [1/1] test_socket
beginning 6 repetitions
123456
/Users/pgalindo3/github/python/master/Lib/test/test_socket.py:2701: RuntimeWarning: received malformed or improperly-truncated ancillary data
  result = sock.recvmsg(bufsize, *args)
/Users/pgalindo3/github/python/master/Lib/test/test_socket.py:2792: RuntimeWarning: received malformed or improperly-truncated ancillary data
  result = sock.recvmsg_into([buf], *args)
./Users/pgalindo3/github/python/master/Lib/test/test_socket.py:2701: RuntimeWarning: received malformed or improperly-truncated ancillary data
  result = sock.recvmsg(bufsize, *args)
/Users/pgalindo3/github/python/master/Lib/test/test_socket.py:2792: RuntimeWarning: received malformed or improperly-truncated ancillary data
  result = sock.recvmsg_into([buf], *args)
./Users/pgalindo3/github/python/master/Lib/test/test_socket.py:2701: RuntimeWarning: received malformed or improperly-truncated ancillary data
  result = sock.recvmsg(bufsize, *args)
/Users/pgalindo3/github/python/master/Lib/test/test_socket.py:2792: RuntimeWarning: received malformed or improperly-truncated ancillary data
  result = sock.recvmsg_into([buf], *args)
./Users/pgalindo3/github/python/master/Lib/test/test_socket.py:2701: RuntimeWarning: received malformed or improperly-truncated ancillary data
  result = sock.recvmsg(bufsize, *args)
/Users/pgalindo3/github/python/master/Lib/test/test_socket.py:2792: RuntimeWarning: received malformed or improperly-truncated ancillary data
  result = sock.recvmsg_into([buf], *args)
./Users/pgalindo3/github/python/master/Lib/test/test_socket.py:2701: RuntimeWarning: received malformed or improperly-truncated ancillary data
  result = sock.recvmsg(bufsize, *args)
/Users/pgalindo3/github/python/master/Lib/test/test_socket.py:2792: RuntimeWarning: received malformed or improperly-truncated ancillary data
  result = sock.recvmsg_into([buf], *args)
./Users/pgalindo3/github/python/master/Lib/test/test_socket.py:2701: RuntimeWarning: received malformed or improperly-truncated ancillary data
  result = sock.recvmsg(bufsize, *args)
/Users/pgalindo3/github/python/master/Lib/test/test_socket.py:2792: RuntimeWarning: received malformed or improperly-truncated ancillary data
  result = sock.recvmsg_into([buf], *args)
.
test_socket leaked [20, 20, 20] file descriptors, sum=60
test_socket failed in 2 min 25 sec

== Tests result: FAILURE ==

1 test failed:
    test_socket

Total duration: 2 min 25 sec
Tests result: FAILURE
msg391711 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-04-23 17:08
Unfortunately, I am being forced to revert commit a02cb474f9c097c83cd444a47e9fb5f99b4aaf45 due to have it failing of all refleak buildbots for more than two days.
msg391713 - (view) Author: Ethan Furman (ethan.furman) * (Python committer) Date: 2021-04-23 17:24
My apologies, I was having hardware issues.

Checking it out now.
msg391716 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-04-23 17:57
Thanks a lot Ethan. I will wait then for the investigation.
msg391753 - (view) Author: Ethan Furman (ethan.furman) * (Python committer) Date: 2021-04-24 02:08
New changeset 6c681e1a4aa2dbca61be9a26c9257d7d25fa29a7 by Ethan Furman in branch 'master':
bpo-38659: [Enum] do not check '_inverted_' during simple test (GH-25566)
https://github.com/python/cpython/commit/6c681e1a4aa2dbca61be9a26c9257d7d25fa29a7
msg391787 - (view) Author: Ethan Furman (ethan.furman) * (Python committer) Date: 2021-04-24 18:02
Pablo, did my latest patch resolved the errors?
msg391807 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-04-24 21:43
Seems that the buildbots are going back to green so I will close the revert PR. THanks a lot, Ethan for the fix and the investigation!
msg403517 - (view) Author: Armin Rigo (arigo) * (Python committer) Date: 2021-10-09 05:52
Nobody seemed to mention it so I might as well: defining a regular Enum class takes an amount of time that is clearly quadratic in the number of attributes.  That means that the problem is not Python-versus-C or small speed-ups or adding secret APIs to do the simple case faster.  The problem is in the algorithm which needs to be fixed somewhere.  My timings:

number of attributes     time
1500                     0.24s
3000                     0.94s
6000                     3.74s
12000                    15.57s
msg403753 - (view) Author: Ethan Furman (ethan.furman) * (Python committer) Date: 2021-10-12 19:31
The reason for that quadratic behavior is that for each new member (aka attribute), all the previous members must be checked to see if the new member is a duplicate.

In practice I wouldn't expect this to be a problem as most enums should be fairly small -- are there any real-world examples where there are more than, say, a hundred?
History
Date User Action Args
2021-10-12 19:31:41ethan.furmansetmessages: + msg403753
2021-10-09 05:52:43arigosetnosy: + arigo
messages: + msg403517
2021-04-24 21:43:38pablogsalsetstatus: open -> closed
resolution: fixed
stage: patch review -> resolved
2021-04-24 21:43:26pablogsalsetmessages: + msg391807
2021-04-24 18:02:07ethan.furmansetmessages: + msg391787
2021-04-24 02:08:40ethan.furmansetmessages: + msg391753
2021-04-23 22:08:51ethan.furmansetpull_requests: + pull_request24285
2021-04-23 17:57:39pablogsalsetmessages: + msg391716
2021-04-23 17:24:54ethan.furmansetmessages: + msg391713
2021-04-23 17:08:53pablogsalsetmessages: + msg391711
2021-04-23 17:07:37pablogsalsetpull_requests: + pull_request24279
2021-04-23 11:13:58pablogsalsetmessages: + msg391679
2021-04-23 10:57:51pablogsalsetmessages: + msg391678
2021-04-22 18:44:11pablogsalsetmessages: + msg391621
2021-04-22 15:30:52vstinnersetnosy: + vstinner
messages: + msg391604
2021-04-22 04:06:17ethan.furmansetmessages: + msg391571
2021-04-22 03:23:27ethan.furmansetmessages: + msg391570
2021-04-22 03:19:00ethan.furmansetmessages: + msg391569
2021-04-22 03:04:17ammar2setstage: resolved -> patch review
pull_requests: + pull_request24234
2021-04-22 02:44:22ammar2setnosy: + ammar2
messages: + msg391568
2021-04-22 00:11:21pablogsalsetmessages: + msg391567
2021-04-22 00:10:07pablogsalsetpriority: normal -> release blocker
2021-04-22 00:08:46pablogsalsetstatus: closed -> open

nosy: + pablogsal
messages: + msg391565

resolution: fixed -> (no value)
2021-04-21 17:29:38ethan.furmansetstatus: open -> closed
resolution: fixed
stage: patch review -> resolved
2021-04-21 17:20:51ethan.furmansetmessages: + msg391532
2021-04-21 13:59:36ethan.furmansetpull_requests: + pull_request24219
2021-04-21 13:56:56ethan.furmansetversions: + Python 3.10, - Python 3.9
2021-04-20 02:12:33ethan.furmansetmessages: + msg391421
2021-04-20 01:07:38ethan.furmansetpull_requests: + pull_request24202
2021-04-20 01:05:06ethan.furmansetmessages: + msg391420
2021-04-08 19:44:56ethan.furmansetkeywords: + patch
stage: needs patch -> patch review
pull_requests: + pull_request24021
2019-11-01 13:22:54ethan.furmansetassignee: ethan.furman
messages: + msg355815
2019-11-01 01:58:49xtreaksetnosy: + ethan.furman
2019-11-01 01:12:48gregory.p.smithcreate