enum classes cause slow startup time #82840
Comments
Creating an enum subclass (i.e., defining an enum) is slow. This dramatically impacts the startup time of Python programs that import a bunch of potentially needed constant definitions at startup, before any proper code executes. How slow? So slow that a module defining ~300 enums takes nearly 100 ms just to import from its pyc file. We've known this; we should do something about it (even if it means implementing the guts of the magic enum machinery in C). For example, it came up in https://bugs.python.org/issue28637 as a stdlib startup time regression and is likely to come up in similar contexts elsewhere. |
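A rough way to see the cost described above is to time the creation of many enum classes against the creation of plain classes holding the same constants. This is only a sketch: the helper names (`define_enums`, `define_plain`, `timed`) and the class and member counts are made up for illustration, and the absolute numbers depend heavily on the Python version and the machine.

```python
import time
from enum import IntEnum


def define_enums(class_count, member_count):
    """Create `class_count` IntEnum classes with `member_count` members each."""
    return [
        IntEnum(f"Enum{i}", {f"M{j}": j for j in range(member_count)})
        for i in range(class_count)
    ]


def define_plain(class_count, member_count):
    """Create plain classes that hold the same constants as simple attributes."""
    return [
        type(f"Plain{i}", (), {f"M{j}": j for j in range(member_count)})
        for i in range(class_count)
    ]


def timed(func, *args):
    """Return the wall-clock seconds taken by one call to func(*args)."""
    start = time.perf_counter()
    func(*args)
    return time.perf_counter() - start


if __name__ == "__main__":
    # 300 classes mirrors the "~300 enums" figure; 5 members is an assumption.
    enum_seconds = timed(define_enums, 300, 5)
    plain_seconds = timed(define_plain, 300, 5)
    print(f"enum classes:  {enum_seconds * 1000:.1f} ms")
    print(f"plain classes: {plain_seconds * 1000:.1f} ms")
```

The functional API is used here only to generate classes in a loop; classes written with the `class` statement go through the same metaclass machinery.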
I was just looking at this problem and creating a bare-bones, no-safety-belts version for use in the stdlib (no patch yet), which decreases Enum creation from 14x slower to only 6x slower (compared to a class with simple attributes). Not sure if that's enough improvement, though. If it needs to be even faster, a C version of that simplified Enum shouldn't be too hard. Anyone who uses the _simple_enum, though, should have a test that uses the full Enum and compares the two to make sure nothing got lost in translation. |
Commit a02cb47 fails to build in the refleak buildbots: https://buildbot.python.org/all/#/builders/75/builds/2/steps/5/logs/stdio
Example failure:
❯ ./python -m test test_enum -R :
0:00:00 load avg: 1.81 Run tests sequentially
0:00:00 load avg: 1.81 [1/1] test_enum
beginning 9 repetitions
123456789
.test test_enum failed -- Traceback (most recent call last):
File "/home/pablogsal/github/python/master/Lib/test/test_enum.py", line 3700, in test_convert_repr_and_str
self.assertEqual(format(test_type.CONVERT_STRING_TEST_NAME_A), '5')
AssertionError: 'CONVERT_STRING_TEST_NAME_A' != '5'
- CONVERT_STRING_TEST_NAME_A
+ 5
test_enum failed
== Tests result: FAILURE ==
1 test failed:
Total duration: 586 ms |
Can someone take a look? As per the buildbot policy (https://discuss.python.org/t/policy-to-revert-commits-on-buildbot-failure/404), we may need to revert it if it is not fixed within 24 hours, because of the risk of masking errors. |
Looks like this is the issue described in the comment here: https://github.com/python/cpython/blob/master/Lib/test/test_enum.py#L3691-L3692
On the first run you have the correct ('CONVERT_STRING_TEST_NAME_A', 5), but later it turns into ('CONVERT_STRING_TEST_NAME_A', test.test_enum.CONVERT_STRING_TEST_NAME_A), causing double conversions of the enum elements. This causes the format(x) test to fail. You can re-create the same issue outside of the refleak run by adding a simple method:
def test_convert_repr_and_str_again(self):
    self.test_convert_repr_and_str() |
Actually, I think that fixed the refleak issue as well. Thanks, Ammar! |
def setUp(self):
    # Reset the module-level test variables to their original integer
    # values, otherwise the already created enum values get converted
    # instead.
Why not do that in a tearDown() method instead? What if you explicitly run a single test method? |
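The state leak discussed above can be reproduced in miniature without the enum machinery. The sketch below is not the code from test_enum.py: `SHARED_CONSTANT`, `Wrapped`, and `convert()` are hypothetical stand-ins for a module-level integer and a conversion that rewrites it in place. Without the reset in setUp(), the second conversion would wrap an already wrapped value, which is the same shape as the double conversion reported in the failure.

```python
import unittest

ORIGINAL_VALUE = 5
SHARED_CONSTANT = ORIGINAL_VALUE  # module-level state that the test mutates


class Wrapped:
    """Hypothetical stand-in for an enum member created from a constant."""

    def __init__(self, value):
        self.value = value


def convert():
    """Replace the module-level constant with a wrapped version of itself."""
    global SHARED_CONSTANT
    SHARED_CONSTANT = Wrapped(SHARED_CONSTANT)
    return SHARED_CONSTANT


class ConvertTest(unittest.TestCase):
    def setUp(self):
        # Reset the module-level state before every test method so that a
        # previous conversion cannot leak into this one.
        global SHARED_CONSTANT
        SHARED_CONSTANT = ORIGINAL_VALUE

    def test_convert(self):
        result = convert()
        # Fails if SHARED_CONSTANT was already a Wrapped instance.
        self.assertEqual(result.value, 5)

    def test_convert_again(self):
        # Mirrors the "run it a second time" reproduction suggested above.
        self.test_convert()
```

In this sketch a reset in tearDown() (or via addCleanup()) would also keep both tests green. The difference is which side absorbs dirty state: setUp() tolerates whatever earlier code left behind, while tearDown() leaves the module clean for whatever runs next. The tests can be run with `python -m unittest` against the file that contains them.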
Thanks, Ammar, for the fix. Unfortunately, there are still some failures related to this in test_socket:
======================================================================
Traceback (most recent call last):
File "/home/buildbot/buildarea/3.x.cstratak-RHEL7-ppc64le.refleak/build/Lib/test/test_socket.py", line 1969, in test_msgflag_enum
enum._test_simple_enum(CheckedMsgFlag, socket.MsgFlag)
File "/home/buildbot/buildarea/3.x.cstratak-RHEL7-ppc64le.refleak/build/Lib/enum.py", line 1664, in _test_simple_enum
raise TypeError('enum mismatch:\n %s' % '\n '.join(failed))
TypeError: enum mismatch:
'MSG_TRUNC' member mismatch:
extra key '_inverted_' in simple enum member 'MSG_TRUNC'
'MSG_CTRUNC' member mismatch:
extra key '_inverted_' in simple enum member 'MSG_CTRUNC' |
Gentle ping |
Commenting out the enum tests reveals that test_socket has additional problems:
𓋹 ./python.exe -m test test_socket -R 3:3
== Tests result: FAILURE ==
1 test failed:
Total duration: 2 min 25 sec |
Unfortunately, I am being forced to revert commit a02cb47 because it has been failing on all refleak buildbots for more than two days. |
My apologies, I was having hardware issues. Checking it out now. |
Thanks a lot Ethan. I will wait then for the investigation. |
Pablo, did my latest patch resolve the errors? |
Seems that the buildbots are going back to green, so I will close the revert PR. Thanks a lot, Ethan, for the fix and the investigation! |
Nobody seemed to mention it so I might as well: defining a regular Enum class takes an amount of time that is clearly quadratic in the number of attributes. That means that the problem is not Python-versus-C or small speed-ups or adding secret APIs to do the simple case faster. The problem is in the algorithm which needs to be fixed somewhere. My timings: number of attributes time |
The reason for that quadratic behavior is that for each new member (aka attribute), all the previous members must be checked to see if the new member is a duplicate. In practice I wouldn't expect this to be a problem as most enums should be fairly small -- are there any real-world examples where there are more than, say, a hundred? |
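The quadratic growth follows directly from that description: if member k is compared against the k-1 members before it, defining n distinct members costs n(n-1)/2 comparisons. The sketch below is a simplified model for counting those comparisons, not the code in enum.py; both function names are made up for illustration, and the dictionary variant assumes hashable values, which a real Enum cannot always assume.

```python
def count_scan_comparisons(values):
    """Model the duplicate check as a linear scan over earlier members."""
    members = []
    comparisons = 0
    for value in values:
        for existing in members:
            comparisons += 1
            if existing == value:
                break  # value duplicates an earlier member (an alias)
        else:
            members.append(value)  # no duplicate found, so this is a new member
    return comparisons


def count_hash_lookups(values):
    """Model the same check with a dict lookup (hashable values only)."""
    seen = {}
    lookups = 0
    for value in values:
        lookups += 1
        if value not in seen:
            seen[value] = True
    return lookups


if __name__ == "__main__":
    # Doubling the member count roughly quadruples the scan comparisons,
    # while the number of dict lookups only doubles.
    for n in (100, 200, 400):
        print(n, count_scan_comparisons(range(n)), count_hash_lookups(range(n)))
```

For 100, 200, and 400 distinct values, the scan performs 4950, 19900, and 79800 comparisons, against 100, 200, and 400 lookups for the dictionary variant.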
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.