msg236234 - (view) |
Author: Craig Holmquist (craigh) |
Date: 2015-02-19 21:59 |
Running the attached test script:
$ time python test.py enum
real 0m6.546s
user 0m6.530s
sys 0m0.007s
$ time python test.py int
real 0m0.384s
user 0m0.377s
sys 0m0.000s
I encountered this with a script that yielded a sequence of objects (potentially a few hundred thousand of them) and categorized them with instances of an Enum subclass. The consumer of that iteration processes each object with a switch-case-like comparison of the category, checking it sequentially against each instance of the Enum. This seems like a fairly common use case.
From cProfile it looks like EnumMeta.__getattr__ and _is_dunder are the main bottlenecks:
[...]
7/1 0.000 0.000 0.000 0.000 abc.py:194(__subclasscheck__)
1 0.000 0.000 0.001 0.001 enum.py:1(<module>)
3 0.000 0.000 0.000 0.000 enum.py:132(<genexpr>)
2000021 0.988 0.000 0.988 0.000 enum.py:16(_is_dunder)
19 0.000 0.000 0.000 0.000 enum.py:24(_is_sunder)
2000002 1.825 0.000 2.813 0.000 enum.py:241(__getattr__)
17 0.000 0.000 0.000 0.000 enum.py:282(__setattr__)
3 0.000 0.000 0.000 0.000 enum.py:342(_get_mixins_)
3 0.000 0.000 0.000 0.000 enum.py:387(_find_new_)
[...]
|
msg236236 - (view) |
Author: Ethan Furman (ethan.furman) * |
Date: 2015-02-19 22:12 |
Craig Holmquist wrote:
---------------------
> The consumer of that iteration processes each object with a switch-case-like
> comparison of the category, checking it sequentially against each instance
> of the Enum.
So for every object you compare against every Enum member? Is there a reason you don't just use the lookup capability?
class Category(Enum):
tiny = 1
medium = 2
large = 3
cat = Category(obj.category) # assumes obj.category is 1, 2, or 3
|
msg236237 - (view) |
Author: Craig Holmquist (craigh) |
Date: 2015-02-19 22:19 |
I may not have been clear before. What I mean is, code like this:
for obj in get_objects():
if obj.category == Cat.cat1:
#do something
elif obj.category == Cat.cat2:
#do something else
elif obj.category == Cat.cat3 or obj.category == Cat.cat4:
#...
obj.category is already an instance of Cat, in other words. The consumer is using it to determine what to do with each obj.
|
msg236239 - (view) |
Author: Craig Holmquist (craigh) |
Date: 2015-02-19 22:35 |
It seems like performance is drastically improved by doing this:
class Category(Enum):
tiny = 1
medium = 2
large = 3
tiny = Category.tiny
medium = Category.medium
In other words, resolving Category.tiny and Category.medium is what's slow. The comparison itself is very fast.
|
msg236243 - (view) |
Author: Ethan Furman (ethan.furman) * |
Date: 2015-02-19 23:41 |
Yup, you have it figured out. It's the lookup that is the slowdown.
When performance is an issue one of the standard tricks is to create a local name, like you did with "tiny = Category.tiny".
For the curious (taken from the docstring for Enum.__getattr__):
We use __getattr__ instead of descriptors or inserting into the enum
class' __dict__ in order to support `name` and `value` being both
properties for enum members (which live in the class' __dict__) and
enum members themselves.
It is possible to store all the enum members /except/ for 'name' and 'value' in the class' __dict__, but I'm not sure it's worth the extra complication.
|
msg236258 - (view) |
Author: Ethan Furman (ethan.furman) * |
Date: 2015-02-20 09:29 |
Out of curiousity I tried: took two new lines, one modified line, and one comment. :)
|
msg236259 - (view) |
Author: Ethan Furman (ethan.furman) * |
Date: 2015-02-20 09:30 |
Oh, and the slowdown dropped from 20 to 3 (for non-DynamicClassAttributes -- which is probably more than 99% of them).
|
msg236260 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2015-02-20 09:39 |
I don't understand the patch, but 3x slower instead of 20x slower is a huge optimization :-) Do you plan to change Python 3.5 *and* Python 3.4?
|
msg236559 - (view) |
Author: Ethan Furman (ethan.furman) * |
Date: 2015-02-25 02:28 |
This isn't a change to the API or any visible user behavior (besides performance), so I don't see a reason to not add it to 3.4.
|
msg237816 - (view) |
Author: Ethan Furman (ethan.furman) * |
Date: 2015-03-10 22:06 |
Larry, I have a very small patch (~4 lines) that does change user behavior or the API, but does have a significant performance boost.
I'm still learning what is/is not okay to add to maintenance releases, so wanted to run this by you.
|
msg237844 - (view) |
Author: Larry Hastings (larry) * |
Date: 2015-03-11 05:35 |
My inclination is 3.5 only. Barry, do you want to argue for this going into 3.4?
|
msg237846 - (view) |
Author: Ethan Furman (ethan.furman) * |
Date: 2015-03-11 06:17 |
Argh, sorry -- that was supposed to be *does not* change user behavior nor the API, it's *just* a performance increase.
Does that change your inclination?
|
msg237847 - (view) |
Author: Larry Hastings (larry) * |
Date: 2015-03-11 07:30 |
Oh, I read the code. But it's a performance hack, and the rules say we only accept security fixes and bug fixes at this stage of the release, and they're the rules for good reasons.
|
msg237865 - (view) |
Author: Barry A. Warsaw (barry) * |
Date: 2015-03-11 13:46 |
Poor performance could fall under the category of bug fixes, so for an in-maintenance mode release, a fix that does not in any way change user visible behavior could be acceptable. It would probably be fine for 3.4 but I'm just +0 on it. Larry's call.
|
msg237866 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2015-03-11 14:04 |
This is not a regression (there were no enums before 3.4), slow down is not critical (only constant factor, not increased computational complexity), there is a workaround, and the code that just use constants that were converted to IntEnum is not affected. I'm -0 on it.
|
msg237867 - (view) |
Author: Ethan Furman (ethan.furman) * |
Date: 2015-03-11 14:27 |
In getting everything fixed up and tested I realized there was one slight user-facing change: with this patch it is now possible to say:
SomeEnum.SomeMember = SomeMember
In other words, it is possible to set a value on the class as long as it is the same value that already existed.
3.5 sounds good.
|
msg237877 - (view) |
Author: Roundup Robot (python-dev) |
Date: 2015-03-11 15:43 |
New changeset 2545bfe0d273 by Ethan Furman in branch 'default':
Close issue23486: performance boost for enum member lookup
https://hg.python.org/cpython/rev/2545bfe0d273
|
msg237878 - (view) |
Author: Ethan Furman (ethan.furman) * |
Date: 2015-03-11 15:45 |
Slight reordering of code removed the one user visible change.
|
|
Date |
User |
Action |
Args |
2022-04-11 14:58:12 | admin | set | github: 67674 |
2015-03-11 15:45:36 | ethan.furman | set | messages:
+ msg237878 |
2015-03-11 15:43:55 | python-dev | set | status: open -> closed
nosy:
+ python-dev messages:
+ msg237877
resolution: fixed stage: commit review -> resolved |
2015-03-11 14:27:05 | ethan.furman | set | messages:
+ msg237867 versions:
- Python 3.4 |
2015-03-11 14:04:02 | serhiy.storchaka | set | nosy:
+ serhiy.storchaka messages:
+ msg237866
|
2015-03-11 13:46:44 | barry | set | messages:
+ msg237865 |
2015-03-11 07:30:02 | larry | set | messages:
+ msg237847 |
2015-03-11 06:17:24 | ethan.furman | set | messages:
+ msg237846 |
2015-03-11 05:35:00 | larry | set | messages:
+ msg237844 |
2015-03-10 22:07:02 | ethan.furman | set | messages:
- msg237815 |
2015-03-10 22:06:10 | ethan.furman | set | nosy:
+ larry messages:
+ msg237816
|
2015-03-10 22:04:54 | ethan.furman | set | assignee: ethan.furman messages:
+ msg237815 stage: commit review |
2015-02-25 02:28:29 | ethan.furman | set | nosy:
+ barry, eli.bendersky messages:
+ msg236559
|
2015-02-20 09:39:08 | vstinner | set | nosy:
+ vstinner
messages:
+ msg236260 versions:
+ Python 3.5 |
2015-02-20 09:30:14 | ethan.furman | set | messages:
+ msg236259 |
2015-02-20 09:29:04 | ethan.furman | set | files:
+ issue23486.stoneleaf.01.patch keywords:
+ patch messages:
+ msg236258
|
2015-02-19 23:41:15 | ethan.furman | set | messages:
+ msg236243 title: Enum comparisons are 20x slower than comparing equivalent ints -> Enum member lookup is 20x slower than normal class attribute lookup |
2015-02-19 22:35:38 | craigh | set | messages:
+ msg236239 |
2015-02-19 22:19:34 | craigh | set | messages:
+ msg236237 |
2015-02-19 22:12:02 | ethan.furman | set | messages:
+ msg236236 |
2015-02-19 22:03:28 | ethan.furman | set | nosy:
+ ethan.furman
|
2015-02-19 21:59:13 | craigh | create | |