Message301240
> I would prefer to use the _Py_IDENTIFIER API rather than using _PyDict_GetItem_KnownHash().
Do you mean for the table of slot descriptions? I'm not sure that the effect would be comparable.
> Maybe there are other opportunities for optimization?
I would guess so. According to callgrind, for the timeit run, the original implementation has the following call fanout for the type creation:
- 1755 calls to type_call()
- 3850 calls to type_new()
- 224000 calls to update_one_slot()
- 317000 calls to _PyType_Lookup()
- 566000 calls to PyDict_GetItem()
That's 147 calls to PyDict_GetItem() per type creation, just inside of _PyType_Lookup().
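The exact timeit command is not shown in this message, but a hypothetical microbenchmark along these lines would exercise the same code path (type_call() → type_new() → update_one_slot() → _PyType_Lookup()); the class body used here is an assumption for illustration:

```python
import timeit

# Hypothetical reproduction of the kind of benchmark discussed above:
# creating a small class repeatedly, which in CPython goes through
# type_new(), update_one_slot() and _PyType_Lookup() on every iteration.
stmt = "type('C', (object,), {'f': lambda self: None})"
elapsed = timeit.timeit(stmt, number=10_000)
print(f"{elapsed:.3f}s for 10,000 type creations")
```

Dividing the call counts from the profile by the number of type creations gives per-creation figures such as the 147 PyDict_GetItem() calls mentioned above.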
About 20% of the time in PyDict_GetItem() is spent in PyErr_Clear(), and 23% and 26% respectively in lookdict_unicode_nodummy() (349000 calls) and lookdict_unicode() (278000 calls).
There is probably some room for algorithmic improvements here, in order to reduce the overall number of calls. For example, I wonder if bypassing the method cache while building the type might help. The cache maintenance seems to account for something like 30% of the time spent in _PyType_Lookup(). Or reversing the inspection order, i.e. iterating over the type attributes and looking up the slots, instead of iterating over the slots and looking up the attributes. Or using a faster set intersection algorithm for that purpose. Something like that.
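The "reversed inspection order" idea can be sketched in Python terms. This is not CPython's actual implementation (update_one_slot() works in C over slotdefs); the slot-name set below is abridged and the helper name is invented for illustration:

```python
# Abridged stand-in for the C-level slotdefs table; the real table in
# CPython's typeobject.c is much larger.
SLOT_NAMES = {'__init__', '__repr__', '__hash__', '__call__', '__len__'}

def slots_to_update(attrs):
    """Hypothetical reversed strategy: instead of looking up every slot
    name in the type's dict (one dict lookup per slot), intersect the
    usually-small set of defined attributes with the slot-name set."""
    return attrs.keys() & SLOT_NAMES
```

Since most classes define far fewer attributes than there are slots, a single set intersection could replace the long run of per-slot dict lookups that the profile above attributes to PyDict_GetItem().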
OTOH, quick gains might be worth it, but since applications that find their bottleneck in type creation are probably rare, I doubt that it's worth putting weeks into this. Even the application startup time is unlikely to be dominated by *type* creations, rather than *object* instantiations.
Date | User | Action | Args
2017-09-04 18:47:26 | scoder | set | recipients: + scoder, rhettinger, pitrou, vstinner, methane, serhiy.storchaka
2017-09-04 18:47:26 | scoder | set | messageid: <1504550846.34.0.0577626112705.issue31336@psf.upfronthosting.co.za>
2017-09-04 18:47:26 | scoder | link | issue31336 messages
2017-09-04 18:47:26 | scoder | create |