Message301240
> I would prefer to use the _Py_IDENTIFIER API rather than using _PyDict_GetItem_KnownHash().
Do you mean for the table of slot descriptions? I'm not sure that the effect would be comparable.
> Maybe there are other opportunities for optimization?
I would guess so. According to callgrind, for the timeit run, the original implementation has the following call fanout for the type creation:
- 1755 calls to type_call()
- 3850 calls to type_new()
- 224000 calls to update_one_slot()
- 317000 calls to _PyType_Lookup()
- 566000 calls to PyDict_GetItem()
That's 147 calls to PyDict_GetItem() per type creation, just inside of _PyType_Lookup().
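The exact timeit command is not shown in this message, but a hypothetical microbenchmark along these lines would exercise the same code path (type_call() → type_new() → update_one_slot() → _PyType_Lookup()); the class body used here is an assumption for illustration:

```python
import timeit

# Hypothetical reproduction of the kind of benchmark discussed above:
# creating a small class repeatedly, which in CPython goes through
# type_new(), update_one_slot() and _PyType_Lookup() on every iteration.
stmt = "type('C', (object,), {'f': lambda self: None})"
elapsed = timeit.timeit(stmt, number=10_000)
print(f"{elapsed:.3f}s for 10,000 type creations")
```

Dividing the call counts from the profile by the number of type creations gives per-creation figures such as the 147 PyDict_GetItem() calls mentioned above.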
About 20% of the time in PyDict_GetItem() is spent in PyErr_Clear(), and 23% and 26% respectively in lookdict_unicode_nodummy() (349000 calls) and lookdict_unicode() (278000 calls).
There is probably some room for algorithmic improvements here, in order to reduce the overall number of calls. For example, I wonder if bypassing the method cache while building the type might help. The cache maintenance seems to account for something like 30% of the time spent in _PyType_Lookup(). Or reversing the inspection order, i.e. iterating over the type attributes and looking up the slots, instead of iterating over the slots and looking up the attributes. Or using a faster set intersection algorithm for that purpose. Something like that.
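The "reversed inspection order" idea can be sketched in Python terms. This is not CPython's actual implementation (update_one_slot() works in C over slotdefs); the slot-name set below is abridged and the helper name is invented for illustration:

```python
# Abridged stand-in for the C-level slotdefs table; the real table in
# CPython's typeobject.c is much larger.
SLOT_NAMES = {'__init__', '__repr__', '__hash__', '__call__', '__len__'}

def slots_to_update(attrs):
    """Hypothetical reversed strategy: instead of looking up every slot
    name in the type's dict (one dict lookup per slot), intersect the
    usually-small set of defined attributes with the slot-name set."""
    return attrs.keys() & SLOT_NAMES
```

Since most classes define far fewer attributes than there are slots, a single set intersection could replace the long run of per-slot dict lookups that the profile above attributes to PyDict_GetItem().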
OTOH, quick gains might be worth it, but since applications that find their bottleneck in type creation are probably rare, I doubt that it's worth putting weeks into this. Even the application startup time is unlikely to be dominated by *type* creations, rather than *object* instantiations.
Date | User | Action | Args
2017-09-04 18:47:26 | scoder | set | recipients: + scoder, rhettinger, pitrou, vstinner, methane, serhiy.storchaka
2017-09-04 18:47:26 | scoder | set | messageid: <1504550846.34.0.0577626112705.issue31336@psf.upfronthosting.co.za>
2017-09-04 18:47:26 | scoder | link | issue31336 messages
2017-09-04 18:47:26 | scoder | create |