msg228561 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2014-10-05 12:29 |
A local import is too slow in comparison with caching the module in a global, or even with looking it up in sys.modules.
>>> import timeit
>>> def f():
...     import locale
...
>>> min(timeit.repeat(f, number=100000, repeat=10))
0.4501200000013341
>>> _locale = None
>>> def g():
...     global _locale
...     if _locale is None:
...         import _locale
...
>>> min(timeit.repeat(g, number=100000, repeat=10))
0.07821200000034878
>>> import sys
>>> def h():
...     try:
...         locale = sys.modules['locale']
...     except KeyError:
...         import locale
...
>>> min(timeit.repeat(h, number=100000, repeat=10))
0.12357599999813829
I think the overhead comes from looking up __import__, packing the arguments into a tuple, and calling a function. It can be avoided by looking in sys.modules first and calling __import__ only when nothing is found.
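The lookup order proposed above can be sketched in pure Python. This is only an illustration of the idea, not the patch itself: `cached_import` is a hypothetical helper name, it handles only the plain `import name` form, and it deliberately ignores any custom __import__ override.

```python
import importlib
import sys


def cached_import(name):
    """Return the module `name`, consulting sys.modules before importing.

    Hypothetical helper that mimics the proposed lookup order in Python.
    """
    module = sys.modules.get(name)
    if module is None:
        # Cache miss: fall back to the regular import machinery.
        module = importlib.import_module(name)
    return module
```

A call such as `cached_import("locale")` returns the cached module object on every call after the first.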
|
msg228562 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2014-10-05 12:39 |
This sounds reasonable to me, but I wonder whether it would change behaviour with weird custom __import__ overrides.
|
msg228565 - (view) |
Author: Alyssa Coghlan (ncoghlan) * |
Date: 2014-10-05 13:37 |
__import__ is intended as an absolute override (including of the sys.modules cache lookup), so we can't bypass it without breaking backwards compatibility.
It's possible there is room for other optimisations that don't break the import override semantics (such as a fast path for when __import__ is the standard import function).
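A minimal Python sketch of such a fast path, assuming the import function installed when the sketch is loaded is the standard one. The names `_STANDARD_IMPORT` and `import_top_level` are made up for illustration, and only top-level module names are handled.

```python
import builtins
import sys

# The import function installed when this sketch is loaded;
# assumed here to be the standard one.
_STANDARD_IMPORT = builtins.__import__


def import_top_level(name):
    """Import a top-level module, honouring any __import__ override."""
    if builtins.__import__ is _STANDARD_IMPORT:
        # Fast path: __import__ has not been replaced, so a cached
        # module can be returned without calling it.
        module = sys.modules.get(name)
        if module is not None:
            return module
    # Slow path: defer to whatever __import__ is currently installed.
    return builtins.__import__(name)
```

With this shape, an override of builtins.__import__ still sees every import, so the override semantics are preserved.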
|
msg228570 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2014-10-05 14:26 |
I'm not experienced with the import machinery. Here is a preliminary patch which
implements my idea for a particular case.
The performance effect is almost as good as manual caching in a global.
>>> import timeit
>>> def f():
...     import locale
...
>>> min(timeit.repeat(f, number=100000, repeat=10))
0.09563599999819417
Of course it breaks tests.
> It's possible there is room for other optimisations that don't break the
> import override semantics (such as a fast path for when __import__ is the
> standard import function).
Good idea.
|
msg228573 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2014-10-05 14:57 |
The second version of the patch uses the fast path only when the builtin __import__ is
not overridden. It is slightly slower (due to the lookup of __import__).
>>> import timeit
>>> def f():
...     import locale
...
>>> min(timeit.repeat(f, number=100000, repeat=10))
0.10502300000371179
The code is simpler, but still somewhat cumbersome. It would be good to also
optimize "from locale import getlocale".
|
msg228574 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2014-10-05 14:58 |
I would suggest factoring out IMPORT_NAME into a separate import_name() function, as already exists for import_from().
|
msg228575 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2014-10-05 15:04 |
Some more general comments about this:
- Let's keep in mind the absolute numbers. 0.4501200000013341 for 100000 iterations is 4.5µs per iteration. This is not very slow by CPython's standards.
- I wonder if you ran your benchmark in debug mode or if your CPU is slow :-) I get around 0.5µs per iteration here.
|
msg229013 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2014-10-10 16:31 |
The issue is local imports, not imports of the locale module :-)
|
msg229286 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2014-10-14 12:14 |
Yes, my CPU is slow.
Here is a patch which factors out IMPORT_NAME into a separate import_name() function and adds an optimization for the more general case when __import__ is not overridden.
|
msg268850 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2016-06-19 12:01 |
Fixed the bugs that made test_importlib fail.
Microbenchmark results on a faster machine:
$ ./python -m timeit 'import locale'
Unpatched: 1000000 loops, best of 3: 0.839 usec per loop
Patched: 10000000 loops, best of 3: 0.176 usec per loop
$ ./python -m timeit 'import os.path'
Unpatched: 100000 loops, best of 3: 2.02 usec per loop
Patched: 1000000 loops, best of 3: 1.77 usec per loop
$ ./python -m timeit 'from locale import getlocale'
Unpatched: 100000 loops, best of 3: 3.69 usec per loop
Patched: 100000 loops, best of 3: 3.39 usec per loop
It also looks to me like there is a bug in the existing code (opened as a separate issue, issue27352).
0.839 usec is not very slow by CPython's standards, but it is equal to about 50 assignments to a local variable, 15 attribute lookups, or 5 simple function calls. If a module is only optionally needed in a fast function, the overhead of a local import can be significant. We can lazily initialize a global variable (the second example in msg228561), but that code looks more cumbersome.
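The two alternatives discussed here, a plain local import and a lazily initialized global, can be compared with a small script. This is a sketch for reproducing the measurement on your own machine; the function names and the `NUMBER` constant are illustrative, and each figure includes the cost of the function call itself.

```python
import timeit

NUMBER = 100_000

_locale_module = None  # lazily initialized module cache


def local_import():
    import locale  # found in sys.modules after the first call
    return locale


def lazy_global():
    global _locale_module
    if _locale_module is None:
        import locale
        _locale_module = locale
    return _locale_module


def per_call_usec(func):
    """Best-of-5 time per call, in microseconds."""
    best = min(timeit.repeat(func, number=NUMBER, repeat=5))
    return best / NUMBER * 1e6


if __name__ == "__main__":
    for func in (local_import, lazy_global):
        print(f"{func.__name__}: {per_call_usec(func):.3f} usec per call")
```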
|
msg268934 - (view) |
Author: Brett Cannon (brett.cannon) * |
Date: 2016-06-20 21:11 |
Do you happen to know why you didn't get a review link for your patch, Serhiy?
|
msg268936 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2016-06-20 21:17 |
> Do you happen to know why you didn't get a review link for your patch, Serhiy?
faster_import_4.patch is based on the revision d736c9490333 which is not part of the CPython repository.
|
msg268937 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2016-06-20 21:18 |
I rebased faster_import_4.patch on default.
|
msg268938 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2016-06-20 21:28 |
Thanks Victor.
|
msg269400 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2016-06-27 20:18 |
faster_import_pkg.patch also optimizes imports of names with dots.
$ ./python -m timeit 'import os.path'
Unpatched: 100000 loops, best of 3: 2.08 usec per loop
faster_import_5.patch: 1000000 loops, best of 3: 1.79 usec per loop
faster_import_pkg.patch: 1000000 loops, best of 3: 0.474 usec per loop
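For context on why dotted names are a separate case: `import os.path` passes the full dotted name to __import__, which returns the top-level package, while the submodule itself is cached in sys.modules under its full dotted name. A short sketch of that standard behaviour (the variable names are illustrative only):

```python
import os.path
import sys

# What `import os.path` calls; with an empty fromlist the result is
# the top-level package, not the submodule.
top = __import__("os.path")

# The submodule is cached under its full dotted name.
submodule = sys.modules["os.path"]
```

So a cache hit for a dotted name involves at least two sys.modules entries: the full name and the top-level package that gets bound.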
|
msg271173 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2016-07-24 16:57 |
It seems this is not so easy. Looking in sys.modules is not enough; we should also check __spec__._initializing.
The following patch moves the optimizations to PyImport_ImportModuleLevelObject (the C implementation of the standard __import__). Main optimizations:
1. PyImport_ImportModuleLevelObject is called directly if builtins.__import__ is the standard __import__.
2. The import lock is not acquired for lookups in sys.modules and other operations. Some of these operations are atomic in C (guarded by the GIL); others are used only for optimization, and a race condition can cause only an insignificant slowdown.
3. Creating an empty dict for globals and looking up __package__ and __spec__ are avoided when they are not needed.
4. The standard __import__ is saved in the interpreter state.
Microbenchmarking results:
$ ./python -m timeit 'import os'
Unpatched: 1000000 loops, best of 3: 0.845 usec per loop
Patched: 1000000 loops, best of 3: 0.338 usec per loop
$ ./python -m timeit 'import os.path'
Unpatched: 100000 loops, best of 3: 2.07 usec per loop
Patched: 1000000 loops, best of 3: 0.884 usec per loop
$ ./python -m timeit 'from os import path'
Unpatched: 100000 loops, best of 3: 3.7 usec per loop
Patched: 100000 loops, best of 3: 2.77 usec per loop
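The extra check mentioned at the start of this message (a module may sit in sys.modules while it is still being imported) can be illustrated in Python. This is a sketch under assumptions, not the C code from the patch: `lookup_fully_imported` is a hypothetical name, and `_initializing` is a private attribute of the module spec, so it is read defensively.

```python
import sys


def lookup_fully_imported(name):
    """Return sys.modules[name] only if it is not still being imported.

    `_initializing` is a private spec attribute, so it is read with
    getattr and a default; treat this as a sketch, not an API.
    """
    module = sys.modules.get(name)
    if module is None:
        return None
    spec = getattr(module, "__spec__", None)
    if getattr(spec, "_initializing", False):
        # The module is still being imported: the caller should take
        # the slow, lock-protected path instead.
        return None
    return module
```

A `None` result means the caller should fall back to the full import path.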
|
msg271846 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2016-08-02 19:52 |
Thank you for your review Brett.
|
msg271847 - (view) |
Author: Roundup Robot (python-dev) |
Date: 2016-08-02 19:53 |
New changeset 64f195790a3a by Serhiy Storchaka in branch 'default':
Issue #22557: Now importing already imported modules is up to 2.5 times faster.
https://hg.python.org/cpython/rev/64f195790a3a
|
|
Date | User | Action | Args |
2022-04-11 14:58:08 | admin | set | github: 66747 |
2016-08-06 20:35:40 | serhiy.storchaka | set | status: open -> closed; resolution: fixed; stage: patch review -> resolved |
2016-08-02 19:53:00 | python-dev | set | nosy: + python-dev; messages: + msg271847 |
2016-08-02 19:52:29 | serhiy.storchaka | set | assignee: serhiy.storchaka; messages: + msg271846 |
2016-07-24 16:57:09 | serhiy.storchaka | set | files: + faster_import_6.patch; messages: + msg271173 |
2016-06-27 20:18:57 | serhiy.storchaka | set | files: + faster_import_pkg.patch; messages: + msg269400 |
2016-06-27 20:13:27 | serhiy.storchaka | set | files: + faster_import_5.patch |
2016-06-20 21:28:28 | serhiy.storchaka | set | messages: + msg268938 |
2016-06-20 21:18:44 | vstinner | set | files: + faster_import_4.patch; messages: + msg268937 |
2016-06-20 21:17:36 | vstinner | set | nosy: + vstinner; messages: + msg268936 |
2016-06-20 21:11:31 | brett.cannon | set | messages: + msg268934 |
2016-06-19 12:01:09 | serhiy.storchaka | set | files: + faster_import_4.patch; stage: patch review; messages: + msg268850; versions: + Python 3.6, - Python 3.5 |
2014-10-14 12:14:35 | serhiy.storchaka | set | files: + faster_import_3.patch; messages: + msg229286 |
2014-10-10 16:31:33 | pitrou | set | messages: + msg229013; title: Locale import is too slow -> Local import is too slow |
2014-10-10 16:30:34 | terry.reedy | set | title: Local import is too slow -> Locale import is too slow |
2014-10-05 15:04:42 | pitrou | set | messages: + msg228575 |
2014-10-05 14:58:06 | pitrou | set | messages: + msg228574 |
2014-10-05 14:57:17 | serhiy.storchaka | set | files: + faster_import_2.patch; messages: + msg228573 |
2014-10-05 14:26:34 | serhiy.storchaka | set | files: + faster_import.patch; keywords: + patch; messages: + msg228570 |
2014-10-05 13:54:18 | eric.smith | set | nosy: + eric.smith |
2014-10-05 13:37:36 | ncoghlan | set | messages: + msg228565 |
2014-10-05 12:39:08 | pitrou | set | messages: + msg228562 |
2014-10-05 12:29:53 | serhiy.storchaka | set | title: Locale import is too slow -> Local import is too slow |
2014-10-05 12:29:01 | serhiy.storchaka | create | |