classification
Title: Local import is too slow
Type: performance Stage: resolved
Components: Interpreter Core Versions: Python 3.6
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: serhiy.storchaka Nosy List: brett.cannon, eric.smith, eric.snow, ncoghlan, pitrou, python-dev, serhiy.storchaka, vstinner
Priority: normal Keywords: patch

Created on 2014-10-05 12:29 by serhiy.storchaka, last changed 2016-08-06 20:35 by serhiy.storchaka. This issue is now closed.

Files
File name Uploaded Description Edit
faster_import.patch serhiy.storchaka, 2014-10-05 14:26 review
faster_import_2.patch serhiy.storchaka, 2014-10-05 14:57 review
faster_import_3.patch serhiy.storchaka, 2014-10-14 12:14 review
faster_import_4.patch serhiy.storchaka, 2016-06-19 12:01
faster_import_4.patch vstinner, 2016-06-20 21:18 review
faster_import_5.patch serhiy.storchaka, 2016-06-27 20:13 review
faster_import_pkg.patch serhiy.storchaka, 2016-06-27 20:18 review
faster_import_6.patch serhiy.storchaka, 2016-07-24 16:57 review
Messages (18)
msg228561 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-10-05 12:29
Local import is too slow compared with caching the module in a global, or even with looking it up in sys.modules.

>>> import timeit
>>> def f():
...     import locale
... 
>>> min(timeit.repeat(f, number=100000, repeat=10))
0.4501200000013341
>>> _locale = None
>>> def g():
...     global _locale
...     if _locale is None:
...         import _locale
... 
>>> min(timeit.repeat(g, number=100000, repeat=10))
0.07821200000034878
>>> import sys
>>> def h():
...     try:
...         locale = sys.modules['locale']
...     except KeyError:
...         import locale
... 
>>> min(timeit.repeat(h, number=100000, repeat=10))
0.12357599999813829

I think the overhead comes from looking up __import__, packing the arguments into a tuple, and calling the function. This could be avoided by looking in sys.modules first and calling __import__ only when nothing is found.
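
As a rough illustration of that lookup order (a sketch only; cached_import is a hypothetical helper and is not taken from any of the attached patches), the idea could be modelled in Python like this:

```python
import importlib
import sys


def cached_import(name):
    """Return the module called `name`, consulting sys.modules first.

    Hypothetical helper: it only models the idea of skipping the
    __import__ call when the module is already loaded.
    """
    try:
        # Fast path: a plain dict lookup, no call into the import system.
        return sys.modules[name]
    except KeyError:
        # Slow path: fall back to the regular import machinery.
        return importlib.import_module(name)
```

Calling cached_import("locale") a second time should only cost the dictionary lookup. Note that this sketch deliberately ignores custom __import__ overrides.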
msg228562 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2014-10-05 12:39
This would sound reasonable to me, but I wonder if it may change behaviour with weird custom __import__ overrides.
msg228565 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2014-10-05 13:37
__import__ is intended as an absolute override (including of the sys.modules cache lookup), so we can't bypass it without breaking backwards compatibility.

It's possible there is room for other optimisations that don't break the import override semantics (such as a fast path for when __import__ is the standard import function).
msg228570 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-10-05 14:26
I'm not experienced with the import machinery. Here is a preliminary patch that implements my idea for a particular case.

The performance effect is almost as good as manual caching in a global.

>>> import timeit
>>> def f():
...      import locale
... 
>>> min(timeit.repeat(f, number=100000, repeat=10))
0.09563599999819417

Of course it breaks tests.

> It's possible there is room for other optimisations that don't break the
> import override semantics (such as a fast path for when __import__ is the
> standard import function).

Good idea.
msg228573 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-10-05 14:57
The second version of the patch uses the fast path only when the builtin __import__ is not overridden. It is slightly slower (due to the lookup of __import__).

>>> import timeit
>>> def f():
...     import locale
... 
>>> min(timeit.repeat(f, number=100000, repeat=10))
0.10502300000371179

The code is simpler, but still somewhat cumbersome. It would be good to also optimize "from locale import getlocale".
msg228574 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2014-10-05 14:58
I would suggest factoring out IMPORT_NAME into a separate import_name() function, like the one that already exists for import_from().
msg228575 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2014-10-05 15:04
Some more general comments about this:

- Let's keep the absolute numbers in mind. 0.4501200000013341 s for 100000 iterations is about 4.5 usec per iteration. This is not very slow by CPython's standards.

- I wonder if you ran your benchmark in debug mode or if your CPU is slow :-) I get around 0.5 usec per iteration here.
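
For reference, the per-iteration figure follows from dividing the total reported by timeit.repeat by the number of iterations. A small sketch (per_iteration_usec is a hypothetical helper name):

```python
def per_iteration_usec(total_seconds, iterations):
    """Convert a timeit total (in seconds) to microseconds per iteration."""
    return total_seconds / iterations * 1e6


# Figure from msg228561: 0.45012 s for 100000 iterations.
print(round(per_iteration_usec(0.4501200000013341, 100000), 2))  # 4.5
```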
msg229013 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2014-10-10 16:31
The issue is local imports, not imports of the locale module :-)
msg229286 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-10-14 12:14
Yes, my CPU is slow.

Here is a patch that factors out IMPORT_NAME into a separate import_name() function and adds the optimization for the more general case in which __import__ is not overridden.
msg268850 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-06-19 12:01
Fixed the bugs that made test_importlib fail.

Microbenchmark results on a faster machine:

$ ./python -m timeit 'import locale'
Unpatched:  1000000 loops, best of 3: 0.839 usec per loop
Patched:    10000000 loops, best of 3: 0.176 usec per loop

$ ./python -m timeit 'import os.path'
Unpatched:  100000 loops, best of 3: 2.02 usec per loop
Patched:    1000000 loops, best of 3: 1.77 usec per loop

$ ./python -m timeit 'from locale import getlocale'
Unpatched:  100000 loops, best of 3: 3.69 usec per loop
Patched:    100000 loops, best of 3: 3.39 usec per loop

It also looks to me as though there is a bug in the existing code (opened as a separate issue, issue27352).

0.839 usec is not very slow by CPython's standards, but it is equal to about 50 assignments to a local variable, 15 attribute lookups, or 5 simple function calls. If a module is only optionally needed in a fast function, the overhead of a local import can be significant. We can lazily initialize a global variable (the second example in msg228561), but that code looks more cumbersome.
msg268934 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2016-06-20 21:11
Do you happen to know why you didn't get a review link for your patch, Serhiy?
msg268936 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-06-20 21:17
> Do you happen to know why you didn't get a review link for your patch, Serhiy?

faster_import_4.patch is based on the revision d736c9490333 which is not part of the CPython repository.
msg268937 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-06-20 21:18
I rebased faster_import_4.patch on default.
msg268938 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-06-20 21:28
Thanks Victor.
msg269400 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-06-27 20:18
faster_import_pkg.patch also optimizes imports of dotted names.

$ ./python -m timeit 'import os.path'
Unpatched:                100000 loops, best of 3: 2.08 usec per loop
faster_import_5.patch:    1000000 loops, best of 3: 1.79 usec per loop
faster_import_pkg.patch:  1000000 loops, best of 3: 0.474 usec per loop
msg271173 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-07-24 16:57
It seems this is not all that easy. Looking in sys.modules is not enough; we should also check __spec__._initializing.
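
A hedged sketch of that extra check in Python (is_fully_imported is a hypothetical name; _initializing is a private importlib attribute, so a missing attribute is treated here as "not initializing"):

```python
import sys


def is_fully_imported(name):
    """Hypothetical check: is `name` cached and done initializing?"""
    module = sys.modules.get(name)
    if module is None:
        return False
    spec = getattr(module, "__spec__", None)
    # _initializing is set by importlib while the module body is still
    # executing; a module without a spec is treated as fully imported.
    return not getattr(spec, "_initializing", False)
```

Only when such a check succeeds would it be safe to return the cached module without going through the full import path.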

The following patch moves the optimizations into PyImport_ImportModuleLevelObject (the C implementation of the standard __import__). Main optimizations:

1. PyImport_ImportModuleLevelObject is called directly if builtins.__import__ is the standard __import__.

2. The import lock is not acquired for the lookup in sys.modules and for some other operations. Some of these operations are atomic in C (guarded by the GIL); the others are used only for optimization, so a race condition can cause only an insignificant slowdown.

3. Creating an empty dict for globals and looking up __package__ and __spec__ are avoided when they are not needed.

4. The standard __import__ is saved in the interpreter state.

Microbenchmarking results:

$ ./python -m timeit 'import os'
Unpatched:  1000000 loops, best of 3: 0.845 usec per loop
Patched:    1000000 loops, best of 3: 0.338 usec per loop

$ ./python -m timeit 'import os.path'
Unpatched:  100000 loops, best of 3: 2.07 usec per loop
Patched:    1000000 loops, best of 3: 0.884 usec per loop

$ ./python -m timeit 'from os import path'
Unpatched:  100000 loops, best of 3: 3.7 usec per loop
Patched:    100000 loops, best of 3: 2.77 usec per loop
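
The same kind of comparison can be reproduced from Python rather than the command line. The following is only a sketch (best_usec, local_import and use_global are hypothetical names); absolute numbers will differ by machine and interpreter version.

```python
import os as _os
import timeit


def local_import():
    import os  # goes through the import statement on every call


def use_global():
    _os  # plain global lookup of the already imported module


def best_usec(func, number=100_000, repeat=5):
    """Best per-call time in microseconds over `repeat` runs."""
    return min(timeit.repeat(func, number=number, repeat=repeat)) / number * 1e6


if __name__ == "__main__":
    print("local import: %.3f usec" % best_usec(local_import))
    print("global lookup: %.3f usec" % best_usec(use_global))
```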
msg271846 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-08-02 19:52
Thank you for your review, Brett.
msg271847 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2016-08-02 19:53
New changeset 64f195790a3a by Serhiy Storchaka in branch 'default':
Issue #22557: Now importing already imported modules is up to 2.5 times faster.
https://hg.python.org/cpython/rev/64f195790a3a
History
Date User Action Args
2016-08-06 20:35:40 serhiy.storchaka set status: open -> closed; resolution: fixed; stage: patch review -> resolved
2016-08-02 19:53:00 python-dev set nosy: + python-dev; messages: + msg271847
2016-08-02 19:52:29 serhiy.storchaka set assignee: serhiy.storchaka; messages: + msg271846
2016-07-24 16:57:09 serhiy.storchaka set files: + faster_import_6.patch; messages: + msg271173
2016-06-27 20:18:57 serhiy.storchaka set files: + faster_import_pkg.patch; messages: + msg269400
2016-06-27 20:13:27 serhiy.storchaka set files: + faster_import_5.patch
2016-06-20 21:28:28 serhiy.storchaka set messages: + msg268938
2016-06-20 21:18:44 vstinner set files: + faster_import_4.patch; messages: + msg268937
2016-06-20 21:17:36 vstinner set nosy: + vstinner; messages: + msg268936
2016-06-20 21:11:31 brett.cannon set messages: + msg268934
2016-06-19 12:01:09 serhiy.storchaka set files: + faster_import_4.patch; stage: patch review; messages: + msg268850; versions: + Python 3.6, - Python 3.5
2014-10-14 12:14:35 serhiy.storchaka set files: + faster_import_3.patch; messages: + msg229286
2014-10-10 16:31:33 pitrou set messages: + msg229013; title: Locale import is too slow -> Local import is too slow
2014-10-10 16:30:34 terry.reedy set title: Local import is too slow -> Locale import is too slow
2014-10-05 15:04:42 pitrou set messages: + msg228575
2014-10-05 14:58:06 pitrou set messages: + msg228574
2014-10-05 14:57:17 serhiy.storchaka set files: + faster_import_2.patch; messages: + msg228573
2014-10-05 14:26:34 serhiy.storchaka set files: + faster_import.patch; keywords: + patch; messages: + msg228570
2014-10-05 13:54:18 eric.smith set nosy: + eric.smith
2014-10-05 13:37:36 ncoghlan set messages: + msg228565
2014-10-05 12:39:08 pitrou set messages: + msg228562
2014-10-05 12:29:53 serhiy.storchaka set title: Locale import is too slow -> Local import is too slow
2014-10-05 12:29:01 serhiy.storchaka create