Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Local import is too slow #66747

Closed
serhiy-storchaka opened this issue Oct 5, 2014 · 18 comments
Closed

Local import is too slow #66747

serhiy-storchaka opened this issue Oct 5, 2014 · 18 comments
Assignees
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage

Comments

@serhiy-storchaka
Copy link
Member

BPO 22557
Nosy @brettcannon, @ncoghlan, @pitrou, @vstinner, @ericvsmith, @ericsnowcurrently, @serhiy-storchaka
Files
  • faster_import.patch
  • faster_import_2.patch
  • faster_import_3.patch
  • faster_import_4.patch
  • faster_import_4.patch
  • faster_import_5.patch
  • faster_import_pkg.patch
  • faster_import_6.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/serhiy-storchaka'
    closed_at = <Date 2016-08-06.20:35:40.208>
    created_at = <Date 2014-10-05.12:29:01.710>
    labels = ['interpreter-core', 'performance']
    title = 'Local import is too slow'
    updated_at = <Date 2016-08-06.20:35:40.195>
    user = 'https://github.com/serhiy-storchaka'

    bugs.python.org fields:

    activity = <Date 2016-08-06.20:35:40.195>
    actor = 'serhiy.storchaka'
    assignee = 'serhiy.storchaka'
    closed = True
    closed_date = <Date 2016-08-06.20:35:40.208>
    closer = 'serhiy.storchaka'
    components = ['Interpreter Core']
    creation = <Date 2014-10-05.12:29:01.710>
    creator = 'serhiy.storchaka'
    dependencies = []
    files = ['36814', '36816', '36912', '43471', '43491', '43566', '43567', '43869']
    hgrepos = []
    issue_num = 22557
    keywords = ['patch']
    message_count = 18.0
    messages = ['228561', '228562', '228565', '228570', '228573', '228574', '228575', '229013', '229286', '268850', '268934', '268936', '268937', '268938', '269400', '271173', '271846', '271847']
    nosy_count = 8.0
    nosy_names = ['brett.cannon', 'ncoghlan', 'pitrou', 'vstinner', 'eric.smith', 'python-dev', 'eric.snow', 'serhiy.storchaka']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'performance'
    url = 'https://bugs.python.org/issue22557'
    versions = ['Python 3.6']

    @serhiy-storchaka
    Copy link
    Member Author

    Locale import is too slow in comparison with caching module in a global or even in sys.modules.

    >>> import timeit
    >>> def f():
    ...     import locale
    ... 
    >>> min(timeit.repeat(f, number=100000, repeat=10))
    0.4501200000013341
    >>> _locale = None
    >>> def g():
    ...     global _locale
    ...     if _locale is None:
    ...         import _locale
    ... 
    >>> min(timeit.repeat(g, number=100000, repeat=10))
    0.07821200000034878
    >>> import sys
    >>> def h():
    ...     try:
    ...         locale = sys.modules['locale']
    ...     except KeyError:
    ...         import locale
    ... 
    >>> min(timeit.repeat(h, number=100000, repeat=10))
    0.12357599999813829

    I think there is an overhead of look up __import__, packing arguments in a tuple and calling a function. This can be omitted by looking first in sys.module and calling __import__ only when nothing was found.

    @serhiy-storchaka serhiy-storchaka added interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage labels Oct 5, 2014
    @serhiy-storchaka serhiy-storchaka changed the title Locale import is too slow Local import is too slow Oct 5, 2014
    @pitrou
    Copy link
    Member

    pitrou commented Oct 5, 2014

    This would sound reasonable to me, but I wonder if it may change behaviour with weird custom __import__ overrides.

    @ncoghlan
    Copy link
    Contributor

    ncoghlan commented Oct 5, 2014

    __import__ is intended as an absolute override (including of the sys.modules cache lookup), so we can't bypass it without breaking backwards compatibility.

    It's possible there is room for other optimisations that don't break the import override semantics (such as a fast path for when __import__ is the standard import function).

    @serhiy-storchaka
    Copy link
    Member Author

    I'm not experienced in import machinery. Here is preliminary patch which
    implements my idea for particular case.

    Performance effect is almost so good as manual caching in a global.

    >>> import timeit
    >>> def f():
    ...      import locale
    ... 
    >>> min(timeit.repeat(f, number=100000, repeat=10))
    0.09563599999819417

    Of course it breaks tests.

    It's possible there is room for other optimisations that don't break the
    import override semantics (such as a fast path for when __import__ is the
    standard import function).

    Good idea.

    @serhiy-storchaka
    Copy link
    Member Author

    Second version of the patch uses fast patch only when builtin __import__ is
    not overridden. It is slightly slower (due to lookup of __import__).

    >>> import timeit
    >>> def f():
    ...     import locale
    ... 
    >>> min(timeit.repeat(f, number=100000, repeat=10))
    0.10502300000371179

    The code is simpler, but still some cumbersome. It would be good to optimize
    also "from locale import getlocale".

    @pitrou
    Copy link
    Member

    pitrou commented Oct 5, 2014

    I would suggest factoring out IMPORT_NAME into a separate import_name() function, like is already one for import_from().

    @pitrou
    Copy link
    Member

    pitrou commented Oct 5, 2014

    Some more general comments about this:

    • Let's keep in mind the absolute numbers. 0.4501200000013341 for 100000 iterations is 4.5ms per iteration. This is not very slow by CPython's standards.

    • I wonder if you ran your benchmark in debug mode or if your CPU is slow :-) I get around 0.5ms per iteration here.

    @terryjreedy terryjreedy changed the title Local import is too slow Locale import is too slow Oct 10, 2014
    @pitrou
    Copy link
    Member

    pitrou commented Oct 10, 2014

    The issue is local imports, not imports of the locale module :-)

    @pitrou pitrou changed the title Locale import is too slow Local import is too slow Oct 10, 2014
    @serhiy-storchaka
    Copy link
    Member Author

    Yes, my CPU is slow.

    Here is a patch which factors out IMPORT_NAME into a separate import_name() function and adds optimization for more general case when __import__ is not overloaded.

    @serhiy-storchaka
    Copy link
    Member Author

    Fixed bugs making test_importlib failing.

    Microbenchmark results on faster machine:

    $ ./python -m timeit 'import locale'
    Unpatched:  1000000 loops, best of 3: 0.839 usec per loop
    Patched:    10000000 loops, best of 3: 0.176 usec per loop
    
    $ ./python -m timeit 'import os.path'
    Unpatched:  100000 loops, best of 3: 2.02 usec per loop
    Patched:    1000000 loops, best of 3: 1.77 usec per loop
    
    $ ./python -m timeit 'from locale import getlocale'
    Unpatched:  100000 loops, best of 3: 3.69 usec per loop
    Patched:    100000 loops, best of 3: 3.39 usec per loop

    And it looks to me that there is a bug in existing code (opened separate bpo-27352).

    0.839 usec is not very slow by CPython's standards, but is equal to about 50 assignments to local variable, 15 attribute revolvings or 5 simple function calls. If some module is optionally needed in fast function, the overhead of local import can be significant. We can lazily initialize global variable (the second example in msg228561), but this code looks more cumbersome.

    @brettcannon
    Copy link
    Member

    Do you happen to know why you didn't get a review link for your patch, Serhiy?

    @vstinner
    Copy link
    Member

    Do you happen to know why you didn't get a review link for your patch, Serhiy?

    faster_import_4.patch is based on the revision d736c9490333 which is not part of the CPython repository.

    @vstinner
    Copy link
    Member

    I rebased faster_import_4.patch on default.

    @serhiy-storchaka
    Copy link
    Member Author

    Thanks Victor.

    @serhiy-storchaka
    Copy link
    Member Author

    faster_import_pkg.patch optimizes also an import of names with dots.

    $ ./python -m timeit 'import os.path'
    Unpatched:                100000 loops, best of 3: 2.08 usec per loop
    faster_import_5.patch:    1000000 loops, best of 3: 1.79 usec per loop
    faster_import_pkg.patch:  1000000 loops, best of 3: 0.474 usec per loop

    @serhiy-storchaka
    Copy link
    Member Author

    Seems not all such easy. Looking in sys.module is not enough, we should check __spec__._initializing.

    Following patch moves optimizations to PyImport_ImportModuleLevelObject (C implementation of standard __import__). Main optimizations:

    1. PyImport_ImportModuleLevelObject is called directly if builtins.__import__ is standard __import__.

    2. Import lock is not acquired for looking up in sys.modules and other operations. Some of these operations are atomic in C (guarded by GIL), others are used only for optimization and race condition can cause only insignificant slow down.

    3. Avoided creating empty dict for globals, looking up __package__ and __spec__ if they are not needed.

    4. Saving standard __import__ in interpreter state.

    Microbenchmarking results:

    $ ./python -m timeit 'import os'
    Unpatched:  1000000 loops, best of 3: 0.845 usec per loop
    Patched:    1000000 loops, best of 3: 0.338 usec per loop
    
    $ ./python -m timeit 'import os.path'
    Unpatched:  100000 loops, best of 3: 2.07 usec per loop
    Patched:    1000000 loops, best of 3: 0.884 usec per loop
    
    $ ./python -m timeit 'from os import path'
    Unpatched:  100000 loops, best of 3: 3.7 usec per loop
    Patched:    100000 loops, best of 3: 2.77 usec per loop

    @serhiy-storchaka
    Copy link
    Member Author

    Thank you for your review Brett.

    @serhiy-storchaka serhiy-storchaka self-assigned this Aug 2, 2016
    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Aug 2, 2016

    New changeset 64f195790a3a by Serhiy Storchaka in branch 'default':
    Issue bpo-22557: Now importing already imported modules is up to 2.5 times faster.
    https://hg.python.org/cpython/rev/64f195790a3a

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage
    Projects
    None yet
    Development

    No branches or pull requests

    5 participants