Reduce number of modules imported by runpy #85178

Closed
vstinner opened this issue Jun 17, 2020 · 12 comments
Labels
3.10 (only security fixes), stdlib (Python modules in the Lib dir)

Comments

@vstinner
Member

BPO 41006
Nosy @vstinner, @shihai1991
PRs
  • bpo-41006: importlib.util no longer imports typing #20938
  • bpo-41006: pkgutil imports lazily re #20939
  • bpo-41006: collections imports lazily heap #20940
  • bpo-41006: Document runpy optimization #20953
  • bpo-41006: Remove init_sys_streams() hack #20954
  • bpo-41006: What's New: less => fewer modules #20955
Files
  • mod.py
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = None
    closed_at = <Date 2020-06-17.21:36:12.127>
    created_at = <Date 2020-06-17.13:58:21.130>
    labels = ['library', '3.10']
    title = 'Reduce number of modules imported by runpy'
    updated_at = <Date 2020-06-17.23:20:57.312>
    user = 'https://github.com/vstinner'

    bugs.python.org fields:

    activity = <Date 2020-06-17.23:20:57.312>
    actor = 'vstinner'
    assignee = 'none'
    closed = True
    closed_date = <Date 2020-06-17.21:36:12.127>
    closer = 'vstinner'
    components = ['Library (Lib)']
    creation = <Date 2020-06-17.13:58:21.130>
    creator = 'vstinner'
    dependencies = []
    files = ['49240']
    hgrepos = []
    issue_num = 41006
    keywords = ['patch']
    message_count = 12.0
    messages = ['371741', '371742', '371746', '371750', '371761', '371765', '371766', '371774', '371776', '371777', '371781', '371782']
    nosy_count = 2.0
    nosy_names = ['vstinner', 'shihai1991']
    pr_nums = ['20938', '20939', '20940', '20953', '20954', '20955']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue41006'
    versions = ['Python 3.10']

    @vstinner
    Member Author

    Currently, the runpy module imports many modules. runpy is used by "python3 -m module". I propose reducing the number of modules it imports in order to cut Python startup time.

    With my local changes, I reduce Python startup time from 24 ms to 18 ms:

    Mean +- std dev: [ref] 24.3 ms +- 0.2 ms -> [patch] 18.0 ms +- 0.3 ms: 1.35x faster (-26%)

    Timing measured by:

    ./python -m venv env
    python -m pyperf command -v -o patch.json -- env/bin/python -m empty
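For a rough cross-check without pyperf, the same kind of measurement can be sketched with the standard library alone. This is only an illustration: the helper name `time_startup` and the run count are assumptions, and it has none of pyperf's calibration or warmup, so its numbers will be noisier than the ones quoted in this issue.

```python
import statistics
import subprocess
import sys
import time


def time_startup(args, runs=20):
    """Return (mean, stdev) in seconds of running the interpreter with args.

    Hypothetical helper: each sample covers process creation, interpreter
    startup, running the command, and shutdown.
    """
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(
            [sys.executable, *args],
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        samples.append(time.perf_counter() - start)
    return statistics.mean(samples), statistics.stdev(samples)


if __name__ == "__main__":
    mean, stdev = time_startup(["-c", "pass"])
    print(f"Mean +- std dev: {mean * 1e3:.1f} ms +- {stdev * 1e3:.1f} ms")
```

Replacing `["-c", "pass"]` with `["-m", "some_module"]` would time the runpy path instead, assuming that module is importable.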

    Currently, runpy imports +21 modules:

    • ./python mod.py: Total 33
    • ./python -m mod: Total 54 (+21)

    Example with attached mod.py:

    $ ./python -m mod
    ['__main__',
     '_abc',
     '_codecs',
     '_collections',
     '_collections_abc',
     '_frozen_importlib',
     '_frozen_importlib_external',
     '_functools',
     '_heapq',
     '_imp',
     '_io',
     '_locale',
     '_operator',
     '_signal',
     '_sitebuiltins',
     '_sre',
     '_stat',
     '_thread',
     '_warnings',
     '_weakref',
     '_weakrefset',
     'abc',
     'builtins',
     'codecs',
     'collections',
     'collections.abc',
     'contextlib',
     'copyreg',
     'encodings',
     'encodings.aliases',
     'encodings.ascii',
     'encodings.latin_1',
     'encodings.utf_8',
     'enum',
     'functools',
     'genericpath',
     'heapq',
     'importlib',
     'importlib._bootstrap',
     'importlib._bootstrap_external',
     'importlib.abc',
     'importlib.machinery',
     'importlib.util',
     'io',
     'itertools',
     'keyword',
     'marshal',
     'operator',
     'os',
     'os.path',
     'pkgutil',
     'posix',
     'posixpath',
     're',
     'reprlib',
     'runpy',
     'site',
     'sre_compile',
     'sre_constants',
     'sre_parse',
     'stat',
     'sys',
     'time',
     'types',
     'typing',
     'typing.io',
     'typing.re',
     'warnings',
     'weakref',
     'zipimport']
    Total 70
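The attached mod.py is not reproduced in this thread. A minimal sketch that would produce output of the same shape, assuming the script simply lists `sys.modules`, might look like this. The function names are hypothetical; the snapshot is taken before `pprint` is imported so that the helper's own imports do not inflate the count.

```python
import sys


def snapshot_modules():
    """Return the sorted names of all modules imported so far."""
    return sorted(sys.modules)


def main():
    # Take the snapshot first: importing pprint pulls in extra modules.
    names = snapshot_modules()
    from pprint import pprint

    pprint(names)
    print("Total", len(names))


if __name__ == "__main__":
    main()
```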

    @vstinner added the 3.10 and stdlib labels Jun 17, 2020
    @vstinner
    Member Author

    My local changes removed the following imports from runpy:

    • '_heapq',
    • '_locale',
    • '_sre',
    • 'collections.abc',
    • 'copyreg',
    • 'enum',
    • 'heapq',
    • 'importlib.abc',
    • 'itertools',
    • 're',
    • 'sre_compile',
    • 'sre_constants',
    • 'sre_parse',
    • 'typing',
    • 'typing.io',
    • 'typing.re',

    @vstinner
    Member Author

    > Currently, runpy imports +21 modules:
    >
    > • ./python mod.py: Total 33
    > • ./python -m mod: Total 54 (+21)

    Oops sorry, that's with my local changes!

    Currently, runpy imports no fewer than 37 modules:

    • ./python mod.py: Total 33
    • ./python -m mod: Total 70 (+37)
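The script-versus-`-m` comparison above can be reproduced on any interpreter with a small driver. This is a sketch under assumptions: it writes its own throwaway `mod.py` that prints only the module count, and the helper names are made up for illustration. The counts it reports depend on the Python version and will not necessarily match the 33 and 70 quoted here.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-in for the attached mod.py: prints only the count.
MOD_SOURCE = "import sys\nprint(len(sys.modules))\n"


def count_modules(args, cwd):
    """Run the current interpreter with args in cwd and parse the count."""
    result = subprocess.run(
        [sys.executable, *args],
        cwd=cwd,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())


def compare_script_vs_dash_m():
    """Return (count for 'python mod.py', count for 'python -m mod')."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "mod.py").write_text(MOD_SOURCE)
        as_script = count_modules(["mod.py"], tmp)
        as_module = count_modules(["-m", "mod"], tmp)
    return as_script, as_module


if __name__ == "__main__":
    as_script, as_module = compare_script_vs_dash_m()
    print(f"python mod.py: Total {as_script}")
    print(f"python -m mod: Total {as_module} ({as_module - as_script:+d})")
```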

    @vstinner
    Member Author

    See also bpo-40275: "test.support has way too many imports".

    @vstinner
    Copy link
    Member Author

    I created 3 PRs. I have a few more local branches to avoid types and itertools imports. I may create PRs for these ones as well.

    @vstinner
    Member Author

    New changeset 7824cc0 by Victor Stinner in branch 'master':
    bpo-41006: collections imports lazily heap (GH-20940)
    7824cc0

    @vstinner
    Member Author

    New changeset 98ce7b1 by Victor Stinner in branch 'master':
    bpo-41006: pkgutil imports lazily re (GH-20939)
    98ce7b1

    @vstinner
    Member Author

    New changeset 9e09849 by Victor Stinner in branch 'master':
    bpo-41006: importlib.util no longer imports typing (GH-20938)
    9e09849
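Two of the merged changes are titled as making an import lazy. The general pattern, shown here as a sketch and not as the actual patch, is to move an import from module level into the function that needs it, so that merely importing the module no longer loads the dependency. The function `split_fields` is hypothetical and exists only to illustrate the idea.

```python
def split_fields(text):
    """Split text on commas, semicolons, and whitespace.

    Hypothetical helper: 're' is imported on first call instead of at
    module import time, so importers that never call this function do
    not pay for loading 're' and its submodules.
    """
    import re  # deferred import; later calls hit the sys.modules cache

    return [field for field in re.split(r"[,;\s]+", text) if field]


if __name__ == "__main__":
    print(split_fields("a, b;c"))
```

The trade-off is a small lookup cost on each call and a less visible dependency list at the top of the file, which is why this is usually reserved for rarely used code paths.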

    @vstinner
    Member Author

    I'm closing the issue: making more imports lazy doesn't bring much benefit.

    --

    With the 3 changes, runpy now imports 55 modules, instead of 70.

    The startup time is 1.3x faster, 18 ms instead of 24 ms:

    Mean +- std dev: [before] 23.7 ms +- 0.4 ms -> [after] 17.8 ms +- 0.6 ms: 1.33x faster (-25%)

    --

    Avoiding itertools and types doesn't bring much benefit:

    Mean +- std dev: [after] 17.8 ms +- 0.6 ms -> [WIP] 17.2 ms +- 0.4 ms: 1.03x faster (-3%)
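The "x faster" and percentage figures in these benchmark lines follow directly from the two means. A small sketch of that arithmetic, using a hypothetical helper name:

```python
def speedup(before_ms, after_ms):
    """Return (ratio, percent_change) for two mean timings.

    ratio is before/after ("N.NNx faster"); percent_change is the
    relative change of the new mean against the old one.
    """
    ratio = before_ms / after_ms
    percent_change = (after_ms - before_ms) / before_ms * 100
    return ratio, percent_change


if __name__ == "__main__":
    for label, before, after in [
        ("before -> after", 23.7, 17.8),
        ("after -> WIP", 17.8, 17.2),
    ]:
        ratio, pct = speedup(before, after)
        print(f"{label}: {ratio:.2f}x faster ({pct:+.0f}%)")
```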

    @vstinner
    Member Author

    New changeset 4c18fc8 by Victor Stinner in branch 'master':
    bpo-41006: Document the runpy optimization (GH-20953)
    4c18fc8

    @vstinner
    Member Author

    New changeset 1bf7959 by Victor Stinner in branch 'master':
    bpo-41006: Remove init_sys_streams() hack (GH-20954)
    1bf7959

    @vstinner
    Member Author

    New changeset 2c2a4f3 by Victor Stinner in branch 'master':
    bpo-41006: What's New: less => fewer modules (GH-20955)
    2c2a4f3

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022