Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

abs_paths() in site.py is slow #73778

Closed
methane opened this issue Feb 17, 2017 · 9 comments
Closed

abs_paths() in site.py is slow #73778

methane opened this issue Feb 17, 2017 · 9 comments
Labels
3.7 (EOL) end of life performance Performance or resource usage stdlib Python modules in the Lib dir

Comments

@methane
Copy link
Member

methane commented Feb 17, 2017

BPO 29592
Nosy @freddrake, @brettcannon, @gpshead, @ncoghlan, @methane, @ericsnowcurrently
PRs
  • bpo-29592: site: skip abs_paths() when it's redundant #167
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2017-03-14.15:52:43.117>
    created_at = <Date 2017-02-17.16:19:42.180>
    labels = ['3.7', 'library', 'performance']
    title = 'abs_paths() in site.py is slow'
    updated_at = <Date 2017-03-24.22:19:52.910>
    user = 'https://github.com/methane'

    bugs.python.org fields:

    activity = <Date 2017-03-24.22:19:52.910>
    actor = 'methane'
    assignee = 'none'
    closed = True
    closed_date = <Date 2017-03-14.15:52:43.117>
    closer = 'methane'
    components = ['Library (Lib)']
    creation = <Date 2017-02-17.16:19:42.180>
    creator = 'methane'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 29592
    keywords = []
    message_count = 9.0
    messages = ['288019', '288049', '288052', '288053', '288113', '288128', '288130', '288137', '290185']
    nosy_count = 6.0
    nosy_names = ['fdrake', 'brett.cannon', 'gregory.p.smith', 'ncoghlan', 'methane', 'eric.snow']
    pr_nums = ['167']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'performance'
    url = 'https://bugs.python.org/issue29592'
    versions = ['Python 3.7']

    @methane
    Copy link
    Member Author

    methane commented Feb 17, 2017

    abs_paths() is the another most slow parts of site module.
    We can speedup it if we have C implementation of os.path.abspath() or abs_paths().

    But before trying it, is abs_paths() really needed for now?
    It only rewrite __file__ and __cached__ of modules imported before site is imported.
    What is difference between modules loaded befere / after site module?

    Here is profile output. sysconfig dependency is removed by bpo-29585 patch, and pstat is modified
    to show time in ms instead of sec.

    Ordered by: cumulative time

    ncalls tottime percall cumtime percall filename:lineno(function)
    3/1 0.004 0.001 2.525 2.525 {built-in method builtins.exec}
    1 0.003 0.003 2.524 2.524 x.py:1(<module>)
    1 0.010 0.010 1.806 1.806 site.py:555(main)
    4/3 0.022 0.005 1.179 0.393 <frozen importlib._bootstrap>:958(_find_and_load)
    4/3 0.017 0.004 1.110 0.370 <frozen importlib._bootstrap>:931(_find_and_load_unlocked)
    1 0.195 0.195 0.928 0.928 site.py:99(abs_paths)
    108 0.098 0.001 0.776 0.007 posixpath.py:367(abspath)
    4 0.035 0.009 0.647 0.162 <frozen importlib._bootstrap>:861(_find_spec)
    4 0.005 0.001 0.589 0.147 <frozen importlib._bootstrap_external>:1150(find_spec)
    4 0.043 0.011 0.584 0.146 <frozen importlib._bootstrap_external>:1118(_get_spec)
    2/1 0.012 0.006 0.557 0.557 <frozen importlib._bootstrap>:641(_load_unlocked)
    2/1 0.006 0.003 0.511 0.511 <frozen importlib._bootstrap_external>:673(exec_module)
    108 0.461 0.004 0.493 0.005 posixpath.py:329(normpath)
    16 0.150 0.009 0.453 0.028 <frozen importlib._bootstrap_external>:1234(find_spec)

    @methane methane added 3.7 (EOL) end of life stdlib Python modules in the Lib dir performance Performance or resource usage labels Feb 17, 2017
    @methane
    Copy link
    Member Author

    methane commented Feb 18, 2017

    path normalization was added by
    38cb9f1

    @methane
    Copy link
    Member Author

    methane commented Feb 18, 2017

    https://github.com/python/cpython/blob/master/Lib/importlib/_bootstrap_external.py#L1089-L1095

    While '' in sys.path, it is converted to getcwd() before calling PathHook.
    Is there any chance relative / not normalized path is in sys.path before site is loaded?

    @methane
    Copy link
    Member Author

    methane commented Feb 18, 2017

    $ python2 -S
    Python 2.7.12+ (default, Sep 17 2016, 12:08:02) 
    [GCC 6.2.0 20160914] on linux2
    >>> import x
    >>> x.__file__
    'x.py'
    
    $ python3 -S
    Python 3.6.0 (default, Dec 30 2016, 20:49:54) 
    [GCC 6.2.0 20161005] on linux
    >>> import x
    >>> x.__file__
    '/home/inada-n/x.py'

    I think we can remove abs_paths() in site.py, thanks to _frozen_importlib_external.

    I added all import experts to nosy list.
    Please give me advice.

    @ncoghlan
    Copy link
    Contributor

    Aye, I agree it should be redundant now - import system should always be making __file__ and __cached__ absolutely at import time these days.

    So +1 for dropping this from 3.7 and getting a bit of startup time back.

    @ncoghlan
    Copy link
    Contributor

    CI failure indicating it isn't redundant, but could still potentially be made faster since non-absolute paths should be relatively rare now:

    ======================================================================

    FAIL: test_s_option (test.test_site.HelperFunctionsTests)

    ----------------------------------------------------------------------

    Traceback (most recent call last):

    File "/home/travis/build/python/cpython/Lib/test/test_site.py", line 173, in test_s_option

        self.assertIn(usersite, sys.path)

    AssertionError: '/home/travis/.local/lib/python3.7/site-packages' not found in ['', '/usr/local/lib/python37.zip', '/home/travis/build/python/cpython/Lib', '/home/travis/build/python/cpython/build/lib.linux-x86_64-3.7-pydebug']

    ======================================================================

    FAIL: test_abs_paths (test.test_site.ImportSideEffectTests)

    ----------------------------------------------------------------------

    Traceback (most recent call last):

    File "/home/travis/build/python/cpython/Lib/test/test_site.py", line 365, in test_abs_paths

        .format(os__file__.decode('ascii')))

    AssertionError: False is not true : expected absolute path, got ../../Lib/os.py

    ----------------------------------------------------------------------

    @ncoghlan
    Copy link
    Contributor

    Note that "os" doesn't get imported normally, it gets injected as part of the importlib bootstrapping process (since _bootstrap_external.py needs the os module in order to work).

    @methane
    Copy link
    Member Author

    methane commented Feb 19, 2017

    I got it.
    removeduppaths() may change relpath in sys.path to absolute path.
    abs_paths() changes __file__ and __cached__ for consistency with the changed sys.path.

    I updated PR 167 to call abs_paths() only if removeduppaths() modified sys.path.
    Strictly speaking, abs_paths() is required only when removeduppaths() converted relpath to absolute path.

    But because duplicated paths are rare too, I think this approach is practical enough.

    @methane methane closed this as completed Mar 14, 2017
    @methane
    Copy link
    Member Author

    methane commented Mar 24, 2017

    New changeset 2e4e011 by INADA Naoki in branch 'master':
    bpo-29592: site: skip abs_paths() when it's redundant (GH-167)
    2e4e011

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 (EOL) end of life performance Performance or resource usage stdlib Python modules in the Lib dir
    Projects
    None yet
    Development

    No branches or pull requests

    2 participants