
classification
Title: stat cache for import bootstrap
Type: performance Stage: resolved
Components: Versions: Python 3.5
process
Status: closed Resolution: rejected
Dependencies: Superseder:
Assigned To: Nosy List: Arfrever, barry, brett.cannon, christian.heimes, eric.snow, pitrou, r.david.murray, vstinner
Priority: normal Keywords: patch

Created on 2013-10-10 11:20 by christian.heimes, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name Uploaded Description
import_stat_cache.patch christian.heimes, 2013-10-10 11:20
Messages (17)
msg199378 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2013-10-10 11:20
The import library makes an excessive number of stat() calls. I've implemented a simple cache for the bootstrap module that reduces the number of stat() calls by almost 1/3 (236 -> 159 on Linux).
msg199385 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2013-10-10 13:34
See also #14604.
msg199387 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-10-10 14:07
Benchmarks?
msg199400 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2013-10-10 16:50
A cursory look at the patch suggests that the cache use is permanent, so any dynamic changes to a file or directory after the initial caching will not be picked up. Did you run the test suite with this patch? It should have failed.
msg199401 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2013-10-10 17:54
Is the content of the bootstrap module used after the interpreter is bootstrapped? I see ... that's a problem. It's a proof of concept anyway and the speedup is minimal. On my computer with an SSD the speedup is barely measurable. I'd like to see if it makes a difference on a Raspberry Pi or on NFS shares.

I have another idea, too. Could we add an optional 'stat' argument to __init__() of FileLoader and ExtensionFileLoader so we can pass the stat object around and reuse it for loading?
msg199410 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2013-10-10 20:34
importlib/_bootstrap.py is importlib, period, so there is no separation of what is used to start Python and what is used after interpreter startup is completed.

As for adding a 'stat' argument to the loaders, it's possible, but as always it comes down to whether it will break someone or not. Since loaders do not necessarily execute immediately, you are running the risk of a very stale cached stat object. Plus Eric Snow has his PEP, which touches the API in terms of the loader __init__ signature, so you would want to look into that.
msg199424 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2013-10-10 22:11
With ModuleSpec (PEP 451), the finder creates the spec object (where it stores the loader).  At that point the finder is free to store any stat object you like in spec.loader_state.  The spec is made available to the loader during exec (if the loader supports it, which the importlib loaders will).  So there is no need to add anything to any loader __init__.

The only catch is the slim possibility that the stat object will be stale by the time it gets used.  I seem to remember a case where something like this happened (related to distros building their system Python or something).
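
A minimal sketch of that idea, assuming a hypothetical finder class `StatRecordingFinder` (not part of importlib). It only shows the finder stashing the stat result in `spec.loader_state`; the stock `SourceFileLoader` used here does not read that state and still does its own stat.

```python
import importlib.machinery
import importlib.util
import os
import tempfile


class StatRecordingFinder:
    """Hypothetical finder that keeps the stat result it already paid for."""

    def __init__(self, directory):
        self.directory = directory

    def find_spec(self, fullname, path=None, target=None):
        candidate = os.path.join(self.directory, fullname + ".py")
        try:
            st = os.stat(candidate)
        except OSError:
            return None
        loader = importlib.machinery.SourceFileLoader(fullname, candidate)
        spec = importlib.util.spec_from_file_location(fullname, candidate, loader=loader)
        # Store the stat result on the spec so a cooperating loader could reuse it.
        spec.loader_state = st
        return spec


with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "demo_mod.py"), "w") as f:
        f.write("VALUE = 42\n")
    spec = StatRecordingFinder(tmp).find_spec("demo_mod")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    recorded = spec.loader_state
```

A loader written to cooperate with such a finder could consult `module.__spec__.loader_state` during exec instead of calling stat again, at the cost of the staleness risk described above.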
msg199430 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2013-10-11 00:53
For interpreter startup, stats are not involved for builtin and frozen modules[1].  They are tied to imports that involve traversing sys.path (a.k.a. PathFinder).  Most stats happen in FileFinder.find_loader.  The remainder are for source (.py) files (a.k.a. SourceFileLoader).

Here's a rough sketch of what typically happens currently during the import of a path-based module[2], as related to stats (and other FS access):

(lines with FS access start with *)

def load_module(fullname):
    suffixes = ['.cpython-34m.so', '.abi3.so', '.so', '.py', '.pyc']
    tailname = fullname.rpartition('.')[2]
    for entry in sys.path:
*       mtime = os.stat(entry).st_mtime
        if mtime != cached_mtime:
*           cached_listdir = os.listdir(entry)
        if tailname in cached_listdir:
            basename = entry/tailname
*           if os.stat(basename).st_mode implies directory:  # superfluous?
                # package?
                for suffix in suffixes:
                    full_path = basename + suffix
*                   if os.stat(full_path).st_mode implies file:
                        if is_extension:
*                           <dlopen>(full_path)
                        elif is_sourceless:
*                           open(full_path).read()
                        else:
                            load_from_source(full_path)
                        return
        # ...non-package module?
        for suffix in suffixes:
            full_path = entry/tailname + suffix
            if tailname + suffix in cached_listdir:
*               if os.stat(full_path).st_mode implies file:  # superfluous?
                    if is_extension:
*                       <dlopen>(full_path)
                    elif is_sourceless:
*                       open(full_path).read()
                    else:
                        load_from_source(full_path)

def load_from_source(sourcepath):
*   st = os.stat(sourcepath)
    if st:
*       open(bytecodepath).read()
    else:
*       open(sourcepath).read()
*       os.stat(sourcepath).st_mode
        for parent in ancestor_dirs(sourcepath):
*           os.stat(parent).st_mode  ->  missing_parents
        for parent in missing_parents:
*           os.mkdir(parent)
*       open(tempname).write()
*       os.replace(tempname, bytecodepath)


Obviously there are some unix-isms in there.  Windows ends up not that different though.
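
The directory-listing part of the sketch (one stat per path entry, plus a listdir only when the mtime changed) can be written as runnable code roughly as follows. `DirListingCache` is a hypothetical name and this is a simplified illustration, not the FileFinder implementation.

```python
import os
import tempfile


class DirListingCache:
    """Cache os.listdir() for one directory, refreshed when its mtime changes."""

    def __init__(self, directory):
        self.directory = directory
        self._mtime = None
        self._names = frozenset()
        self.listdir_calls = 0   # counter kept only for the demonstration

    def names(self):
        try:
            mtime = os.stat(self.directory).st_mtime_ns   # 1 stat per lookup
        except OSError:
            self._mtime, self._names = None, frozenset()
            return self._names
        if mtime != self._mtime:                          # cache is stale
            self._names = frozenset(os.listdir(self.directory))
            self.listdir_calls += 1
            self._mtime = mtime
        return self._names


with tempfile.TemporaryDirectory() as tmp:
    cache = DirListingCache(tmp)
    cache.names()
    cache.names()
    calls_while_unchanged = cache.listdir_calls   # directory untouched in between

    with open(os.path.join(tmp, "new_mod.py"), "w"):
        pass
    # Push the directory mtime forward explicitly so the demo does not depend
    # on the filesystem's timestamp granularity.
    st = os.stat(tmp)
    os.utime(tmp, ns=(st.st_atime_ns, st.st_mtime_ns + 10 * 10**9))
    sees_new_file = "new_mod.py" in cache.names()
    calls_after_change = cache.listdir_calls
```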


stat/FS count
-------------

load_module (*per path entry*):
    (add 1 listdir to each if the cache is stale)
    not found: 1 stat
    non-package dir: 7 (num_suffixes + 2 stats)

    package (best): 4/5-9+ (3 stats, 1 read or load_from_source)
    package (worst): 8/9-13+ (num_suffixes + 2 stats, 1 read or load_from_source)
    non-package module (best): 3/4-8+ (2 stats, 1 read or load_from_source)
    non-package module (worst): 7/8-12+ (num_suffixes + 1 stats, 1 read or load_from_source)
    non-package module + dir (best): 10/11-15+ (num_suffixes + 4 stats, 1 read or load_from_source)
    non-package module + dir (worst): 14/15-19+ (num_suffixes * 2 + 3 stats, 1 read or load_from_source)

load_from_source:
    cached: 2 (1 stat, 1 read)
    uncached, no parents: 4 (2 stats, 1 write, 1 replace)
    uncached, no missing parents: 5+ (num_parents + 2 stats, 1 write, 1 replace)
    uncached, missing parents: 6+ (num_parents + 2 stats, num_missing mkdirs, 1 write, 1 replace)
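
As a quick check of the arithmetic, the first number in each load_module row follows from the formulas in parentheses with num_suffixes = 5. The function name `load_module_counts` is made up for this sketch, and the larger "module + dir" row is taken to be the worst case.

```python
def load_module_counts(num_suffixes=5):
    """FS accesses per path entry, counting a sourceless load as 1 read."""
    read = 1
    return {
        "not found": 1,
        "non-package dir": num_suffixes + 2,
        "package (best)": 3 + read,
        "package (worst)": (num_suffixes + 2) + read,
        "non-package module (best)": 2 + read,
        "non-package module (worst)": (num_suffixes + 1) + read,
        "non-package module + dir (best)": (num_suffixes + 4) + read,
        "non-package module + dir (worst)": (num_suffixes * 2 + 3) + read,
    }


counts = load_module_counts()
```

The second number in each row replaces the single read with the cost of load_from_source, which is why those figures are given as ranges.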


Highlights:

* the common case is not fast (for the sake of the slight possibility that files may change between imports)--not as much an issue during interpreter startup.
* up to 5 different suffixes with a separate stat for each (with extension module suffixes tried first).
* the size and ordering of sys.path has a decided impact on # stats.
* if a module is cached, a lot less FS access happens.
* the more nested a module, the more FS accesses happen.
* namespace packages don't have much impact on performance.

Possible improvements:

* provide an internal mechanism to turn on/off caching all stats (don't worry about staleness) and maybe expose it via a context manager/API. (not unlike what Christian put in his patch.)
* at least do some temporally local caching where the risk of staleness is particularly small.
* Move .py ahead of extension modules (or just behind .cpython-34m.so)?
* non-packages are more common than packages (?) so look for those first (hard to make effective without breaking key import semantics).
* remove 2 possibly superfluous stats?


[1] Maybe we should freeze the stdlib. <0.5 wink>
[2] importing a module usually involves importing the module's parent and its parent and so forth.  Each of those incurs the same stat hits all over again (though usually packages have only 1 path entry to traverse).  The stdlib is pretty flat (particularly among modules involved during startup) so this is less of an issue for this ticket.
msg199431 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2013-10-11 01:02
So the 2 stat calls in the general case are superfluous; it's just a question of whether they make any performance difference. Turns out that, at least on my MacBook, there is no performance difference, so removing them is not worth the cost of breaking semantics: http://bugs.python.org/issue18810 .

As for completely turning off stat calls during interpreter startup, that would definitely buy us something, but the question is how much and how do we make it work reliably?
msg199436 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2013-10-11 01:46
I realized those two stats are not superfluous in the case that a directory name has a .py suffix or a file doesn't have any suffix.  However, I expect that's pretty uncommon.

Worst case, these cases cost 2 stats per path entry.  In practice they cost nothing due to the dir caching we already do.
msg199437 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2013-10-11 01:54
I forgot to mention that optimizing the default composition of sys.path (from site) could help speed things up, though it might already be optimized in that regard.

I also forgot to mention the idea of zipping up the stdlib.

Sorry for the sidetrack.  Now, back to the stat discussion...
msg200496 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-10-19 21:01
The real problem here is that the definition of "bootstrap" or "startup" is fuzzy. How do you decide when you stop caching?
The only workable approach IMO is to adopt a time-based heuristic, which I did in issue14067.
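
For illustration only, one possible shape of a time-based heuristic is a stat cache whose entries expire after a fixed age. This is a generic sketch with hypothetical names (`TimedStatCache`, `max_age`) and is not taken from the issue14067 patch.

```python
import os
import tempfile
import time


class TimedStatCache:
    """Cache stat results, but trust each entry only for max_age seconds."""

    def __init__(self, max_age=1.0, clock=time.monotonic):
        self.max_age = max_age
        self.clock = clock
        self._entries = {}   # path -> (timestamp, stat_result or None)

    def stat(self, path):
        now = self.clock()
        entry = self._entries.get(path)
        if entry is not None and now - entry[0] < self.max_age:
            return entry[1]              # recent enough: skip the syscall
        try:
            result = os.stat(path)
        except OSError:
            result = None
        self._entries[path] = (now, result)
        return result


# A fake clock keeps the demonstration deterministic.
fake_now = [0.0]
cache = TimedStatCache(max_age=1.0, clock=lambda: fake_now[0])

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "m.py")
    before = cache.stat(path)    # missing file: None is cached at t=0
    with open(path, "w"):
        pass
    stale = cache.stat(path)     # still t=0, so the cached None is returned
    fake_now[0] = 2.0
    fresh = cache.stat(path)     # entry expired, the file is found
```

The window during which a change can be missed is bounded by `max_age`, so no definition of when "startup" ends is needed.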
msg200498 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2013-10-19 21:12
Would it be feasible to have an explicit (but private?) flag in importlib indicating stat checking (or even all FS checking) should be disabled, defaulting to True?  runpy could set it to False after initializing importlib and then back to True when startup is done.

If that was useful for more than just startup, we could also add a contextmanager for it in importlib.util.
msg200500 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-10-19 21:14
> Would it be feasible to have an explicit (but private?) flag in
> importlib indicating stat checking (or even all FS checking) should be
> disabled, defaulting to True?  runpy could set it to False after
> initializing importlib and then back to True when startup is done.

I don't really understand the algorithm you're proposing. Also, have you
read what I've just posted?
msg200502 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2013-10-19 21:37
> I don't really understand the algorithm you're proposing.

In importlib._bootstrap:

We have some global like "_CHECK_STAT=True".  FileFinder would use it to decide on using stat checks or not.

In Python/pythonrun.c:

At the end of import_init(), we set importlib._bootstrap._CHECK_STAT to False.  Then at the end of _Py_InitializeEx_Private() we set it back to True.

(As an alternative, we could always not do stat checking for just the standard library)

> Also, have you read what I've just posted?

About the fuzziness of when startup is finished?  As implied above, I'd say at the end of Py_Initialize().
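
A rough Python-level sketch of such a flag and the context manager idea. Only the name `_CHECK_STAT` comes from the description above; `stat_checks_disabled` and `looks_like_file` are hypothetical helpers, not importlib APIs.

```python
import contextlib
import os
import stat
import tempfile

_CHECK_STAT = True   # module-level switch, as in the proposal


@contextlib.contextmanager
def stat_checks_disabled():
    """Temporarily trust cached directory listings instead of calling stat()."""
    global _CHECK_STAT
    previous = _CHECK_STAT
    _CHECK_STAT = False
    try:
        yield
    finally:
        _CHECK_STAT = previous


def looks_like_file(directory, name, cached_listing):
    if name not in cached_listing:
        return False
    if not _CHECK_STAT:
        return True   # no stat: a directory named like a module slips through
    try:
        mode = os.stat(os.path.join(directory, name)).st_mode
    except OSError:
        return False
    return stat.S_ISREG(mode)


with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "odd.py"))   # a directory with a .py suffix
    listing = frozenset(os.listdir(tmp))
    with_checks = looks_like_file(tmp, "odd.py", listing)
    with stat_checks_disabled():
        without_checks = looks_like_file(tmp, "odd.py", listing)
    restored = _CHECK_STAT
```

The example uses the uncommon case mentioned earlier in the thread (a directory whose name ends in .py) to show what is given up while the checks are off.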
msg200503 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-10-19 21:45
> > Also, have you read what I've just posted?
> 
> About the fuzziness of when startup is finished?  As implied above,
> I'd say at the end of Py_Initialize().

You only have imported a handful of modules by then. Real-world
applications will import many more afterwards.
Here's a little experiment (done with a system install of Python 2.7):

$ python -v -c pass 2>&1 | grep "^import" | wc -l
33
$ python -v `which hg` 2>&1 | grep "^import" | wc -l
117

Note that Mercurial has a lazy importer in order to improve startup
time, otherwise the number would be higher yet.
msg342550 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-05-15 03:13
The benefit of avoiding stat() calls does not seem to be obvious to everybody. Moreover, importlib now implements a "path cache". I am closing the issue.

The most efficient solution is to pack all your modules and the Python stdlib into a ZIP file: everything is done in memory, no more filesystem access.
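
A minimal sketch of importing from a ZIP archive, using only the standard zipfile module and the built-in zipimport support. The archive and module names (`bundle.zip`, `zipped_demo`) are made up for the example.

```python
import importlib
import os
import sys
import tempfile
import zipfile

with tempfile.TemporaryDirectory() as tmp:
    archive = os.path.join(tmp, "bundle.zip")
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr("zipped_demo.py", "ANSWER = 42\n")

    # Putting the archive itself on sys.path lets the import system
    # resolve modules from inside it.
    sys.path.insert(0, archive)
    try:
        importlib.invalidate_caches()
        zipped_demo = importlib.import_module("zipped_demo")
    finally:
        sys.path.remove(archive)

loader_name = type(zipped_demo.__loader__).__name__
```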
History
Date User Action Args
2022-04-11 14:57:51 admin set github: 63415
2019-05-15 03:13:54 vstinner set status: open -> closed; resolution: rejected; messages: + msg342550; stage: patch review -> resolved
2013-12-22 17:46:58 pitrou set versions: + Python 3.5, - Python 3.4
2013-10-19 21:45:34 pitrou set messages: + msg200503
2013-10-19 21:37:26 eric.snow set messages: + msg200502
2013-10-19 21:14:42 pitrou set messages: + msg200500
2013-10-19 21:12:26 eric.snow set messages: + msg200498
2013-10-19 21:02:55 Arfrever set nosy: + Arfrever
2013-10-19 21:01:06 pitrou set messages: + msg200496
2013-10-18 14:23:16 brett.cannon set assignee: brett.cannon ->
2013-10-11 01:54:58 eric.snow set messages: + msg199437
2013-10-11 01:46:32 eric.snow set messages: + msg199436
2013-10-11 01:02:39 brett.cannon set messages: + msg199431
2013-10-11 00:53:46 eric.snow set messages: + msg199430
2013-10-10 22:11:24 eric.snow set nosy: + eric.snow; messages: + msg199424
2013-10-10 20:34:10 brett.cannon set messages: + msg199410
2013-10-10 17:54:04 christian.heimes set messages: + msg199401
2013-10-10 16:50:31 brett.cannon set messages: + msg199400
2013-10-10 14:07:25 pitrou set nosy: + pitrou; messages: + msg199387
2013-10-10 13:34:50 vstinner set nosy: + vstinner; messages: + msg199385
2013-10-10 13:05:42 barry set nosy: + barry
2013-10-10 11:49:01 r.david.murray set nosy: + r.david.murray
2013-10-10 11:20:50 christian.heimes create