classification
Title: Stop doing stat calls in importlib.machinery.FileFinder to see if something is a file or folder
Type: performance Stage: resolved
Components: Library (Lib) Versions: Python 3.4
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: brett.cannon Nosy List: Arfrever, brett.cannon, eric.snow, python-dev
Priority: normal Keywords: patch

Created on 2013-08-22 14:49 by brett.cannon, last changed 2013-11-01 14:40 by brett.cannon. This issue is now closed.

Files
File name Uploaded Description
less_stats.diff brett.cannon, 2013-08-29 17:58
Messages (11)
msg195900 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2013-08-22 14:49
If the check were based simply on the format of what was being searched for (e.g. just assume it's a file if "module.py" exists in the directory), then a couple of stat calls per search could be saved.

If that is deemed too dangerous due to backwards-compatibility, at least extract an API so people can skip the stat calls if they know they are not going to do something as silly as have something named module.py that is not a file.
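As a rough illustration of the idea in this message (a sketch only, not the code in importlib): the function below decides whether a module file is present from a single directory listing, trusting the file name instead of calling stat on each candidate. The name `guess_module_file` and the suffix tuple are made up for this example.

```python
import os

# Hypothetical suffix list for illustration; importlib uses its own.
SUFFIXES = (".py", ".pyc")


def guess_module_file(directory, name, suffixes=SUFFIXES):
    """Return the path of the first candidate whose name is in the directory.

    Only one os.listdir() call is made. No os.stat()/isfile() check is done,
    so a directory named "module.py" would be wrongly reported as a file.
    """
    entries = set(os.listdir(directory))
    for suffix in suffixes:
        candidate = name + suffix
        if candidate in entries:
            return os.path.join(directory, candidate)
    return None
```

The trade-off is the one described above: fewer stat calls, at the cost of misreporting anything named module.py that is not a regular file.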
msg196018 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2013-08-23 20:06
Isn't this related somewhat to #7732?
msg196044 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2013-08-23 21:58
I did a quick check, and stripping out the two stat calls for a directory or module file (the check for a package's __init__.py was left in) didn't make a significant difference.
msg196045 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2013-08-23 21:59
I don't think it's related to the test_imp bug, but since it was never fully diagnosed I couldn't tell you.
msg196473 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2013-08-29 17:58
I also just tried using os.listdir() on a package before searching for __init__ and it didn't speed anything up either (timeit actually suggests that os.listdir() is slower than an isdir check plus 3 isfile checks). Caching directory contents at the class level didn't help either; the idea was to share the result of an os.listdir() done for an __init__ so that it wouldn't have to be done a second time while the cache was still fresh.

I have attached the patch with all of the changes to cut the number of stat calls down significantly, but benchmarking on my work machine shows no definitive benefit.
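A comparison along these lines can be reproduced with timeit. This is a minimal sketch, not the benchmark actually used; the function name `time_checks`, the temporary package layout, and the three suffixes are assumptions for illustration, and results will vary by platform and filesystem.

```python
import os
import tempfile
import timeit


def time_checks(number=1000):
    """Time one os.listdir() against an isdir plus three isfile checks."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = os.path.join(tmp, "pkg")
        os.mkdir(pkg)
        with open(os.path.join(pkg, "__init__.py"), "w"):
            pass
        # Hypothetical candidate names; the real suffix list differs.
        candidates = [
            os.path.join(pkg, "__init__" + suffix)
            for suffix in (".so", ".py", ".pyc")
        ]

        def with_listdir():
            return os.listdir(pkg)

        def with_stats():
            return os.path.isdir(pkg), [os.path.isfile(c) for c in candidates]

        return (
            timeit.timeit(with_listdir, number=number),
            timeit.timeit(with_stats, number=number),
        )


if __name__ == "__main__":
    listdir_time, stat_time = time_checks()
    print(f"os.listdir():       {listdir_time:.4f}s")
    print(f"isdir + 3 * isfile: {stat_time:.4f}s")
```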
msg200289 - (view) Author: Roundup Robot (python-dev) Date: 2013-10-18 17:24
New changeset 11f2f4af1979 by Brett Cannon in branch 'default':
Issue #18810: Be optimistic with stat calls when seeing if a directory
http://hg.python.org/cpython/rev/11f2f4af1979
msg200291 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2013-10-18 17:26
The directory savings have actually been handled without semantic changes in the last commit for this issue.

The possibility of leaning on file extensions has been raised on python-dev. Once that is resolved, this issue will either get another commit or simply be closed.
msg200292 - (view) Author: Roundup Robot (python-dev) Date: 2013-10-18 17:29
New changeset 9895a9c20e8a by Brett Cannon in branch 'default':
Add NEWS entry for issue #18810
http://hg.python.org/cpython/rev/9895a9c20e8a
msg200419 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2013-10-19 13:59
Nick pointed out that the change to FileFinder is possibly going to break subclasses since the os module code doesn't treat '' as the cwd (which is unfortunate) and so people may have been relying on that never being the case. Really shouldn't matter all that much to the typical import though since PathFinder will still pass the full directory down to FileFinder.
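The behaviour of the os module with an empty string can be checked directly. The helper below is a small sketch (the name `empty_path_behaviour` is invented here) showing that '' is not treated as the current directory by os.path.isdir() or os.stat().

```python
import os


def empty_path_behaviour():
    """Show how os functions treat '' compared with the current directory."""
    results = {
        "isdir('')": os.path.isdir(""),
        "isdir(os.getcwd())": os.path.isdir(os.getcwd()),
    }
    try:
        os.stat("")
    except OSError:
        # An empty path is rejected rather than resolved to the cwd.
        results["stat('') raises OSError"] = True
    else:
        results["stat('') raises OSError"] = False
    return results
```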
msg201904 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2013-11-01 14:39
Wrong issue
msg201905 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2013-11-01 14:40
I'm going to go ahead and close this. I think the optimistic change is the only one worth making since it's backwards-compatible.
History
Date User Action Args
2013-11-01 14:40:20 brett.cannon set status: open -> closed
resolution: fixed
stage: test needed -> resolved
2013-11-01 14:40:08 brett.cannon set messages: + msg201905
2013-11-01 14:39:36 brett.cannon set dependencies: - Restore empty string special casing in importlib.machinery.FileFinder
messages: + msg201904
2013-11-01 14:20:32 brett.cannon set dependencies: + Restore empty string special casing in importlib.machinery.FileFinder
2013-10-19 18:09:36 Arfrever set nosy: + Arfrever
2013-10-19 13:59:48 brett.cannon set messages: + msg200419
2013-10-18 17:29:32 python-dev set messages: + msg200292
2013-10-18 17:26:03 brett.cannon set messages: + msg200291
2013-10-18 17:24:43 python-dev set nosy: + python-dev
messages: + msg200289
2013-10-18 14:23:34 brett.cannon set assignee: brett.cannon
2013-08-29 17:58:07 brett.cannon set files: + less_stats.diff
keywords: + patch
messages: + msg196473
2013-08-23 21:59:12 brett.cannon set messages: + msg196045
2013-08-23 21:58:49 brett.cannon set messages: + msg196044
2013-08-23 20:06:18 eric.snow set nosy: + eric.snow
messages: + msg196018
2013-08-22 14:49:46 brett.cannon create