Issue11051
This issue tracker has been migrated to GitHub,
and is currently read-only.
For more information,
see the GitHub FAQs in the Python's Developer Guide.
Created on 2011-01-28 22:49 by pitrou, last changed 2022-04-11 14:57 by admin. This issue is now closed.
Files | ||||
---|---|---|---|---|
File name | Uploaded | Description | Edit | |
import_stat.diff | neologix, 2011-12-05 18:32 | review |
Messages (17) | |||
---|---|---|---|
msg127359 - (view) | Author: Antoine Pitrou (pitrou) * ![]() |
Date: 2011-01-28 22:49 | |
Here are the system calls when importing a single pure Python module: stat("/home/antoine/py3k/py3k/Lib/copyreg", 0x7fff1ed1f740) = -1 ENOENT (No such file or directory) open("/home/antoine/py3k/py3k/Lib/copyreg.cpython-32m.so", O_RDONLY) = -1 ENOENT (No such file or directory) open("/home/antoine/py3k/py3k/Lib/copyregmodule.cpython-32m.so", O_RDONLY) = -1 ENOENT (No such file or directory) open("/home/antoine/py3k/py3k/Lib/copyreg.abi3.so", O_RDONLY) = -1 ENOENT (No such file or directory) open("/home/antoine/py3k/py3k/Lib/copyregmodule.abi3.so", O_RDONLY) = -1 ENOENT (No such file or directory) open("/home/antoine/py3k/py3k/Lib/copyreg.so", O_RDONLY) = -1 ENOENT (No such file or directory) open("/home/antoine/py3k/py3k/Lib/copyregmodule.so", O_RDONLY) = -1 ENOENT (No such file or directory) open("/home/antoine/py3k/py3k/Lib/copyreg.py", O_RDONLY) = 5 fstat(5, {st_mode=S_IFREG|0664, st_size=6633, ...}) = 0 open("/home/antoine/py3k/py3k/Lib/__pycache__/copyreg.cpython-32.pyc", O_RDONLY) = 6 fstat(6, {st_mode=S_IFREG|0664, st_size=5595, ...}) = 0 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f28e6381000 read(6, "l\f\r\n\0\24\212Hc\0\0\0\0\0\0\0\0\0\0\0\0\f\0\0\0@\0\0\0s\307\0"..., 4096) = 4096 fstat(6, {st_mode=S_IFREG|0664, st_size=5595, ...}) = 0 This is per possible directory, which means that it can be repeated for each dir on sys.path: stat("readline", 0x7fff1ed28760) = -1 ENOENT (No such file or directory) open("readline.cpython-32m.so", O_RDONLY) = -1 ENOENT (No such file or directory) open("readlinemodule.cpython-32m.so", O_RDONLY) = -1 ENOENT (No such file or directory) open("readline.abi3.so", O_RDONLY) = -1 ENOENT (No such file or directory) open("readlinemodule.abi3.so", O_RDONLY) = -1 ENOENT (No such file or directory) open("readline.so", O_RDONLY) = -1 ENOENT (No such file or directory) open("readlinemodule.so", O_RDONLY) = -1 ENOENT (No such file or directory) open("readline.py", O_RDONLY) = -1 ENOENT (No such file or directory) open("readline.pyc", O_RDONLY) = -1 ENOENT (No such file or directory) stat("/home/antoine/py3k/py3k/Lib/readline", 0x7fff1ed28760) = -1 ENOENT (No such file or directory) open("/home/antoine/py3k/py3k/Lib/readline.cpython-32m.so", O_RDONLY) = -1 ENOENT (No such file or directory) open("/home/antoine/py3k/py3k/Lib/readlinemodule.cpython-32m.so", O_RDONLY) = -1 ENOENT (No such file or directory) open("/home/antoine/py3k/py3k/Lib/readline.abi3.so", O_RDONLY) = -1 ENOENT (No such file or directory) open("/home/antoine/py3k/py3k/Lib/readlinemodule.abi3.so", O_RDONLY) = -1 ENOENT (No such file or directory) open("/home/antoine/py3k/py3k/Lib/readline.so", O_RDONLY) = -1 ENOENT (No such file or directory) open("/home/antoine/py3k/py3k/Lib/readlinemodule.so", O_RDONLY) = -1 ENOENT (No such file or directory) open("/home/antoine/py3k/py3k/Lib/readline.py", O_RDONLY) = -1 ENOENT (No such file or directory) open("/home/antoine/py3k/py3k/Lib/readline.pyc", O_RDONLY) = -1 ENOENT (No such file or directory) stat("/home/antoine/py3k/py3k/Lib/plat-linux2", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0 stat("/home/antoine/py3k/py3k/Lib/plat-linux2", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0 stat("/home/antoine/py3k/py3k/Lib/plat-linux2/readline", 0x7fff1ed28760) = -1 ENOENT (No such file or directory) open("/home/antoine/py3k/py3k/Lib/plat-linux2/readline.cpython-32m.so", O_RDONLY) = -1 ENOENT (No such file or directory) open("/home/antoine/py3k/py3k/Lib/plat-linux2/readlinemodule.cpython-32m.so", O_RDONLY) = -1 ENOENT (No such file or directory) open("/home/antoine/py3k/py3k/Lib/plat-linux2/readline.abi3.so", O_RDONLY) = -1 ENOENT (No such file or directory) open("/home/antoine/py3k/py3k/Lib/plat-linux2/readlinemodule.abi3.so", O_RDONLY) = -1 ENOENT (No such file or directory) open("/home/antoine/py3k/py3k/Lib/plat-linux2/readline.so", O_RDONLY) = -1 ENOENT (No such file or directory) open("/home/antoine/py3k/py3k/Lib/plat-linux2/readlinemodule.so", O_RDONLY) = -1 ENOENT (No such file or directory) open("/home/antoine/py3k/py3k/Lib/plat-linux2/readline.py", O_RDONLY) = -1 ENOENT (No such file or directory) open("/home/antoine/py3k/py3k/Lib/plat-linux2/readline.pyc", O_RDONLY) = -1 ENOENT (No such file or directory) stat("/home/antoine/py3k/py3k/build/lib.linux-x86_64-3.2", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0 stat("/home/antoine/py3k/py3k/build/lib.linux-x86_64-3.2", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0 stat("/home/antoine/py3k/py3k/build/lib.linux-x86_64-3.2/readline", 0x7fff1ed28760) = -1 ENOENT (No such file or directory) open("/home/antoine/py3k/py3k/build/lib.linux-x86_64-3.2/readline.cpython-32m.so", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0775, st_size=59970, ...}) = 0 futex(0x7f28e605e0ec, FUTEX_WAKE_PRIVATE, 2147483647) = 0 open("/home/antoine/py3k/py3k/build/lib.linux-x86_64-3.2/readline.cpython-32m.so", O_RDONLY) = 4 Something has gone out of control here. Why do we need to check so many alternative locations? |
|||
msg127522 - (view) | Author: Martin v. Löwis (loewis) * ![]() |
Date: 2011-01-30 09:19 | |
> Something has gone out of control here. Why do we need to check so many alternative locations? What change do you propose? |
|||
msg127523 - (view) | Author: Antoine Pitrou (pitrou) * ![]() |
Date: 2011-01-30 10:56 | |
> > Something has gone out of control here. Why do we need to check so many alternative locations? > > What change do you propose? First, I don't understand why we need to check both "foo.so" and "foomodule.so". Second, I don't understand why we need to check both "foo.cpython32.so" and "foo.so". If you fix both these, the number of stat() calls per would-be extension module goes down from 6 to 2 per directory path. Then there seems to be a couple of redundant stat/fstat calls on some files. |
|||
msg148884 - (view) | Author: Charles-François Natali (neologix) * ![]() |
Date: 2011-12-05 18:32 | |
Here's a trivial patch reducing the number of calls to open. before: """ $ strace -c -e open ./python -c "" % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 100.00 0.000049 0 392 306 open ------ ----------- ----------- --------- --------- ---------------- 100.00 0.000049 392 306 total """ after: """ $ strace -c -e open ./python -c "" % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 100.00 0.000024 0 86 open ------ ----------- ----------- --------- --------- ---------------- 100.00 0.000024 86 total """ As for the flury of tentative locations, I don't feel like modifying this since I'm not familiar enough with the import machinery. |
|||
msg148890 - (view) | Author: Martin v. Löwis (loewis) * ![]() |
Date: 2011-12-05 21:42 | |
> First, I don't understand why we need to check both "foo.so" and > "foomodule.so". Because we always did, i.e. changing it now may break backwards compatibility. Now, as for why we always had *module.so also: it may be that calling an extension module foo.so might not be feasible if it needs to link with a system library called foo.so (rather than libfoo.so). I don't know how serious this problem is in practice - we may consider deprecating-then-removing *module.so from the import machinery. As for checking both "foo.so" and "foo.cpython-32m.so": that's for backwards compatibility also. Existing build processes may produce foo.so, as they did in previous Python versions (in particular, if they are not based on distutils, but, say, make). Unfortunately, PEP 3149 did not specify a deprecation scheduling. You could ask Barry - I suppose that was an oversight, and not actually deliberate. Of course, with .cpython-32m.so being present, removing *module.so would be easy enough, except again for breaking existing projects which might use that name. |
|||
msg148895 - (view) | Author: Antoine Pitrou (pitrou) * ![]() |
Date: 2011-12-06 00:33 | |
> Here's a trivial patch reducing the number of calls to open. Looks good to me. |
|||
msg148981 - (view) | Author: Roundup Robot (python-dev) ![]() |
Date: 2011-12-07 18:16 | |
New changeset a541bda2f5e2 by Charles-François Natali in branch 'default': Issue #11051: Reduce the number of syscalls per import. http://hg.python.org/cpython/rev/a541bda2f5e2 |
|||
msg148982 - (view) | Author: Martin v. Löwis (loewis) * ![]() |
Date: 2011-12-07 18:27 | |
I would have appreciated had you considered my review before pushing that change. |
|||
msg148983 - (view) | Author: Charles-François Natali (neologix) * ![]() |
Date: 2011-12-07 18:43 | |
> I would have appreciated had you considered my review before pushing > that change. Sorry, I didn't receive an email notification for your review, so I didn't know you had done one. I'll add a comment. |
|||
msg148998 - (view) | Author: Terry J. Reedy (terry.reedy) * ![]() |
Date: 2011-12-07 22:20 | |
I am a little puzzled by the patch. In logic, 'A and B' is equivalent to 'not A or not B'. But in the patch, - if (_Py_stat(filename, &statbuf) == 0 && /it exists */ - S_ISDIR(statbuf.st_mode)) /* it's a directory */ + if (_Py_stat(filename, &statbuf) != 0 || S_ISDIR(statbuf.st_mode)) you seem to change to 'not A or B', without negating B. Is this intentional? What am I missing? Some subtle effect of lazy evaluation? Or an intentional change in the logic? |
|||
msg148999 - (view) | Author: STINNER Victor (vstinner) * ![]() |
Date: 2011-12-07 22:23 | |
By the way, _Py_stat() can fail because of a Python error: -1 result is not handled in import.c :-( It may leak to the "XXX undetected error" message. |
|||
msg149000 - (view) | Author: Terry J. Reedy (terry.reedy) * ![]() |
Date: 2011-12-07 22:30 | |
OK, I gather from the comment added in the second patch that you intentionally changed the logic to restart the loop more often. |
|||
msg153642 - (view) | Author: Charles-François Natali (neologix) * ![]() |
Date: 2012-02-18 12:22 | |
As noted by Antoine somewhere else (don't remember if it was on the bug tracker or mailing list), most of the startup time is due to site import: """ cf@neobox:~/python/cpython$ time ./python -c '' [44249 refs] real 0m0.445s user 0m0.376s sys 0m0.032s cf@neobox:~/python/cpython$ time ./python -S -c '' [35211 refs] real 0m0.181s user 0m0.144s sys 0m0.016s """ And that's mostly user time, the stat() calls don't weight much here. Here's the output of a profiling: """ Ordered by: cumulative time ncalls tottime percall cumtime percall filename:lineno(function) 2/1 0.006 0.003 0.303 0.303 {built-in method exec} 1 0.004 0.004 0.303 0.303 test_startup.py:1(<module>) 1 0.000 0.000 0.267 0.267 site.py:509(main) 1 0.000 0.000 0.237 0.237 site.py:254(addusersitepackages) 1 0.000 0.000 0.236 0.236 site.py:231(getusersitepackages) 1 0.006 0.006 0.226 0.226 site.py:217(getuserbase) 1 0.011 0.011 0.214 0.214 sysconfig.py:1(<module>) 1 0.004 0.004 0.122 0.122 configparser.py:120(<module>) 30 0.006 0.000 0.120 0.004 {built-in method __build_class__} 9 0.000 0.000 0.116 0.013 re.py:204(compile) 9 0.002 0.000 0.116 0.013 functools.py:185(wrapper) 9 0.000 0.000 0.113 0.013 re.py:256(_compile) 9 0.001 0.000 0.113 0.013 sre_compile.py:488(compile) 1 0.001 0.001 0.096 0.096 configparser.py:555(RawConfigParser) """ On my (somewhat old) laptop, sysconfig import consumes 1/3 of the total startup time. |
|||
msg153693 - (view) | Author: STINNER Victor (vstinner) * ![]() |
Date: 2012-02-19 10:15 | |
I changed the title of the issue because what matters is the total startup time, not how it is improved :-) |
|||
msg153697 - (view) | Author: STINNER Victor (vstinner) * ![]() |
Date: 2012-02-19 10:35 | |
Using the following patch, "./python -s -c pass" is 2x faster than without the patch! diff --git a/Lib/site.py b/Lib/site.py --- a/Lib/site.py +++ b/Lib/site.py @@ -518,7 +518,8 @@ def main(): known_paths = removeduppaths() if ENABLE_USER_SITE is None: ENABLE_USER_SITE = check_enableusersite() - known_paths = addusersitepackages(known_paths) + if ENABLE_USER_SITE: + known_paths = addusersitepackages(known_paths) known_paths = addsitepackages(known_paths) if sys.platform == 'os2emx': setBEGINLIBPATH() -s with the patch doesn't import the sysconfig module. So if we want to speed up Python, the first step is to optimize it, at least the function getting the user site packages directory. |
|||
msg153736 - (view) | Author: STINNER Victor (vstinner) * ![]() |
Date: 2012-02-19 23:47 | |
I opened the issue #14057 for the sysconfig parser. |
|||
msg160876 - (view) | Author: Antoine Pitrou (pitrou) * ![]() |
Date: 2012-05-16 15:42 | |
This is out of date now that importlib is the default import system. |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:57:12 | admin | set | github: 55260 |
2012-07-22 10:21:18 | flox | set | nosy:
+ flox |
2012-05-16 15:42:09 | pitrou | set | status: open -> closed resolution: out of date messages: + msg160876 |
2012-02-19 23:47:36 | vstinner | set | messages: + msg153736 |
2012-02-19 10:35:35 | vstinner | set | messages: + msg153697 |
2012-02-19 10:15:35 | vstinner | set | messages:
+ msg153693 title: system calls per import -> Improve Python 3.3 startup time |
2012-02-19 08:33:29 | eric.araujo | set | nosy:
+ eric.araujo |
2012-02-18 12:22:58 | neologix | set | messages: + msg153642 |
2011-12-07 22:30:56 | terry.reedy | set | messages: + msg149000 |
2011-12-07 22:23:09 | vstinner | set | nosy:
+ vstinner messages: + msg148999 |
2011-12-07 22:20:28 | terry.reedy | set | nosy:
+ terry.reedy messages: + msg148998 |
2011-12-07 18:43:32 | neologix | set | messages: + msg148983 |
2011-12-07 18:27:09 | loewis | set | messages: + msg148982 |
2011-12-07 18:16:32 | python-dev | set | nosy:
+ python-dev messages: + msg148981 |
2011-12-07 01:45:35 | Arfrever | set | nosy:
+ Arfrever |
2011-12-06 00:33:42 | pitrou | set | messages: + msg148895 |
2011-12-05 21:42:50 | loewis | set | messages: + msg148890 |
2011-12-05 20:37:31 | eric.snow | set | nosy:
+ eric.snow |
2011-12-05 18:32:51 | neologix | set | files:
+ import_stat.diff nosy: + neologix messages: + msg148884 keywords: + patch |
2011-01-30 22:13:30 | giampaolo.rodola | set | nosy:
+ giampaolo.rodola |
2011-01-30 10:56:18 | pitrou | set | nosy:
loewis, barry, georg.brandl, rhettinger, pitrou, eric.smith, nadeem.vawda messages: + msg127523 |
2011-01-30 09:38:06 | georg.brandl | set | nosy:
+ georg.brandl |
2011-01-30 09:30:06 | rhettinger | set | nosy:
+ rhettinger |
2011-01-30 09:19:20 | loewis | set | nosy:
loewis, barry, pitrou, eric.smith, nadeem.vawda messages: + msg127522 |
2011-01-30 09:04:50 | nadeem.vawda | set | nosy:
+ nadeem.vawda |
2011-01-28 23:04:57 | eric.smith | set | nosy:
+ eric.smith |
2011-01-28 22:49:32 | pitrou | create |