msg199528 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2013-10-12 00:33 |
To speedup Python startup, I propose to reimplement the keyword module in C. Attached patch implements it.
* Remove Lib/keyword.py
* Add Tools/scripts/keyword.py: script to build Modules/keyword.h
* The new module is Modules/keyword.c
I chose to change the keyword.kwlist type from list to tuple. The documentation only says they that kwlist is a "Sequence":
http://docs.python.org/dev/library/keyword.html#keyword.kwlist
The keyword is now used by the collections module in the implementation of namedtuple.
The keyword module is trivial: it only contains a sequence of strings and a method checking if a string is part of this sequence. It would be simpler to add this to an existing module. I suppose that a new module was added for technical reasons.
|
msg199530 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2013-10-12 01:25 |
I fail to see a difference with my patch in the startup time, but I see that the number of syscalls is reduced by 10 (558 => 548).
|
msg199531 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2013-10-12 01:29 |
The patch adds 448 bytes to the stripped python executable file.
|
msg199542 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2013-10-12 09:20 |
IMO this should be rejected. Failure to improve startup time + more complicated maintenance.
|
msg199547 - (view) |
Author: Christian Heimes (christian.heimes) * |
Date: 2013-10-12 11:33 |
Here is a simpler patch that directly uses the grammar definition to create a list of keywords. It completely removes the necessity of a script.
|
msg199548 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2013-10-12 11:37 |
-1 again. We shouldn't gratuitously convert Python code to C code.
|
msg199549 - (view) |
Author: Georg Brandl (georg.brandl) * |
Date: 2013-10-12 11:43 |
Well, combined with the fact that it gets rid of a manual regeneration step (that is easy to forget, since adding a keyword is not done very often) I think it's a net gain.
The same could be done with the "token" module BTW.
|
msg199550 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2013-10-12 11:43 |
> Well, combined with the fact that it gets rid of a manual regeneration step (that is easy to forget, since adding a keyword is not done very often) I think it's a net gain.
If it needs to be automated it can be added to the Makefile...
|
msg199556 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2013-10-12 12:28 |
> If it needs to be automated it can be added to the Makefile...
I tested keyword_grammar.patch on a fresh Python source code (make distclean; ./configure --with-pydebug; make): I can compile Python. I don't understand what should be automated? This patch doesn't need to build a dependency.
keyword_grammar.patch needs probably something for Visual Studio (PC/config.c and PCbuild/pythoncore.vcxproj?).
|
msg199577 - (view) |
Author: Brett Cannon (brett.cannon) * |
Date: 2013-10-12 15:28 |
Is there any change in any benchmark?
|
msg201868 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2013-11-01 01:19 |
Original:
$ ./python -S -m timeit -s 'import sys' 'import keyword; keyword=None; del sys.modules["keyword"]'
10000 loops, best of 3: 149 usec per loop
Python patched with keyword_grammar.patch:
$ ./python -S -m timeit -s 'import sys' 'import keyword; keyword=None; del sys.modules["keyword"]'
10000 loops, best of 3: 20 usec per loop
The gain is 129 microseconds (import 7.4x faster). Python starts in between 8,850 and 13,800 microseconds on my PC:
$ ./python -S -m timeit -s 'import subprocess; args=[sys.executable, "-S", "-c", "pass"]' 'subprocess.call(args)'
100 loops, best of 3: 8.85 msec per loop
$ ./python -S -m timeit -s 'import subprocess; args=[sys.executable, "-c", "pass"]' 'subprocess.call(args)'
100 loops, best of 3: 13.8 msec per loop
|
msg201869 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2013-11-01 01:21 |
"IMO this should be rejected. Failure to improve startup time + more complicated maintenance."
What do you mean by "complicated maintenance"? The C module is short and doesn't use complex functions, I don't expect it to change.
|
msg206021 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2013-12-13 02:53 |
So what? Do you want faster Python startup?
|
msg206029 - (view) |
Author: Christian Heimes (christian.heimes) * |
Date: 2013-12-13 09:03 |
I still like the idea, too. It's more elegant, easier to maintain and gives a speedup, too.
|
msg206081 - (view) |
Author: Brett Cannon (brett.cannon) * |
Date: 2013-12-13 13:47 |
Two things. One, no one has shown a startup improvement using the official startup benchmarks (either normal or nosite). That needs to be established to show an actual benefit.
Second, PEP 399 requires the pure Python version stick around, else you need to request an exception to drop keyword.py from python-dev.
I think you have a chance of getting the exception if you can show an actual startup improvement.
|
msg206082 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2013-12-13 13:50 |
> Second, PEP 399 requires the pure Python version stick around, else you need to request an exception to drop keyword.py from python-dev.
I don't understand the purpose of providing a Python implementation of the keyword module. If you look at the current implementation, I don't know that anything other than CPython can use it. The code uses Python/graminit.c to regenerate the module.
But I don't want to fight if you don't care of minor speedup, so I'm closing the issue.
|
msg206084 - (view) |
Author: Brett Cannon (brett.cannon) * |
Date: 2013-12-13 13:53 |
You're right, the regeneration might be specific to CPython (although maybe it should work from Grammar/Grammar instead?), but that's just creating the Python file which we check into the stdlib, so any VM can use it, they just can't recreate it (which is fine since every other VM works from a CPython release).
|
|
Date |
User |
Action |
Args |
2022-04-11 14:57:51 | admin | set | github: 63429 |
2013-12-13 13:53:10 | brett.cannon | set | messages:
+ msg206084 |
2013-12-13 13:50:06 | vstinner | set | status: open -> closed resolution: rejected messages:
+ msg206082
|
2013-12-13 13:47:25 | brett.cannon | set | messages:
+ msg206081 |
2013-12-13 09:03:13 | christian.heimes | set | messages:
+ msg206029 |
2013-12-13 02:53:03 | vstinner | set | messages:
+ msg206021 |
2013-11-01 01:21:29 | vstinner | set | messages:
+ msg201869 |
2013-11-01 01:19:39 | vstinner | set | messages:
+ msg201868 |
2013-10-12 15:28:14 | brett.cannon | set | nosy:
+ brett.cannon messages:
+ msg199577
|
2013-10-12 12:28:43 | vstinner | set | messages:
+ msg199556 |
2013-10-12 11:43:47 | pitrou | set | messages:
+ msg199550 |
2013-10-12 11:43:10 | georg.brandl | set | nosy:
+ georg.brandl messages:
+ msg199549
|
2013-10-12 11:37:02 | pitrou | set | messages:
+ msg199548 |
2013-10-12 11:33:02 | christian.heimes | set | files:
+ keyword_grammar.patch
messages:
+ msg199547 components:
+ Extension Modules stage: patch review |
2013-10-12 09:20:45 | pitrou | set | messages:
+ msg199542 |
2013-10-12 01:29:24 | vstinner | set | messages:
+ msg199531 |
2013-10-12 01:25:17 | vstinner | set | messages:
+ msg199530 |
2013-10-12 00:33:11 | vstinner | create | |