Message78306
This patch implements what is usually called "threaded code" for the
ceval loop on compilers that support it (currently only gcc). The idea is
that each opcode ends with its own dispatch epilog instead of jumping back
to a single shared dispatch point, which allows the CPU to make much
better use of its branch prediction capabilities. The net result is a
15-20% average speedup on pybench and pystone, with higher speedups on
very tight loops (see below for the full pybench result chart).

The opcode jump table is generated by a separate script which is called
as part of the Makefile (just as the asdl building script already is).

On compilers other than gcc, performance will of course be unchanged.
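
To make the dispatch idea concrete, here is a minimal sketch of the technique on a toy stack machine. It is not the patch itself: the opcodes (OP_PUSH, OP_ADD, OP_MUL, OP_HALT), the run() function and the TARGET/DISPATCH macro names are all hypothetical, and the jump table is written by hand rather than generated by a script. It assumes gcc's "labels as values" extension and falls back to a plain switch elsewhere.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy instruction set, not CPython's real opcodes. */
enum { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };

#if defined(__GNUC__)
#define USE_COMPUTED_GOTOS 1   /* gcc extension: labels as values */
#else
#define USE_COMPUTED_GOTOS 0
#endif

/* Run a bytecode program and return the value left on top of the stack. */
long run(const unsigned char *code)
{
    long stack[16];
    size_t sp = 0;
    const unsigned char *pc = code;

#if USE_COMPUTED_GOTOS
    /* Jump table indexed by opcode number (hand-written in this sketch). */
    static void *const targets[] = {
        &&TARGET_OP_PUSH, &&TARGET_OP_ADD, &&TARGET_OP_MUL, &&TARGET_OP_HALT
    };
#define TARGET(op) TARGET_##op
    /* Each opcode body ends with its own indirect jump. */
#define DISPATCH() goto *targets[*pc++]
    DISPATCH();
#else
    /* Fallback: every opcode returns to the one shared switch. */
#define TARGET(op) case op
#define DISPATCH() continue
    for (;;) switch (*pc++) {
#endif

    TARGET(OP_PUSH):
        assert(sp < 16);
        stack[sp++] = *pc++;        /* operand is the next byte */
        DISPATCH();

    TARGET(OP_ADD):
        assert(sp >= 2);
        sp--;
        stack[sp - 1] += stack[sp];
        DISPATCH();

    TARGET(OP_MUL):
        assert(sp >= 2);
        sp--;
        stack[sp - 1] *= stack[sp];
        DISPATCH();

    TARGET(OP_HALT):
        assert(sp >= 1);
        return stack[sp - 1];

#if !USE_COMPUTED_GOTOS
    default:
        return -1;                  /* unknown opcode */
    }
#endif
}
```

For instance, calling run() on the byte sequence OP_PUSH 2, OP_PUSH 3, OP_ADD, OP_PUSH 4, OP_MUL, OP_HALT evaluates (2 + 3) * 4. With the computed-goto variant each opcode body has a separate indirect jump, so the branch predictor can track which opcode tends to follow which; with the switch variant all opcodes share one indirect jump.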
Test                            minimum run-time        average run-time
                                this   other    diff    this   other    diff
-------------------------------------------------------------------------------
BuiltinFunctionCalls:          100ms   107ms   -7.1%   101ms   110ms   -8.2%
BuiltinMethodLookup:            76ms   106ms  -28.1%    78ms   106ms  -26.5%
CompareFloats:                 108ms   141ms  -23.2%   108ms   141ms  -23.2%
CompareFloatsIntegers:         171ms   188ms   -9.4%   173ms   204ms  -15.3%
CompareIntegers:               165ms   213ms  -22.6%   168ms   224ms  -25.1%
CompareInternedStrings:        127ms   169ms  -24.6%   127ms   169ms  -24.8%
CompareLongs:                   95ms   124ms  -23.1%    95ms   126ms  -24.5%
CompareStrings:                109ms   136ms  -20.2%   111ms   139ms  -19.9%
ComplexPythonFunctionCalls:    131ms   150ms  -12.4%   136ms   151ms  -10.2%
ConcatStrings:                 159ms   171ms   -6.9%   160ms   173ms   -7.4%
CreateInstances:               148ms   157ms   -5.6%   150ms   158ms   -4.9%
CreateNewInstances:            112ms   117ms   -4.3%   112ms   118ms   -4.6%
CreateStringsWithConcat:       144ms   198ms  -27.3%   148ms   199ms  -25.7%
DictCreation:                   90ms   104ms  -13.3%    90ms   104ms  -13.1%
DictWithFloatKeys:             117ms   153ms  -23.7%   117ms   154ms  -24.0%
DictWithIntegerKeys:           104ms   153ms  -32.3%   104ms   154ms  -32.5%
DictWithStringKeys:             90ms   140ms  -35.7%    90ms   141ms  -36.3%
ForLoops:                      100ms   161ms  -38.1%   100ms   161ms  -38.1%
IfThenElse:                    123ms   170ms  -28.0%   125ms   171ms  -27.1%
ListSlicing:                   142ms   141ms   +0.3%   142ms   142ms   +0.2%
NestedForLoops:                135ms   190ms  -29.0%   135ms   190ms  -29.0%
NormalClassAttribute:          249ms   281ms  -11.5%   249ms   281ms  -11.3%
NormalInstanceAttribute:       110ms   153ms  -28.2%   111ms   154ms  -28.1%
PythonFunctionCalls:           106ms   130ms  -18.7%   108ms   131ms  -17.2%
PythonMethodCalls:             151ms   169ms  -10.1%   152ms   169ms   -9.8%
Recursion:                     183ms   242ms  -24.7%   191ms   243ms  -21.4%
SecondImport:                  142ms   138ms   +2.7%   144ms   139ms   +3.4%
SecondPackageImport:           146ms   149ms   -2.3%   148ms   150ms   -1.5%
SecondSubmoduleImport:         201ms   193ms   +3.9%   201ms   195ms   +3.4%
SimpleComplexArithmetic:        90ms   112ms  -19.6%    90ms   112ms  -19.8%
SimpleDictManipulation:        172ms   230ms  -25.2%   173ms   231ms  -25.0%
SimpleFloatArithmetic:          98ms   133ms  -26.3%    99ms   137ms  -27.9%
SimpleIntFloatArithmetic:      134ms   175ms  -23.6%   138ms   176ms  -21.6%
SimpleIntegerArithmetic:       134ms   183ms  -26.8%   141ms   183ms  -23.1%
SimpleListManipulation:         91ms   143ms  -36.3%    93ms   143ms  -35.1%
SimpleLongArithmetic:           88ms   108ms  -17.9%    91ms   109ms  -16.2%
SmallLists:                    127ms   162ms  -21.6%   129ms   164ms  -21.2%
SmallTuples:                   149ms   177ms  -15.6%   151ms   178ms  -15.1%
SpecialClassAttribute:         423ms   426ms   -0.7%   426ms   430ms   -0.9%
SpecialInstanceAttribute:      110ms   154ms  -28.2%   111ms   154ms  -28.3%
StringMappings:                428ms   443ms   -3.4%   432ms   449ms   -3.7%
StringPredicates:              124ms   161ms  -23.1%   125ms   162ms  -22.7%
StringSlicing:                 207ms   223ms   -7.1%   208ms   228ms   -8.7%
TryExcept:                      72ms   166ms  -56.3%    73ms   166ms  -56.2%
TryFinally:                     93ms   120ms  -22.9%    93ms   124ms  -25.2%
TryRaiseExcept:                 52ms    64ms  -19.2%    52ms    65ms  -19.2%
TupleSlicing:                  177ms   195ms   -9.1%   178ms   198ms  -10.2%
WithFinally:                   147ms   163ms  -10.2%   147ms   164ms  -10.1%
WithRaiseExcept:               156ms   173ms  -10.1%   157ms   174ms   -9.7%
-------------------------------------------------------------------------------
Totals:                       6903ms  8356ms  -17.4%  6982ms  8443ms  -17.3%

Date | User | Action | Args
2008-12-26 21:09:39 | pitrou | set | recipients: + pitrou
2008-12-26 21:09:38 | pitrou | set | messageid: <1230325778.98.0.752974375077.issue4753@psf.upfronthosting.co.za>
2008-12-26 21:09:38 | pitrou | link | issue4753 messages
2008-12-26 21:09:35 | pitrou | create |