Message259530
Attaching a second version of the patch. (BTW, Serhiy, I tried your idea of using a switch statement to optimize branches (https://github.com/1st1/cpython/blob/fastint2/Python/ceval.c#L5390) -- no detectable speed improvement).
I decided to add a fast path for floats and single-digit longs, and for their combinations. The +, -, *, /, and // operators and their in-place versions are optimized now.
I'll have full results from the macro-benchmark run tomorrow morning, but here's the result for spectral_norm (rigorous run, best of 3):
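To make the idea concrete, here is a rough pure-Python model of that kind of fast-path dispatch for `+`. It is only a sketch and not the patch itself, which is C code in ceval.c. The names `binary_add`, `is_single_digit` and `is_fast_operand` are hypothetical, and the 30-bit digit size is an assumption that holds on typical 64-bit builds.

```python
import operator

# Assumption: 30-bit digits, as on typical 64-bit CPython builds. A
# "single-digit long" is then an int whose magnitude fits in one digit.
PYLONG_SHIFT = 30
DIGIT_LIMIT = 1 << PYLONG_SHIFT


def is_single_digit(value):
    """True for an exact int (not bool or another subclass) that fits in one digit."""
    return type(value) is int and -DIGIT_LIMIT < value < DIGIT_LIMIT


def is_fast_operand(value):
    """True for the operand kinds the fast path is meant to cover."""
    return type(value) is float or is_single_digit(value)


def binary_add(left, right):
    """Hypothetical model of the dispatch; returns (result, path_taken)."""
    if is_fast_operand(left) and is_fast_operand(right):
        if type(left) is int and type(right) is int:
            # Both operands are small exact ints: add them directly.
            return left + right, "fast-int"
        # float/float or int/float combination: do the arithmetic as floats.
        return float(left) + float(right), "fast-float"
    # Everything else (big ints, subclasses, other types) takes the generic path.
    return operator.add(left, right), "generic"


if __name__ == "__main__":
    for pair in [(1, 2), (1, 2.5), (2 ** 40, 1), ("a", "b")]:
        print(pair, "->", binary_add(*pair))
```

The same shape would apply to -, *, / and //; the point is only that the type checks come first and anything unexpected falls through to the generic operator.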
### spectral_norm ###
Min: 0.300269 -> 0.233037: 1.29x faster
Avg: 0.301700 -> 0.234282: 1.29x faster
Significant (t=399.89)
Stddev: 0.00147 -> 0.00083: 1.7619x smaller
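For anyone reading the numbers above, the "x faster" figures are the old time divided by the new time. A small check using the Min and Avg values quoted in this message (the helper name `speedup` is just for illustration):

```python
def speedup(before, after):
    """Ratio of old time to new time; values above 1.0 mean the new build is faster."""
    return before / after


min_ratio = speedup(0.300269, 0.233037)
avg_ratio = speedup(0.301700, 0.234282)

print(f"Min: {min_ratio:.2f}x faster")  # Min: 1.29x faster
print(f"Avg: {avg_ratio:.2f}x faster")  # Avg: 1.29x faster
```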
Some nano-benchmarks (best of 3):
-m timeit -s "loops=tuple(range(100))" "sum([x + x + 1 for x in loops])"
2.7: 7.23    3.5: 8.17    3.6: 7.57
-m timeit -s "loops=tuple(range(100))" "sum([x + x + 1.0 for x in loops])"
2.7: 9.08    3.5: 11.7    3.6: 7.22
-m timeit -s "loops=tuple(range(100))" "sum([x/2.2 + 2 + x*2.5 + 1.0 for x in loops])"
2.7: 17.9    3.5: 24.3    3.6: 11.8
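The nano-benchmarks above can also be run from a script through the standard `timeit` module, which is handy for comparing several interpreters with one file. This is a minimal sketch: the `best_of` helper and the fixed `number=1000` loop count are my assumptions (the command-line tool picks the loop count automatically), so absolute numbers will not match the ones above exactly.

```python
import timeit

SETUP = "loops = tuple(range(100))"
STATEMENTS = [
    "sum([x + x + 1 for x in loops])",
    "sum([x + x + 1.0 for x in loops])",
    "sum([x/2.2 + 2 + x*2.5 + 1.0 for x in loops])",
]


def best_of(stmt, repeat=3, number=1000):
    """Best per-loop time in microseconds over `repeat` runs of `number` loops."""
    timings = timeit.repeat(stmt, setup=SETUP, repeat=repeat, number=number)
    return min(timings) / number * 1e6


if __name__ == "__main__":
    for stmt in STATEMENTS:
        print(f"{best_of(stmt):8.2f} usec  {stmt}")
```

Run the same file under each interpreter being compared and line up the outputs.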
Date | User | Action | Args
2016-02-04 06:02:50 | yselivanov | set | recipients: + yselivanov, lemburg, rhettinger, mark.dickinson, pitrou, vstinner, casevh, serhiy.storchaka, josh.r, zbyrne
2016-02-04 06:02:49 | yselivanov | set | messageid: <1454565769.63.0.45194845637.issue21955@psf.upfronthosting.co.za>
2016-02-04 06:02:49 | yselivanov | link | issue21955 messages
2016-02-04 06:02:48 | yselivanov | create |