Message 259670 - vstinner (2016-02-05 16:15)
My analysis of the benchmarks.
Even when using CPU isolation to run the benchmarks, the results look unreliable for very short benchmarks like 3 ** 2.0: I don't think that fastint_alt can make that operation 16% slower, since it doesn't touch this code, no?
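
As an illustration, here is a minimal sketch (an assumption about the setup, not the harness behind the numbers above) of how to check the run-to-run spread of such a short operation with timeit; on timings this small, the spread alone can easily exceed 16%:

import timeit

# Sketch: measure "a ** b" (int ** float) several times and report the spread.
# Variables in the setup keep the peephole optimizer from folding 3 ** 2.0
# into a constant at compile time.
times = timeit.repeat("a ** b", setup="a = 3; b = 2.0",
                      repeat=20, number=1000000)
best, worst = min(times), max(times)
print("best: %.1f ns, worst: %.1f ns, spread: %.0f%%"
      % (best * 1e3, worst * 1e3, (worst - best) / best * 100))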
Well... as expected, the speedup is quite *small*: the largest difference is on "3 * 2" run 100 times: 18% faster with fastint_alt. We are talking about 1.82 us => 1.49 us, a delta of 330 ns. I would expect a much larger difference if you compiled the function to machine code using Cython or a JIT like Numba or PyPy. Remember that we are running *micro*-benchmarks, so we should not push for overkill optimizations unless the speedup is really impressive.
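
For reference, a rough sketch of how the "3 * 2 run 100 times" figure can be reproduced with timeit (again an assumption about the harness; variables are used in the setup so the expression is not constant-folded):

import timeit

# Sketch: time "x * y" evaluated 100 times per call, matching the
# "3 * 2 run 100 times" figure quoted above (on the order of 1.5-1.8 us total).
stmt = "; ".join(["x * y"] * 100)
best = min(timeit.repeat(stmt, setup="x = 3; y = 2",
                         repeat=5, number=10000))
print("%.2f us per 100 multiplications" % (best / 10000 * 1e6))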
It's quite obvious from the tables that fastint_alt.patch only optimizes int (float is not optimized). If we choose to optimize float too, fastintfloat_alt.patch and fastint5.patch appear to have the *same* speed.
I don't see any overhead on Decimal + Decimal with any patch: good.
--
Between fastintfloat_alt.patch and fastint5.patch, I prefer fastintfloat_alt.patch, which is much easier to read and so probably much easier to debug. I hate huge macros when I have to debug code in gdb :-( I also very much like the idea of *reusing* existing functions rather than duplicating code.
Even if Antoine doesn't seem interested in optimizations for float, I think it's OK to add a few lines for this type; fastintfloat_alt.patch is not that complex. What do *you* think?
Why not optimize a**b? It's a common operation, especially 2**k, no?