Message 259670 - vstinner (2016-02-05 16:15)
My analysis of the benchmarks.
Even when using CPU isolation to run the benchmarks, the results look unreliable for very short benchmarks like 3 ** 2.0: I don't think that fastint_alt can make that operation 16% slower, since it doesn't touch this code, no?
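
As an illustration, here is a minimal sketch (an assumption about the setup, not the harness behind the numbers above) of how to check the run-to-run spread of such a short operation with timeit; on timings this small, the spread alone can easily exceed 16%:

import timeit

# Sketch: measure "a ** b" (int ** float) several times and report the spread.
# Variables in the setup keep the peephole optimizer from folding 3 ** 2.0
# into a constant at compile time.
times = timeit.repeat("a ** b", setup="a = 3; b = 2.0",
                      repeat=20, number=1000000)
best, worst = min(times), max(times)
print("best: %.1f ns, worst: %.1f ns, spread: %.0f%%"
      % (best * 1e3, worst * 1e3, (worst - best) / best * 100))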
Well... as expected, the speedup is quite *small*: the largest difference is on "3 * 2" run 100 times: 18% faster with fastint_alt. We are talking about 1.82 us => 1.49 us, a delta of 330 ns. I would expect a much larger difference if you compiled the function to machine code using Cython or a JIT like Numba or PyPy. Remember that we are running *micro*-benchmarks, so we should not push for overkill optimizations unless the speedup is really impressive.
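
For reference, a rough sketch of how the "3 * 2 run 100 times" figure can be reproduced with timeit (again an assumption about the harness; variables are used in the setup so the expression is not constant-folded):

import timeit

# Sketch: time "x * y" evaluated 100 times per call, matching the
# "3 * 2 run 100 times" figure quoted above (on the order of 1.5-1.8 us total).
stmt = "; ".join(["x * y"] * 100)
best = min(timeit.repeat(stmt, setup="x = 3; y = 2",
                         repeat=5, number=10000))
print("%.2f us per 100 multiplications" % (best / 10000 * 1e6))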
It's quite obvious from the tables that fastint_alt.patch only optimizes int (float is not optimized). If we choose to optimize float too, fastintfloat_alt.patch and fastint5.patch appear to have the *same* speed.
I don't see any overhead on Decimal + Decimal with any patch: good.
--
Between fastintfloat_alt.patch and fastint5.patch, I prefer fastintfloat_alt.patch, which is much easier to read and so probably much easier to debug. I hate huge macros when I have to debug code in gdb :-( I also very much like the idea of *reusing* existing functions rather than duplicating code.
Even if Antoine doesn't seem interested in optimizations for float, I think it's OK to add a few lines for this type; fastintfloat_alt.patch is not that complex. What do *you* think?
Why not optimize a**b? It's a common operation, especially 2**k, no?