Message 259562 - Python tracker


Author	yselivanov
Recipients	casevh, josh.r, lemburg, mark.dickinson, pitrou, rhettinger, serhiy.storchaka, vstinner, yselivanov, zbyrne
Date	2016-02-04.13:54:55
Message-id	<1454594095.73.0.393687709596.issue21955@psf.upfronthosting.co.za>

Content

> I agree with Marc-Andre, people doing FP-heavy math in Python use Numpy (possibly with Numba, Cython or any other additional library). Micro-optimizing floating-point operations in the eval loop makes little sense IMO.

I disagree.

30% faster floats (sic!) is a serious improvement that shouldn't just be discarded. Many applications do floating-point calculations one way or another, but don't use numpy because it's overkill.

Python 2 is much faster than Python 3 on any kind of numeric calculation. This point is brought up in every Python 2 vs 3 debate. I think it's unacceptable.

> * the ceval loop may no longer fit in to the CPU cache on
systems with small cache sizes, since the compiler will likely
inline all the fast_*() functions (I guess it would be possible
to simply eliminate all fast paths using a compile time
flag)

That's speculation. It may still fit. Or it may never have really fit; it's already huge. I tested the patch on an 8-year-old desktop CPU and saw no performance degradation on our benchmarks.

### raytrace ###
Avg: 1.858527 -> 1.652754: 1.12x faster

### nbody ###
Avg: 0.310281 -> 0.285179: 1.09x faster

### float ###
Avg: 0.392169 -> 0.358989: 1.09x faster

### chaos ###
Avg: 0.355519 -> 0.326400: 1.09x faster

### spectral_norm ###
Avg: 0.377147 -> 0.303928: 1.24x faster

### telco ###
Avg: 0.012845 -> 0.013006: 1.01x slower

The last benchmark (telco) is especially interesting. It uses decimals for its calculations, which means it goes through overloaded numeric operators. Still no significant performance degradation.
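For anyone who wants to re-derive the ratios above from the raw averages, here is a minimal sketch. It assumes the ratio is simply the larger time divided by the smaller one, rounded to two decimals; `describe` and `results` are names made up for this illustration, not part of the benchmark suite.

```python
# Average timings copied from the benchmark output above: (before, after).
results = {
    "raytrace": (1.858527, 1.652754),
    "nbody": (0.310281, 0.285179),
    "float": (0.392169, 0.358989),
    "chaos": (0.355519, 0.326400),
    "spectral_norm": (0.377147, 0.303928),
    "telco": (0.012845, 0.013006),
}


def describe(before, after):
    """Format a timing change as 'N.NNx faster' or 'N.NNx slower'.

    Assumption: the ratio is larger time / smaller time, rounded to
    two decimals.
    """
    if after <= before:
        return f"{before / after:.2f}x faster"
    return f"{after / before:.2f}x slower"


summary = {name: describe(before, after)
           for name, (before, after) in results.items()}

for name, text in summary.items():
    print(f"{name}: {text}")
```

Under that assumption the telco slowdown works out to roughly 1.3%, which is why I call it not significant.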

> * maintenance will get more difficult

A fast path for floats is easy to understand for every core dev who works with ceval; there is no rocket science there (if you want rocket science that is hard to maintain, look at generators/yield from). If you don't like inlining floating-point calculations, we can export float_add, float_sub, float_div, and float_mul and use them in ceval.

Why not combine my patch and Serhiy's? First we check whether left and right are both longs. Then we check whether they are unicode (for +). And then we have a fast path for floats.
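The check order can be modelled in pure Python roughly as follows. This is only an illustration of the ordering, not the C code from either patch: `binary_add` is a hypothetical name, and the exact-type checks (`type(x) is T`) are an assumption about how the real fast paths would guard themselves, so that subclasses and types with overloaded operators fall through to the generic path.

```python
import operator
from decimal import Decimal


def binary_add(left, right):
    """Hypothetical model of the proposed check order for '+'.

    Returns (path_taken, result) so the chosen path is visible.
    """
    # 1. Both operands are exactly int ("long" in C terms).
    if type(left) is int and type(right) is int:
        return "long", int.__add__(left, right)
    # 2. Both operands are exactly str (unicode); only relevant for '+'.
    if type(left) is str and type(right) is str:
        return "unicode", str.__add__(left, right)
    # 3. Both operands are exactly float.
    if type(left) is float and type(right) is float:
        return "float", float.__add__(left, right)
    # 4. Everything else goes through the normal operator protocol.
    return "generic", operator.add(left, right)


print(binary_add(1, 2))
print(binary_add("a", "b"))
print(binary_add(1.5, 2.0))
print(binary_add(Decimal("1.1"), Decimal("2.2")))
```

In this model, decimals (as in telco) and mixed int/float operands fail all three cheap type checks and then take the same generic path they take today.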
