Message280557
I tried different patches and ran many quick & dirty benchmarks.
I tried likely/unlikely macros (based on GCC's __builtin_expect): the effect is not significant on the call_simple microbenchmark, so I gave up on this part.
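For readers who don't know these macros, here is a minimal sketch of what they usually look like. The names LIKELY, UNLIKELY and checked_div are hypothetical, chosen for illustration; this is not the code from my patch.

```c
/* Hypothetical macro names, for illustration only. */
#if defined(__GNUC__) || defined(__clang__)
   /* !!(x) normalizes any truthy value to 1 before comparing with the hint */
#  define LIKELY(x)   __builtin_expect(!!(x), 1)
#  define UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
   /* Other compilers: the macros are no-ops */
#  define LIKELY(x)   (x)
#  define UNLIKELY(x) (x)
#endif

/* Hypothetical helper: divide a by b, return -1 on error, 0 on success.
   The error branch is marked as unlikely so that the compiler can move it
   out of the hot path. */
int checked_div(int a, int b, int *result)
{
    if (UNLIKELY(b == 0)) {
        return -1;
    }
    *result = a / b;
    return 0;
}
```

The hint only influences code layout and branch ordering, it never changes the result: checked_div(10, 2, &r) stores 5 in r and returns 0 with or without the macros.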
Adding __attribute__((hot)) to a few Python core functions fixes the major slowdown of call_method at revision 83877018ef97 (described in the initial message).
I noticed tiny differences when using __attribute__((hot)): a speedup in most cases. Sometimes I noticed a slowdown, but a very small one (e.g. 1%, and 1% on a microbenchmark doesn't mean anything).
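A minimal sketch of how such an attribute can be applied in a portable way. The macro name HOT_FUNCTION and the function sum_to are hypothetical, picked for this example; they are not taken from the patch.

```c
/* Hypothetical macro name, for illustration only. */
#if defined(__GNUC__) || defined(__clang__)
   /* Ask the compiler to optimize the function more aggressively and,
      with GCC, to group it with other hot functions in the binary. */
#  define HOT_FUNCTION __attribute__((hot))
#else
#  define HOT_FUNCTION
#endif

/* Hypothetical hot function: sum of the integers 1..n (0 if n <= 0). */
HOT_FUNCTION
long sum_to(long n)
{
    long total = 0;
    for (long i = 1; i <= n; i++) {
        total += i;
    }
    return total;
}
```

The attribute doesn't change what the function computes. As far as I understand, the interesting part here is that GCC places hot functions together, which makes their placement less dependent on unrelated code changes.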
I pushed my patch to try to keep performance stable when Python is not compiled with PGO.
If you would like to know more about the crazy effect of code placement on modern Intel CPUs, I suggest looking at the slides of this recent talk by an Intel engineer:
https://llvmdevelopersmeetingbay2016.sched.org/event/8YzY/causes-of-performance-instability-due-to-code-placement-in-x86
"Causes of Performance Swings Due to Code Placement in IA by Zia Ansari (Intel), November 2016"
--
About PGO or not PGO: this question is not simple, so I suggest discussing it somewhere else to avoid flooding this issue ;-)
For my use case, I'm not yet convinced that PGO with our current build system produces reliable performance.
Not all Linux distributions compile Python using PGO: Fedora and RHEL don't, for example. Bugzilla for Fedora:
https://bugzilla.redhat.com/show_bug.cgi?id=613045
I guess that there are also some developers running benchmarks on Python compiled with "./configure && make". I'm trying to enhance the documentation and tools around Python benchmarks to advise developers to use LTO and/or PGO.
Date | User | Action | Args
2016-11-11 01:49:03 | vstinner | set | recipients: + vstinner, pitrou, python-dev, serhiy.storchaka
2016-11-11 01:49:03 | vstinner | set | messageid: <1478828943.79.0.566367046273.issue28618@psf.upfronthosting.co.za>
2016-11-11 01:49:03 | vstinner | link | issue28618 messages
2016-11-11 01:49:01 | vstinner | create |