optimize compilation options #61981
Ubuntu's system Python 3.3 shows consistently better performance than a vanilla Python 3.3: around 10-15% faster in general (see attached benchmark numbers). If this can be attributed to different compilation options, it would be nice to backport those options to our standard build config. |
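A comparison like this can be reproduced roughly by running the same script under each interpreter and comparing the best timings. The following is only a minimal sketch: `workload` and `best_time` are hypothetical names, and the loop is a stand-in, not the attached benchmark suite.

```python
import sys
import timeit


def workload():
    """Small pure-Python loop that exercises the bytecode interpreter."""
    total = 0
    for i in range(1000):
        total += i * i
    return total


def best_time(repeat=5, number=200):
    """Return the fastest of `repeat` runs, each calling workload() `number` times."""
    return min(timeit.repeat(workload, repeat=repeat, number=number))


if __name__ == "__main__":
    # Run this file once per interpreter build and compare the printed numbers.
    print(sys.version)
    print(f"best of 5: {best_time():.4f} s")
```

Taking the minimum over several repeats reduces noise from other processes; a single micro-benchmark still says little about general performance.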
Most of that can be attributed to the PGO build, which has been upstream for a long time. The second thing to do is to build with LTO and see what speedups you get in addition. It certainly also helps to build the interpreter statically (without --enable-shared). But thanks for confirming my own experience ;) |
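To see how a given interpreter was actually built, the recorded build variables can be inspected from Python. This is a hedged sketch: `build_summary` is a hypothetical helper, and it assumes the build recorded `CONFIG_ARGS`, `PY_CFLAGS`, `PY_LDFLAGS`, `OPT` and `Py_ENABLE_SHARED` (on some platforms these are missing and come back as `None`). It detects an `-flto` flag and a shared libpython, but makes no attempt to detect PGO.

```python
import sysconfig


def build_summary(get=sysconfig.get_config_var):
    """Summarise build options relevant to this discussion.

    `get` is injectable so the logic can be exercised without a real build.
    """
    config_args = get("CONFIG_ARGS") or ""
    # Join the flag variables; missing variables are treated as empty strings.
    flags = " ".join(str(get(name) or "") for name in ("PY_CFLAGS", "PY_LDFLAGS", "OPT"))
    return {
        "lto_flag_seen": "-flto" in flags or "-flto" in config_args,
        "shared_libpython": bool(get("Py_ENABLE_SHARED")),
        "config_args": config_args,
    }


if __name__ == "__main__":
    for key, value in build_summary().items():
        print(f"{key}: {value}")
```

Running it under both the distribution interpreter and a vanilla build shows which of the differences mentioned above apply.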
Here is a patch for -flto. You need to run autoconf to re-generate configure, too. |
The proposed patch is wrong. When linking with -flto, you should pass all the relevant CFLAGS to the linker as well. Also pass -fuse-linker-plugin. And this should be opt-in, not the default: depending on the architecture and the compiler version, -flto is not as stable as you want it to be. Lastly, this ends up as the default for building third-party extensions too, which again, I think, should be opt-in. |
bpo-24915 suggests PGO and comes with an actual patch. I suggest rejecting this ticket as too broad. |
LTO (Link-Time Optimization) is not the same as PGO, though I guess it can take advantage of PGO for its heuristics. |
I would like to see LTO enabled. The intermodule calls to code in abstract.c would become less expensive. |
Note this patch is likely wrong, as it doesn't add the optimization options to the linker invocation. According to the GCC docs, """To use the link-time optimizer, -flto and optimization options should be specified at compile time and during the final link""". So probably $OPT should be added to $PY_LDFLAGS. |
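The rule quoted from the GCC docs can be illustrated with a small helper that mirrors compile-time options onto the link line. This is a sketch only, not the patch: `link_flags_for_lto` is a hypothetical name, and treating just `-O*` and `-flto*` as the options worth mirroring is an assumption.

```python
import shlex


def link_flags_for_lto(cflags: str, ldflags: str = "") -> str:
    """Return ldflags extended with the optimization options found in cflags."""
    link_opts = shlex.split(ldflags)
    for opt in shlex.split(cflags):
        # Assumption: only "-O*" and "-flto*" need to be repeated at link time.
        is_relevant = opt.startswith("-O") or opt.startswith("-flto")
        if is_relevant and opt not in link_opts:
            link_opts.append(opt)
    return shlex.join(link_opts)


if __name__ == "__main__":
    print(link_flags_for_lto("-O3 -flto -Wall", "-L/usr/lib"))
```

Options already present on the link line are not duplicated, so applying the helper twice gives the same result.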
For the record, the gain for LTO+PGO (with "-flto -O3") over PGO alone seems to be between 0% and 10% for most benchmarks. |
You can test for yourself by passing |
Hum, does it make sense to enable LTO without PGO? |
Probably not. By the way, I now have a small ARM system to play with, and there the gain of LTO+PGO over PGO alone is around 10%. Also note LTO can make compilation times much longer (it's the linking step actually, which can take minutes). |
On 22.09.2015 12:31, Antoine Pitrou wrote:
use -flto=jobserver |
bpo-24915 is about adding pgo and has a slew of patches. |
Hi, I added a dedicated issue just for LTO only when using GCC and CLANG (http://bugs.python.org/issue25702), that works well with PGO also. |
PGO is available as |
After 2.5 years without response, I think the answer is probably "no" :) |