
Author vstinner
Recipients serhiy.storchaka, vstinner, yselivanov
Date 2016-08-08.22:01:14
Message-id <1470693675.67.0.863843301106.issue27128@psf.upfronthosting.co.za>
Content
I spent the last 3 months making the CPython benchmark suite more stable and improving my procedure for running benchmarks, so that the results are more stable.

See my articles:
https://haypo-notes.readthedocs.io/microbenchmark.html#my-articles

I forked and enhanced the benchmark suite to use my perf module to run benchmarks in multiple processes:
https://hg.python.org/sandbox/benchmarks_perf
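
To illustrate why running a benchmark in multiple processes helps, here is a minimal sketch that uses only the standard library. It is not the perf module's implementation or API: the timed statement, the loop count and the helper names (run_worker, bench) are assumptions made for this example. Each sample comes from a fresh interpreter, so per-process effects (hash randomization, memory layout) show up in the standard deviation instead of silently biasing a single run.

```python
import statistics
import subprocess
import sys

# Hypothetical statement to time; it is not taken from the benchmark suite.
STMT = "sorted(range(1000))"

# Code executed by each worker process: time the statement and print
# the average number of seconds per loop.
WORKER = (
    "import timeit; "
    "loops = 200; "
    "print(timeit.timeit({stmt!r}, number=loops) / loops)"
)


def run_worker(stmt):
    """Time stmt in a fresh interpreter and return seconds per loop."""
    out = subprocess.check_output(
        [sys.executable, "-c", WORKER.format(stmt=stmt)],
        universal_newlines=True,
    )
    return float(out)


def bench(stmt, processes=5):
    """Collect one sample per process; return (mean, standard deviation)."""
    samples = [run_worker(stmt) for _ in range(processes)]
    return statistics.mean(samples), statistics.stdev(samples)


if __name__ == "__main__":
    mean, stdev = bench(STMT)
    print("%.1f us +- %.1f us" % (mean * 1e6, stdev * 1e6))
```

The printed line uses the same "mean +- standard deviation" shape as the perf output, but the numbers depend entirely on the machine.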

I ran this better benchmark suite on fastcall-2.patch on my laptop. The result is quite good: 
----------------
$ python3 -m perf compare_to ref.json fastcall.json -G  --min-speed=5
Slower (4):
- fastpickle/pickle_dict: 326 us +- 15 us -> 350 us +- 29 us: 1.07x slower
- regex_effbot: 49.4 ms +- 1.3 ms -> 53.0 ms +- 1.2 ms: 1.07x slower
- fastpickle/pickle: 432 us +- 8 us -> 457 us +- 10 us: 1.06x slower
- pybench.ComplexPythonFunctionCalls: 838 ns +- 11 ns -> 884 ns +- 8 ns: 1.05x slower

Faster (13):
- spectral_norm: 289 ms +- 6 ms -> 250 ms +- 5 ms: 1.16x faster
- pybench.SimpleIntFloatArithmetic: 622 ns +- 9 ns -> 559 ns +- 10 ns: 1.11x faster
- pybench.SimpleIntegerArithmetic: 621 ns +- 10 ns -> 560 ns +- 9 ns: 1.11x faster
- pybench.SimpleLongArithmetic: 891 ns +- 12 ns -> 816 ns +- 10 ns: 1.09x faster
- pybench.DictCreation: 852 ns +- 13 ns -> 788 ns +- 16 ns: 1.08x faster
- pybench.ForLoops: 10.8 ns +- 0.3 ns -> 9.99 ns +- 0.23 ns: 1.08x faster
- pybench.NormalClassAttribute: 1.85 us +- 0.02 us -> 1.72 us +- 0.04 us: 1.08x faster
- pybench.SpecialClassAttribute: 1.86 us +- 0.02 us -> 1.73 us +- 0.03 us: 1.07x faster
- pybench.NestedForLoops: 21.9 ns +- 0.3 ns -> 20.7 ns +- 0.3 ns: 1.05x faster
- pybench.SimpleListManipulation: 501 ns +- 4 ns -> 476 ns +- 5 ns: 1.05x faster
- elementtree/process: 192 ms +- 3 ms -> 183 ms +- 2 ms: 1.05x faster
- elementtree/generate: 225 ms +- 5 ms -> 214 ms +- 4 ms: 1.05x faster
- hexiom2/level_25: 21.3 ms +- 0.3 ms -> 20.3 ms +- 0.1 ms: 1.05x faster

Benchmark hidden because not significant (84): (...)
----------------

Most benchmarks are not significant, which is expected: fastcall-2.patch is really the simplest patch to start the work on "FASTCALL". It doesn't really implement any optimization; it only adds new infrastructure on which new optimizations can be implemented.

A few benchmarks are faster (only benchmarks at least 5% faster are shown using --min-speed=5).
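
As a rough illustration of how the "1.16x faster" figures and the 5% cutoff relate to the means, here is a small sketch. It is an assumption about the arithmetic, not the perf module's actual code, and the function name compare is made up for this example. Because the printed means are rounded, borderline entries in the listing may not be reproduced exactly from the displayed values.

```python
def compare(ref, changed, min_speed=5.0):
    """Compare two mean timings given in the same unit.

    Return (ratio, label), or None when the change is smaller than
    min_speed percent and the benchmark would be hidden.
    """
    if changed >= ref:
        ratio = changed / ref
        label = "slower"
    else:
        ratio = ref / changed
        label = "faster"
    if (ratio - 1.0) * 100.0 < min_speed:
        return None
    return ratio, label


# spectral_norm: 289 ms -> 250 ms
print("%.2fx %s" % compare(289.0, 250.0))   # 1.16x faster
# fastpickle/pickle_dict: 326 us -> 350 us
print("%.2fx %s" % compare(326.0, 350.0))   # 1.07x slower
# Hypothetical 2% change, below the 5% threshold
print(compare(100.0, 102.0))                # None
```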

4 benchmarks are slower, but the slowdown should be temporary: new optimizations should make these benchmarks faster. See issue #26814 for a more concrete implementation and a lot of benchmark results if you don't trust me :-)

I consider that the benchmarks show that there is no major slowdown, so fastcall-2.patch can be merged to make it possible to start working on real optimizations.
History
Date User Action Args
2016-08-08 22:01:15 vstinner set recipients: + vstinner, serhiy.storchaka, yselivanov
2016-08-08 22:01:15 vstinner set messageid: <1470693675.67.0.863843301106.issue27128@psf.upfronthosting.co.za>
2016-08-08 22:01:15 vstinner link issue27128 messages
2016-08-08 22:01:14 vstinner create