Message 273922 - Python tracker


Author	vstinner
Recipients	brett.cannon, pitrou, skrah, vstinner, yselivanov
Date	2016-08-30.16:12:28
Message-id	<1472573548.88.0.683919160071.issue26284@psf.upfronthosting.co.za>
In-reply-to

Content

I worked on a new version of the benchmark suite. It is now called performance and has moved to GitHub:
https://github.com/python/performance

In performance 0.1.2, bm_telco.py uses BytesIO for input and six.StringIO for output. The output is just one number per line, which differs from the output of http://bugs.python.org/file41802/telco_haypo.py.

I don't know which benchmark is "right".

I believe that bm_telco.py of performance is now much more stable thanks to the perf module, which runs the benchmark in multiple processes (20 by default). This helps to get a better distribution of samples.
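
To illustrate the idea of sampling in several processes, here is a minimal sketch. It is not the perf module's implementation: the names WORKER and collect_samples are hypothetical, the workload is a stand-in for a real benchmark body, and it starts only 5 processes (instead of 20) to stay fast.

```python
import statistics
import subprocess
import sys

# Hypothetical worker script: times a small stand-in workload 3 times and
# prints one sample (in seconds) per line. Not the real telco benchmark.
WORKER = r"""
import time
for _ in range(3):
    start = time.perf_counter()
    sum(i * i for i in range(10000))
    print(time.perf_counter() - start)
"""


def collect_samples(processes=5):
    """Run the worker in several fresh interpreters and pool the samples."""
    samples = []
    for _ in range(processes):
        result = subprocess.run(
            [sys.executable, "-c", WORKER],
            check=True,
            capture_output=True,
            text=True,
        )
        # Each output line is one timing sample from that process.
        samples.extend(float(line) for line in result.stdout.split())
    return samples


if __name__ == "__main__":
    samples = collect_samples()
    print("number of samples:", len(samples))
    print("median: %.6f s" % statistics.median(samples))
```

Each process starts with its own memory layout and interpreter state, so pooling samples across processes captures variation that a single process would hide.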

Example:

$ python3 -m performance run -b telco -o telco.json
(...)

$ python3 -m perf show --hist --stats --metadata telco.json 
Metadata:
- aslr: Full randomization
- cpu_config: 0-7=driver:intel_pstate, intel_pstate:no turbo, governor:powersave
- cpu_count: 8
- cpu_model_name: Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz
- description: Telco decimal benchmark
- hostname: smithers
- loops: 8
- name: telco
- perf_version: 0.7.4
- platform: Linux-4.5.7-300.fc24.x86_64-x86_64-with-fedora-24-Twenty_Four
- python_executable: /home/haypo/prog/python/performance/venv/cpython3.5-da07a9f70715/bin/python
- python_implementation: cpython
- python_version: 3.5.1 (64bit)
- timer: clock_gettime(CLOCK_MONOTONIC), resolution: 1.00 ns

22.7 ms: 18 #####################################################
23.0 ms: 27 ###############################################################################
23.4 ms:  7 ####################
23.7 ms:  1 ###
24.1 ms:  2 ######
24.4 ms:  1 ###
24.7 ms:  0 |
25.1 ms:  1 ###
25.4 ms:  0 |
25.8 ms:  0 |
26.1 ms:  0 |
26.5 ms:  0 |
26.8 ms:  0 |
27.1 ms:  0 |
27.5 ms:  0 |
27.8 ms:  0 |
28.2 ms:  0 |
28.5 ms:  0 |
28.9 ms:  1 ###
29.2 ms:  1 ###
29.5 ms:  1 ###

Total duration: 15.1 sec
Start date: 2016-08-30T18:09:49
End date: 2016-08-30T18:10:07
Raw sample minimum: 182 ms
Raw sample maximum: 237 ms

Number of runs: 20
Total number of samples: 60
Number of samples per run: 3
Number of warmups per run: 1
Loop iterations per sample: 8

Minimum: 22.8 ms (-2%)
Median +- std dev: 23.2 ms +- 1.4 ms
Mean +- std dev: 23.6 ms +- 1.4 ms
Maximum: 29.7 ms (+28%)

Median +- std dev: 23.2 ms +- 1.4 ms


The histogram shows that there is no clear-cut "minimum"; the samples look more like a Gaussian curve. The perf module uses the median of samples rather than the minimum, and it also computes and displays the standard deviation by default.
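
As a rough sketch of these statistics, the snippet below computes minimum, median, mean and standard deviation with the standard library. The sample values are invented for illustration (they are not the 60 samples of the run above), and percent_from is a hypothetical helper. The last two lines use the reported values: the -2% and +28% shown next to Minimum and Maximum are consistent with being computed relative to the median (23.2 ms), although that is an inference from the numbers, not something confirmed from the perf source.

```python
import statistics

# Hypothetical samples in milliseconds, invented for illustration only.
samples_ms = [22.8, 22.9, 23.0, 23.1, 23.2, 23.2, 23.3, 23.5, 24.1, 29.7]

minimum = min(samples_ms)
median = statistics.median(samples_ms)
mean = statistics.mean(samples_ms)
stdev = statistics.stdev(samples_ms)  # sample standard deviation


def percent_from(value, reference):
    """Relative difference of value from reference, in percent."""
    return (value - reference) / reference * 100.0


if __name__ == "__main__":
    print("Minimum: %.1f ms" % minimum)
    print("Median +- std dev: %.1f ms +- %.1f ms" % (median, stdev))
    print("Mean +- std dev: %.1f ms +- %.1f ms" % (mean, stdev))
    # Reported minimum and maximum, compared with the reported median.
    print("Reported minimum vs median: %+.0f%%" % percent_from(22.8, 23.2))
    print("Reported maximum vs median: %+.0f%%" % percent_from(29.7, 23.2))
```

A single slow outlier pulls the mean and the standard deviation up but barely moves the median, which is one reason to prefer the median as the summary value.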
