classification
Title: Fix telco benchmark
Type: behavior Stage: needs patch
Components: Benchmarks Versions: Python 3.6
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: vstinner Nosy List: brett.cannon, pitrou, skrah, vstinner, yselivanov
Priority: normal Keywords:

Created on 2016-02-04 14:28 by skrah, last changed 2016-09-13 09:05 by vstinner. This issue is now closed.

Messages (8)
msg259569 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2016-02-04 14:28
The telco benchmark is unstable. It needs some of Victor's changes from #26275 and probably a larger data set:

http://speleotrove.com/decimal/expon180-1e6b.zip is too big for
_pydecimal, but the one that is used is probably too small for
_decimal.
msg259789 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2016-02-07 14:08
Unfortunately, replacing io.BytesIO(data) with indexing does not make the benchmark faster or more stable on my machine.

BTW, string conversion of the result is actually a crucial part of
the benchmark, but it was taken out in http://bugs.python.org/file41802/telco_haypo.py.
msg260743 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2016-02-23 20:56
If you think the string conversion should go back in, Stefan, feel free to put it back (unless Victor wants to say why he took it out).
msg260748 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-02-23 22:23
> Unfortunately, replacing io.BytesIO(data) with indexing does not make the benchmark faster or more stable on my machine.

Ah, I didn't check. I expected BytesIO.read() to be slower than bytes string slicing.
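
For readers who want to check the BytesIO-versus-slicing question on their own machine, here is a minimal sketch. It is not the code from telco_haypo.py or bm_telco.py: the 8-byte big-endian record layout, the synthetic input and the function names are all assumptions made for illustration.

```python
import io
import statistics
import struct
import timeit

RECORD_SIZE = 8  # assumption: fixed-size records holding one unsigned 64-bit integer

# Synthetic input for illustration only; this is not the real telco data set.
data = b"".join(struct.pack(">Q", n) for n in range(10000))


def read_with_bytesio(data):
    """Decode every record by calling BytesIO.read() repeatedly."""
    stream = io.BytesIO(data)
    values = []
    while True:
        chunk = stream.read(RECORD_SIZE)
        if not chunk:
            break
        values.append(struct.unpack(">Q", chunk)[0])
    return values


def read_with_slicing(data):
    """Decode every record by slicing the bytes object directly."""
    values = []
    for offset in range(0, len(data), RECORD_SIZE):
        chunk = data[offset:offset + RECORD_SIZE]
        values.append(struct.unpack(">Q", chunk)[0])
    return values


if __name__ == "__main__":
    # Both strategies must decode the same values before timing means anything.
    assert read_with_bytesio(data) == read_with_slicing(data)
    for func in (read_with_bytesio, read_with_slicing):
        timings = timeit.repeat(lambda: func(data), number=20, repeat=5)
        print(f"{func.__name__}: median {statistics.median(timings):.4f} s per 20 passes")
```

Whichever variant wins is likely to depend on the interpreter version and the machine, so the sketch prints timings rather than asserting a result.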
msg273922 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-08-30 16:12
I worked on a new version of the benchmark suite. It is now called performance and has moved to GitHub:
https://github.com/python/performance

In performance 0.1.2, bm_telco.py uses BytesIO for input and six.StringIO for output. The output is just one number per line, which differs from the output of http://bugs.python.org/file41802/telco_haypo.py.

I don't know which benchmark is "right".

I believe that bm_telco.py in performance is now much more stable thanks to the perf module, which runs the benchmark in multiple processes (20 by default). This helps to get a better distribution of samples.

Example:

$ python3 -m performance run -b telco -o telco.json
(...)

$ python3 -m perf show --hist --stats --metadata telco.json 
Metadata:
- aslr: Full randomization
- cpu_config: 0-7=driver:intel_pstate, intel_pstate:no turbo, governor:powersave
- cpu_count: 8
- cpu_model_name: Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz
- description: Telco decimal benchmark
- hostname: smithers
- loops: 8
- name: telco
- perf_version: 0.7.4
- platform: Linux-4.5.7-300.fc24.x86_64-x86_64-with-fedora-24-Twenty_Four
- python_executable: /home/haypo/prog/python/performance/venv/cpython3.5-da07a9f70715/bin/python
- python_implementation: cpython
- python_version: 3.5.1 (64bit)
- timer: clock_gettime(CLOCK_MONOTONIC), resolution: 1.00 ns

22.7 ms: 18 #####################################################
23.0 ms: 27 ###############################################################################
23.4 ms:  7 ####################
23.7 ms:  1 ###
24.1 ms:  2 ######
24.4 ms:  1 ###
24.7 ms:  0 |
25.1 ms:  1 ###
25.4 ms:  0 |
25.8 ms:  0 |
26.1 ms:  0 |
26.5 ms:  0 |
26.8 ms:  0 |
27.1 ms:  0 |
27.5 ms:  0 |
27.8 ms:  0 |
28.2 ms:  0 |
28.5 ms:  0 |
28.9 ms:  1 ###
29.2 ms:  1 ###
29.5 ms:  1 ###

Total duration: 15.1 sec
Start date: 2016-08-30T18:09:49
End date: 2016-08-30T18:10:07
Raw sample minimum: 182 ms
Raw sample maximum: 237 ms

Number of runs: 20
Total number of samples: 60
Number of samples per run: 3
Number of warmups per run: 1
Loop iterations per sample: 8

Minimum: 22.8 ms (-2%)
Median +- std dev: 23.2 ms +- 1.4 ms
Mean +- std dev: 23.6 ms +- 1.4 ms
Maximum: 29.7 ms (+28%)

Median +- std dev: 23.2 ms +- 1.4 ms


The histogram shows that there is no single well-defined "minimum"; the samples look more like a Gaussian curve. The perf module uses the median of the samples rather than the minimum, and it also computes and displays the standard deviation by default.
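
To make the minimum-versus-median point concrete, the following sketch recomputes summary statistics from the histogram above using only the standard library. It is an approximation, not how perf itself computes its numbers: each sample is placed at its bin label, so the results will only roughly match the values perf reports from the raw samples.

```python
import statistics

# (bin label in ms, sample count) copied from the histogram above; empty bins omitted.
histogram = [
    (22.7, 18), (23.0, 27), (23.4, 7), (23.7, 1), (24.1, 2),
    (24.4, 1), (25.1, 1), (28.9, 1), (29.2, 1), (29.5, 1),
]

# Approximation: expand each bin into `count` samples located at the bin label.
samples = [value for value, count in histogram for _ in range(count)]

minimum = min(samples)
median = statistics.median(samples)
mean = statistics.mean(samples)
stdev = statistics.stdev(samples)

print(f"samples: {len(samples)}")
print(f"minimum: {minimum:.1f} ms")
print(f"median:  {median:.1f} ms")
print(f"mean:    {mean:.1f} ms +- {stdev:.1f} ms")
```

The few slow samples near 29 ms pull the mean above the median, while the median stays in the densest part of the distribution.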
msg273925 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-08-30 16:20
@Stefan: Can you please check bm_telco.py and maybe propose a pull request if something is wrong?
msg273976 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2016-08-30 22:19
Wow, on my machine this is very stable, great.

The output should be like

   http://www.bytereef.org/software/mpdecimal/benchmarks/telco.py ,

but printing one number only should be okay. The important thing is that some decimal is printed at all to test the formatting speed.


So LGTM.
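
As an illustration of why the printing step matters, here is a simplified sketch of a telco-style billing loop in which every result goes through str(). It is not the code of telco.py or bm_telco.py: the rate, the tax rate, the call durations and the function name are hypothetical, and io.StringIO stands in for whatever output stream the real benchmark uses.

```python
import io
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_EVEN

TWO_PLACES = Decimal("0.01")
RATE = Decimal("0.0013")      # hypothetical price per second
TAX_RATE = Decimal("0.0675")  # hypothetical tax rate


def bill_calls(durations, out):
    """Price each call, write one formatted number per line, return the total."""
    total = Decimal(0)
    for seconds in durations:
        price = (Decimal(seconds) * RATE).quantize(TWO_PLACES, rounding=ROUND_HALF_EVEN)
        tax = (price * TAX_RATE).quantize(TWO_PLACES, rounding=ROUND_DOWN)
        charge = price + tax
        total += charge
        # Decimal-to-string conversion happens here, so formatting speed
        # is part of what the loop measures.
        out.write(str(charge) + "\n")
    return total


out = io.StringIO()
total = bill_calls([60, 600, 3600], out)
print(out.getvalue(), end="")
print("total:", total)
```

Dropping the out.write() line would leave only the arithmetic and rounding in the loop, which is the difference discussed above.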
msg276218 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-09-13 09:05
Sorry, I'm not sure whether you want to make any changes to bm_telco.py in the performance project. If so, please open an issue at https://github.com/python/performance

> So LGTM.

Hum, it looks like you are happy, so I'm closing this issue now :-) I'm slowly trying to move benchmark-related discussions to the performance project on GitHub.
History
Date                 User          Action  Args
2016-09-13 09:05:39  vstinner      set     status: open -> closed; resolution: fixed; messages: + msg276218
2016-08-30 22:19:35  skrah         set     assignee: skrah -> vstinner; messages: + msg273976
2016-08-30 16:20:31  vstinner      set     messages: + msg273925
2016-08-30 16:13:19  vstinner      unlink  issue26275 dependencies
2016-08-30 16:12:28  vstinner      set     messages: + msg273922
2016-02-23 22:23:24  vstinner      set     messages: + msg260748
2016-02-23 20:56:26  brett.cannon  set     assignee: skrah; messages: + msg260743
2016-02-23 09:58:09  skrah         link    issue26275 dependencies
2016-02-07 14:10:43  skrah         set     title: FIx telco benchmark -> Fix telco benchmark
2016-02-07 14:08:38  skrah         set     messages: + msg259789
2016-02-04 14:28:01  skrah         create