
Author vstinner
Recipients brett.cannon, fijall, ned.deily, pitrou, serhiy.storchaka, steven.daprano, tim.peters, vstinner, yselivanov
Date 2016-09-21.14:41:42
Content
> Another point: timeit is often used to compare performance between Python versions. By changing the behaviour of timeit in a given Python version, you'll make it more difficult to compare results.

Hum, that's a good argument against my change :-)

So to be able to compare Python 3.5 vs 3.6 or Python 2.7 vs Python 3.6, we would need to somehow backport the average feature to the timeit module of older Python versions. One option would be to publish the timeit module on the Python Cheeseshop (PyPI). Hum, but such a module already exists: my perf module.

A solution would be to point users to the perf module from the timeit documentation, and maybe also document that timeit results are not reliable?

A different solution would be to add a --python parameter to timeit to run the benchmark on a specific Python version (ex: "python3 -m timeit --python=python2 ..."). But this solution is more complex to develop, since we would have to make timeit.py compatible with Python 2.7 and find a reliable way to load it in the other tested Python process.
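
To illustrate the idea, here is a rough sketch (hypothetical: timeit has no such option, and timeit_in() is a name I made up) of how the statement could be spawned under another interpreter:

    # Hypothetical sketch: run a statement under a chosen interpreter
    # by spawning it as a child process.
    import subprocess

    def timeit_in(python, stmt, setup="pass", number=1000000):
        # The child interpreter needs a compatible timeit module; keeping
        # timeit.py runnable on both 2.7 and 3.x is the hard part.
        code = ("import timeit; "
                "print(timeit.timeit(%r, setup=%r, number=%d))"
                % (stmt, setup, number))
        out = subprocess.check_output([python, "-c", code])
        return float(out.decode().strip())

    # Example: same statement, two interpreters:
    # timeit_in("python2", "sum(range(100))")
    # timeit_in("python3", "sum(range(100))")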

Note: I plan to add a --python parameter to my perf module, but I haven't implemented it yet. Since my perf module already spawns child processes, and since it is a third-party module, it is simpler to implement this option there.

--

A more general remark: timeit is commonly used to compare the performance of two Python versions. Users run timeit twice and then compare the results manually. But only two numbers are compared. It would be more reliable to compare all timings and make sure that the difference is statistically significant. Again, the perf module implements such a function:
http://perf.readthedocs.io/en/latest/api.html#perf.is_significant
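
For example (a minimal sketch; the timing values are made up, and is_significant() returns a (significant, t_score) tuple as described in the docs above):

    import perf

    # Made-up timings (in seconds) from running the same benchmark on
    # two Python versions.
    py35 = [0.251, 0.254, 0.248, 0.252, 0.250]
    py36 = [0.240, 0.238, 0.243, 0.239, 0.241]

    significant, t_score = perf.is_significant(py35, py36)
    if significant:
        print("difference is significant (t-score: %.2f)" % t_score)
    else:
        print("difference may just be noise")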

I didn't implement a full CLI for perf timeit to directly compare two Python versions. You have to run timeit twice, store all timings in JSON files, and then use the "perf compare" command to reload the timings and compare them.
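
So the workflow looks something like this (option spellings from memory, check the perf documentation for the exact names):

    python2 -m perf timeit --json-file=py2.json "sum(range(100))"
    python3 -m perf timeit --json-file=py3.json "sum(range(100))"
    python3 -m perf compare py2.json py3.json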