Message 328361 - Python tracker

Author	vstinner
Recipients	methane, serhiy.storchaka, vstinner
Date	2018-10-24.11:16:34
Message-id	<1540379795.09.0.788709270274.issue35053@psf.upfronthosting.co.za>

Content

> Is performance overhead negligible?

Thank you for asking the most important question :-)

I ran this microbenchmark:

make distclean
./configure --enable-lto
make
./python -m venv env
env/bin/python -m pip install perf
sudo env/bin/python -m perf system tune
env/bin/python -m perf timeit -o FILE.json -v '[]'
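For readers without perf at hand (the package has since been renamed pyperf), the measurement can be roughly approximated with the standard library. This is only a sketch: perf spawns several worker processes, warms up and calibrates the loop count, and none of that is done here. The `bench_ns` helper and its `number`/`repeat` defaults are arbitrary choices for illustration, not part of perf.

```python
import statistics
import timeit


def bench_ns(stmt, number=100_000, repeat=5):
    """Return (mean, std dev) of the per-iteration time of stmt, in ns."""
    # Each entry of `totals` is the time in seconds for `number` executions.
    totals = timeit.repeat(stmt, number=number, repeat=repeat)
    per_iter_ns = [t / number * 1e9 for t in totals]
    return statistics.mean(per_iter_ns), statistics.stdev(per_iter_ns)


mean, stdev = bench_ns("[]")
print(f"Mean +- std dev: {mean:.1f} ns +- {stdev:.1f} ns")
```

The numbers will be noisier than perf's and should not be compared directly with the results quoted in this message.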


My first attempt:

$ env/bin/python -m perf compare_to ref.json patch.json 
Mean +- std dev: [ref] 20.6 ns +- 0.1 ns -> [patch] 22.4 ns +- 0.1 ns: 1.09x slower (+9%)

The addition of the _PyTraceMalloc_NewReference() call, which does nothing here (check the tracing flag, return), adds 1.7 ns: that's not negligible in such a micro-benchmark, and I would prefer to avoid it whenever possible since _Py_NewReference() is the root of the free list optimization.
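The "1.09x slower (+9%)" figure follows from the two means. A minimal sketch of that arithmetic, assuming the ratio is simply patched mean over reference mean; `describe` is a hypothetical helper written for this note, not perf's own code, and it ignores perf's significance check:

```python
def describe(ref_mean, changed_mean):
    """Format a comparison in the style of the compare_to lines quoted here."""
    ratio = changed_mean / ref_mean
    percent = (ratio - 1.0) * 100
    if ratio > 1.0:
        return f"{ratio:.2f}x slower ({percent:+.0f}%)"
    if ratio < 1.0:
        # For speedups, report the inverse ratio so it reads as "N x faster".
        return f"{1.0 / ratio:.2f}x faster ({percent:+.0f}%)"
    return "no change"


print(describe(20.6, 22.4))  # 1.09x slower (+9%)
print(describe(20.6, 20.4))  # 1.01x faster (-1%)
```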


New attempt: expose tracemalloc_config and add a _Py_unlikely() macro (to tell the compiler that tracing is false most of the time):

Mean +- std dev: [ref] 20.6 ns +- 0.1 ns -> [unlikely] 20.4 ns +- 0.3 ns: 1.01x faster (-1%)

Good! The overhead is now negligible!


But... is the hardcore low-level _Py_unlikely() optimization really needed?...

$ env/bin/python -m perf compare_to ref.json if_tracing.json 
Benchmark hidden because not significant (1): timeit

=> no, the macro is useless, so I removed it!



New benchmark to double-check on my laptop.

git checkout master
make clean; make
rm -rf env; ./python -m venv env; env/bin/python -m pip install perf
sudo env/bin/python -m perf system tune
env/bin/python -m perf timeit -o ref.json -v '[]' --rigorous

git checkout tracemalloc_newref
make clean; make
rm -rf env; ./python -m venv env; env/bin/python -m pip install perf
env/bin/python -m perf timeit -o patch.json -v '[]' --rigorous

$ env/bin/python -m perf compare_to ref.json patch.json 
Mean +- std dev: [ref] 20.8 ns +- 0.7 ns -> [patch] 20.5 ns +- 0.3 ns: 1.01x faster (-1%)


The std dev is a little bit high. I didn't use CPU isolation, and Hexchat + Firefox were running in the background, *but* the means are very close, so it seems like my PR has no significant overhead.
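One rough way to read "the mean is very close" is to compare the gap between the two means with the reported std devs. The sketch below is a crude heuristic for illustration only; `within_noise` is a hypothetical helper and is not the significance test perf itself applies, which works on the raw timing values that are not included in this message.

```python
def within_noise(ref_mean, ref_std, new_mean, new_std):
    """Rough heuristic: is the gap between means no larger than the larger std dev?"""
    return abs(new_mean - ref_mean) <= max(ref_std, new_std)


# Laptop run: 20.8 ns +- 0.7 ns versus 20.5 ns +- 0.3 ns
print(within_noise(20.8, 0.7, 20.5, 0.3))  # True

# First attempt: 20.6 ns +- 0.1 ns versus 22.4 ns +- 0.1 ns
print(within_noise(20.6, 0.1, 22.4, 0.1))  # False
```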
