msg292232 - (view) |
Author: Oren Tirosh (orent) |
Date: 2017-04-24 17:43 |
Reproduce:
Py_DEBUG build
PYTHONDUMPREFS=1 ./python -c pass
(big dump of reference information)
Segmentation fault
git-bisected to commit 7822f151b68e40376af657d267ff774439d9adb9
|
msg292234 - (view) |
Author: Oren Tirosh (orent) |
Date: 2017-04-24 18:51 |
In addition to fixing this - perhaps PYTHONDUMPREFS or something similar should be added to test automation? It is apparently capable of uncovering some bugs that none of the other reference and recnt debugging tools could find.
|
msg292235 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2017-04-24 18:55 |
The regression was added by the fix for issue26811. PR 1272 applies the alternate patch from issue26811. This doesn't harm the performance.
$ ./python.patched -m perf timeit -q --compare-to ./python.default -s "from collections import namedtuple; P = namedtuple('P', 'x y'); p = P(1, 2)" --duplicate 1000 "p.x"
Mean +- std dev: [python.default] 128 ns +- 7 ns -> [python.patched] 121 ns +- 7 ns: 1.05x faster (-5%)
I thought about tests, but I don't know what is the best place for them. Seems other environment variables that controls debug output are not tested too.
|
msg292239 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2017-04-24 20:59 |
Come on, yet another crash from property_descr_get()??? It's the 3rd time... Do we really need this micro-optimization?
Previous bugs and workarounds:
- issue #26811
- issue #24276
Using the FASTCALL calling convention, no temporary tuple is created to pass arguments if you use the _PyObject_FastCall() API and if the called function supports this calling convention.
Sadly, namedtuple.attr uses operator.itergetter, itemgetter_call() doesn't support the FASTCALL, and my tp_fastcall patch was rejected: issue #29259.
|
msg292252 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2017-04-25 06:16 |
Removing this micro-optimization makes attribute access in namedtuple more than 1.5 times slower:
Mean +- std dev: [python.default] 126 ns +- 4 ns -> [python] 200 ns +- 7 ns: 1.58x slower (+58%)
This would be a significant regression.
|
msg292253 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2017-04-25 06:29 |
The previous result was got when use _PyObject_FastCallDict(). Using PyObject_Call() is slightly faster:
Mean +- std dev: [python.default] 127 ns +- 4 ns -> [python] 185 ns +- 9 ns: 1.45x slower (+45%)
|
msg304342 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2017-10-13 15:08 |
> Removing this micro-optimization makes attribute access in namedtuple more than 1.5 times slower:
> Mean +- std dev: [python.default] 126 ns +- 4 ns -> [python] 200 ns +- 7 ns: 1.58x slower (+58%)
I wrote the PR 3985, it's only 20 ns slower (1.3x slower):
[ref] 80.4 ns +- 3.3 ns -> [fastcall] 103 ns +- 5 ns: 1.28x slower (+28%)
Maybe Python was optimized further in the meanwhile, or the slowdown is higher on your computer?
|
msg304343 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2017-10-13 15:14 |
Note: if you care of namedtuple performance, Raymond Hettinger wrote that he would be interested to reuse the C structseq sequence. I measured that getting an attribute by name is faster in structseq than with the current property cached tuple hack.
|
msg304566 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2017-10-18 08:46 |
Maybe the difference is processor depending. I'm going to repeat benchmarking on 32-bit processors. Or maybe fast call was optimized in master. Or passing via an arguments tuple was pessimized.
|
msg304711 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2017-10-21 16:15 |
Result on the same computer:
Mean +- std dev: [python0] 146 ns +- 5 ns -> [python] 206 ns +- 2 ns: 1.41x slower (+41%)
The difference now is smaller (41% instead of 58%) because calling with a tuple now is 16% slower.
|
msg326257 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2018-09-24 15:52 |
PR 9541: my new attempt to remove the micro-optimization. Commit message:
bpo-30156: Remove property_descr_get() optimization
property_descr_get() uses a "cached" tuple to optimize function
calls. But this tuple can be discovered in debug mode with
sys.getobjects(). Remove the optimization, it's not really worth it
and it causes 3 different crashes last years.
Microbenchmark:
./python -m perf timeit -v \
-s "from collections import namedtuple; P = namedtuple('P', 'x y'); p = P(1, 2)" \
--duplicate 1024 "p.x"
Result:
Mean +- std dev: [ref] 32.8 ns +- 0.8 ns -> [patch] 40.4 ns +- 1.3 ns: 1.23x slower (+23%)
|
msg326258 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2018-09-24 15:54 |
Is this optiization really worth it? 7 nanoseconds faster, but leak a weird tuple which caused 3 different crashes...
|
msg326436 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2018-09-26 09:11 |
Note: I reported a similar issue which has then been marked as a duplicate of this one: bpo-34223. Copy of my first message:
---
It seems like Python has an invalid object somewhere. PYTHONDUMPREFS=1 makes Python crash at exit.
$ PYTHONDUMPREFS=1 ./python -c pass
(...)
0x7f1992292448 [1] (<class '_thread._localdummy'>, <class 'object'>)
0x7f1992241aa8 [1] {'__doc__': 'Thread-local dummy'}
0x7f199222c740 [1] (<class 'object'>,)
0x9c98a0 [2] <class '_thread._localdummy'>
0x7f1992217dc0 [1]
Segmentation fault (core dumped)
---
|
msg326779 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2018-10-01 10:03 |
New changeset e972c13624c32d0efdceb08ff83917fb6b488525 by Victor Stinner in branch 'master':
bpo-30156: Remove property_descr_get() optimization (GH-9541)
https://github.com/python/cpython/commit/e972c13624c32d0efdceb08ff83917fb6b488525
|
msg326782 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2018-10-01 10:09 |
I proposeto leave Python 3.6 and 3.7 unchanged. I don't think that this specific bug matters enough to remove to remove an optimization from these stable branches. Since the optimization has been added (2015, bpo-23910), I'm only aware of two bug reports on PYTHONDUMPREFS (this one, and my duplicate).
If someone considers that the optimization must die in 3.6 and 3.7, please reopen the issue (or just add a comment).
|
|
Date |
User |
Action |
Args |
2022-04-11 14:58:45 | admin | set | github: 74342 |
2018-11-13 10:42:51 | pablogsal | set | pull_requests:
+ pull_request9768 |
2018-10-01 10:09:45 | vstinner | set | status: open -> closed versions:
+ Python 3.8, - Python 3.5 messages:
+ msg326782
resolution: fixed stage: patch review -> resolved |
2018-10-01 10:03:27 | vstinner | set | messages:
+ msg326779 |
2018-09-26 09:11:00 | vstinner | set | messages:
+ msg326436 |
2018-09-24 15:54:05 | vstinner | set | messages:
+ msg326258 |
2018-09-24 15:52:54 | vstinner | set | messages:
+ msg326257 |
2018-09-24 15:49:08 | vstinner | set | pull_requests:
+ pull_request8943 |
2018-09-24 15:37:50 | vstinner | link | issue34223 superseder |
2017-10-21 16:15:08 | serhiy.storchaka | set | messages:
+ msg304711 |
2017-10-18 08:46:35 | serhiy.storchaka | set | messages:
+ msg304566 |
2017-10-13 15:14:13 | vstinner | set | messages:
+ msg304343 |
2017-10-13 15:08:41 | vstinner | set | messages:
+ msg304342 |
2017-10-13 15:05:55 | vstinner | set | keywords:
+ patch pull_requests:
+ pull_request3961 |
2017-04-25 06:29:03 | serhiy.storchaka | set | messages:
+ msg292253 |
2017-04-25 06:16:21 | serhiy.storchaka | set | nosy:
+ rhettinger messages:
+ msg292252
|
2017-04-24 20:59:20 | vstinner | set | messages:
+ msg292239 |
2017-04-24 18:55:59 | serhiy.storchaka | set | versions:
+ Python 3.5, Python 3.7 nosy:
+ vstinner
messages:
+ msg292235
stage: patch review |
2017-04-24 18:51:52 | orent | set | messages:
+ msg292234 |
2017-04-24 18:46:48 | serhiy.storchaka | set | pull_requests:
+ pull_request1383 |
2017-04-24 17:43:47 | orent | create | |