PYTHONDUMPREFS segfaults on exit #74342
Reproduce: Py_DEBUG build. Git-bisected to commit 7822f15.
In addition to fixing this, perhaps PYTHONDUMPREFS or something similar should be added to test automation? It is apparently capable of uncovering some bugs that none of the other reference and refcount debugging tools could find.
The regression was introduced by the fix for bpo-26811. PR 1272 applies the alternate patch from bpo-26811. This doesn't harm performance:
$ ./python.patched -m perf timeit -q --compare-to ./python.default -s "from collections import namedtuple; P = namedtuple('P', 'x y'); p = P(1, 2)" --duplicate 1000 "p.x"
Mean +- std dev: [python.default] 128 ns +- 7 ns -> [python.patched] 121 ns +- 7 ns: 1.05x faster (-5%)
I thought about tests, but I don't know the best place for them. It seems the other environment variables that control debug output are not tested either.
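For readers without the perf module, the same attribute access can be timed roughly with the standard library's timeit. This is only a sketch: it measures a single interpreter rather than comparing two builds, and the helper name time_attribute_access is made up for illustration.

```python
import timeit

# Same setup string as in the perf command above.
SETUP = "from collections import namedtuple; P = namedtuple('P', 'x y'); p = P(1, 2)"


def time_attribute_access(number=1_000_000, repeat=5):
    """Return the best per-access time, in nanoseconds, for evaluating p.x."""
    timings = timeit.repeat("p.x", setup=SETUP, number=number, repeat=repeat)
    # Take the minimum: it is the run least disturbed by other system activity.
    return min(timings) / number * 1e9
```

Calling time_attribute_access() under each of the two builds and comparing the numbers gives a coarse version of the comparison; perf remains the better tool because it also reports the spread.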
Come on, yet another crash from property_descr_get()??? It's the 3rd time... Do we really need this micro-optimization?
Previous bugs and workarounds:
Using the FASTCALL calling convention, no temporary tuple is created to pass arguments if you use the _PyObject_FastCall() API and the called function supports this calling convention. Sadly, namedtuple.attr uses operator.itemgetter, itemgetter_call() doesn't support FASTCALL, and my tp_fastcall patch was rejected: issue bpo-29259.
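To make the call path concrete: in the Python versions discussed in this thread, namedtuple defined each field as a property whose getter is an operator.itemgetter, so every p.x goes through property_descr_get() and then itemgetter_call(). A minimal sketch, where the hand-written Point class is a stand-in for the generated code (later CPython releases may use a different descriptor for namedtuple fields):

```python
from operator import itemgetter


class Point(tuple):
    """Hand-written stand-in for namedtuple('Point', 'x y'); illustrative only."""

    __slots__ = ()

    def __new__(cls, x, y):
        return tuple.__new__(cls, (x, y))

    # Each field is a property whose getter is an itemgetter, so reading
    # p.x calls the property's getter with p as its single argument.
    x = property(itemgetter(0), doc="Alias for field number 0")
    y = property(itemgetter(1), doc="Alias for field number 1")


p = Point(1, 2)
print(p.x, p.y)
```

Passing that single argument is where the temporary (or "cached") argument tuple comes in, since itemgetter does not accept FASTCALL.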
Removing this micro-optimization makes attribute access in namedtuple more than 1.5 times slower:
Mean +- std dev: [python.default] 126 ns +- 4 ns -> [python] 200 ns +- 7 ns: 1.58x slower (+58%)
This would be a significant regression.
The previous result was obtained using _PyObject_FastCallDict(). Using PyObject_Call() is slightly faster:
Mean +- std dev: [python.default] 127 ns +- 4 ns -> [python] 185 ns +- 9 ns: 1.45x slower (+45%)
I wrote PR 3985; it's only about 20 ns slower (1.3x slower):
[ref] 80.4 ns +- 3.3 ns -> [fastcall] 103 ns +- 5 ns: 1.28x slower (+28%)
Maybe Python was optimized further in the meantime, or the slowdown is higher on your computer?
Note: if you care about namedtuple performance, Raymond Hettinger wrote that he would be interested in reusing the C structseq type. I measured that getting an attribute by name is faster in structseq than with the current property cached-tuple hack.
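A rough way to repeat that structseq comparison with only the standard library is sketched below. It assumes sys.version_info as a convenient structseq instance; the Version namedtuple and the best_ns helper are made up for illustration, and the numbers depend on the interpreter version and machine.

```python
import sys
import timeit
from collections import namedtuple

# sys.version_info is a C structseq; Version mirrors its first three fields.
Version = namedtuple("Version", "major minor micro")
nt = Version(*sys.version_info[:3])
ss = sys.version_info


def best_ns(stmt, namespace, number=200_000, repeat=5):
    """Best per-evaluation time in nanoseconds for stmt run in namespace."""
    timings = timeit.repeat(stmt, globals=namespace, number=number, repeat=repeat)
    return min(timings) / number * 1e9


structseq_ns = best_ns("v.major", {"v": ss})
namedtuple_ns = best_ns("v.major", {"v": nt})
print(f"structseq:  {structseq_ns:.1f} ns")
print(f"namedtuple: {namedtuple_ns:.1f} ns")
```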
Maybe the difference is processor-dependent. I'm going to repeat the benchmark on 32-bit processors. Or maybe fast call was optimized in master. Or passing arguments via a tuple was pessimized.
Result on the same computer:
Mean +- std dev: [python0] 146 ns +- 5 ns -> [python] 206 ns +- 2 ns: 1.41x slower (+41%)
The difference is now smaller (41% instead of 58%) because calling with a tuple is now 16% slower.
PR 9541: my new attempt to remove the micro-optimization. Commit message:
bpo-30156: Remove property_descr_get() optimization
property_descr_get() uses a "cached" tuple to optimize function
Microbenchmark:
./python -m perf timeit -v \
Result:
Mean +- std dev: [ref] 32.8 ns +- 0.8 ns -> [patch] 40.4 ns +- 1.3 ns: 1.23x slower (+23%)
Is this optimization really worth it? It is 7 nanoseconds faster, but it leaks a weird tuple which has caused 3 different crashes...
Note: I reported a similar issue which was then marked as a duplicate of this one: bpo-34223. Copy of my first message:
$ PYTHONDUMPREFS=1 ./python -c pass
(...)
0x7f1992292448 [1] (<class '_thread._localdummy'>, <class 'object'>)
0x7f1992241aa8 [1] {'__doc__': 'Thread-local dummy'}
0x7f199222c740 [1] (<class 'object'>,)
0x9c98a0 [2] <class '_thread._localdummy'>
0x7f1992217dc0 [1]
Segmentation fault (core dumped)
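The reproduction above could be turned into an automated check along the lines suggested earlier in the thread. This is a minimal sketch, not an existing CPython test: the helper names are made up, and it assumes that PYTHONDUMPREFS only has an effect on builds that trace references (on other builds the variable is ignored and the check passes trivially).

```python
import os
import signal
import subprocess
import sys


def run_with_dumprefs(code="pass"):
    """Run a child interpreter with PYTHONDUMPREFS=1 and return its exit status."""
    env = dict(os.environ, PYTHONDUMPREFS="1")
    proc = subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,  # discard the (potentially large) dump
    )
    return proc.returncode


def traces_refs():
    """True if this build tracks all live objects (sys.getobjects exists only then)."""
    return hasattr(sys, "getobjects")


status = run_with_dumprefs()
# On POSIX, a negative status means the child was killed by a signal,
# for example -signal.SIGSEGV for the crash shown above.
crashed = status == -signal.SIGSEGV
print("trace-refs build:", traces_refs(), "exit status:", status, "segfault:", crashed)
```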
I propose to leave Python 3.6 and 3.7 unchanged. I don't think that this specific bug matters enough to remove an optimization from these stable branches. Since the optimization was added (2015, bpo-23910), I'm only aware of two bug reports on PYTHONDUMPREFS (this one, and my duplicate). If someone considers that the optimization must die in 3.6 and 3.7, please reopen the issue (or just add a comment).