deeply nested itertools objects segfault #58218
Comments
http://paste.pocoo.org/show/550884/ will reliably segfault Python3 on all platforms (similar versions for Python2 using itertools work) |
The issue is stack exhaustion. Examples can be trivially made for any iterator that takes another iterator as an argument: itertools.takewhile(), zip() in Python 3, and so on. It's just one of many places where CPython recurses without checking the recursion depth. CPython still works, based on the reasonable assumption that doing such a recursion here is obscure. Someone seriously bored could start with some C-based call-graph builder; or alternatively use PyPy, which finds such recursions automatically in its own source, and compare all places where a recursion check is inserted with the corresponding place in CPython. There are a large number of them (761, not counting the JIT), so be patient :-( |
Since the paste is dead:

```python
i = filter(bool, range(5))
for _ in range(1000000):
    i = filter(bool, i)
for p in i:
    print(p)
```
|
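As noted earlier in the thread, the same pattern applies to any iterator that wraps another iterator, such as itertools.takewhile(). A hedged sketch of that variant follows; the nesting depth of 100 is deliberately small so it runs safely, whereas pushing it toward a million exhausts the C stack just like the filter() example:

```python
from itertools import takewhile

# Wrap the iterator a modest number of times; each next() call then
# recurses one C stack frame per layer of wrapping.
it = iter(range(1, 6))
for _ in range(100):
    it = takewhile(bool, it)

# All values are truthy, so every element passes through all 100 layers.
result = list(it)
```

At this shallow depth the chain both iterates and deallocates fine; the bug only appears once the nesting exceeds what the C stack can hold.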
I'm trying to solve this issue (it seemed easy), but the bug is worse than expected. Python crashes even without any iteration at all:

```python
it = 'abracadabra'
for _ in range(1000000):
    it = filter(bool, it)
del it
```

And fixing a recursive deallocator is harder than fixing a recursive iterator. What can we do if a deallocator raises RuntimeError due to the maximum recursion depth being exceeded? |
Py_TRASHCAN_SAFE_BEGIN/Py_TRASHCAN_SAFE_END macros can help: |
Thank you. Now I understand why this issue doesn't happen with containers. |
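The container behavior referred to here can be observed from pure Python: deeply nested built-in containers deallocate without crashing because their dealloc functions are wrapped in the trashcan macros, which turn the recursive teardown into iteration. A small sketch (the depth is chosen arbitrarily):

```python
# Build a list nested 100,000 levels deep: [[[[...]]]]
lst = []
for _ in range(100000):
    lst = [lst]

# Dropping the last reference deallocates the whole chain. Because
# list deallocation goes through the trashcan mechanism, this does not
# overflow the C stack the way the nested-iterator dealloc does.
del lst
```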
Here is a patch which adds recursion limit checks to builtin and itertools recursive iterators. |
LGTM otherwise. |
Oh, I forgot to remove the old tests when I moved them to a special test class.
Indeed. Thank you. |
New changeset aaaf36026511 by Serhiy Storchaka in branch '3.3':
New changeset 846bd418aee5 by Serhiy Storchaka in branch 'default': |
This patch didn't have my sign-off. Applying it was premature. It is a somewhat heavy-handed fix that slows all the common cases for the sake of an exotic case. |
New changeset d17d10c84d27 by Serhiy Storchaka in branch '2.7': |
Oh, shame on me. Should I revert the patches immediately, or wait for your post-commit review? |
I would appreciate it if you would please revert this patch. We need to search for a solution that isn't as fine-grained (i.e. not doing increments, decrements, and tests on every single call to iter_next). Ideally, the checks can be confined to the iterator constructor and to dealloc. Or you could try to create some general-purpose stack overflow protection that periodically makes sure there is enough stack remaining for CPython to function correctly. |
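The idea of confining the check to the constructor can be sketched in pure Python. This is a hypothetical illustration, not anything from CPython: the class name, the MAX_DEPTH value, and the _depth attribute are all invented here. Each wrapper records its nesting depth once, at construction time, so the per-item path stays cheap:

```python
class DepthCheckedFilter:
    """filter()-like wrapper that refuses to nest too deeply."""

    MAX_DEPTH = 10000  # arbitrary cap, for illustration only

    def __init__(self, predicate, iterable):
        # One check at construction time, instead of one per __next__ call.
        depth = getattr(iterable, '_depth', 0) + 1
        if depth > self.MAX_DEPTH:
            raise RecursionError("iterator nesting too deep")
        self._depth = depth
        self._predicate = predicate
        self._it = iter(iterable)

    def __iter__(self):
        return self

    def __next__(self):
        while True:
            item = next(self._it)
            if self._predicate(item):
                return item
```

With a cap like this, over-nesting fails loudly at construction instead of segfaulting later during iteration or deallocation.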
Isn't that exactly what Py_EnterRecursiveCall() does? |
It has no notion of how big the C stack is. |
No, it isn't. Py_EnterRecursiveCall() counts calls and measures depth. It is sprinkled all over the source code, everywhere a potentially recursive call could be made. Instead, it would be nice if the interpreter could monitor the actual stack size and take appropriate action when it is running low on space. That would save us from putting expensive fine-grained checks throughout the source code. |
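The call-counting behavior of Py_EnterRecursiveCall() is visible from Python: it is the mechanism behind sys.getrecursionlimit(), turning deep Python-level recursion into a catchable exception rather than a crash. The nested-iterator case segfaults precisely because those C-level calls bypass this counter. A quick check:

```python
import sys

def recurse(n):
    # Each call passes through the interpreter's recursion counter
    # (Py_EnterRecursiveCall), which raises once the limit set by
    # sys.setrecursionlimit() is exceeded.
    return recurse(n + 1)

try:
    recurse(0)
    overflowed = False
except RecursionError:  # was a plain RuntimeError before Python 3.5
    overflowed = True
```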
New changeset e07e6d828150 by Serhiy Storchaka in branch '2.7':
New changeset 7b75f0bd9a5e by Serhiy Storchaka in branch '3.3':
New changeset 504eed5a82a3 by Serhiy Storchaka in branch 'default': |
I apologize for my negligence. |
See bpo-14507 for another instance of this in starmap(). |
See bpo-22911 for another instance of this in chain.from_iterable(). |
From Terry Reedy in bpo-22920: |
See bpo-24606 for another instance of this in map(). |
See bpo-30297 for yet another instance of this in starmap(). |
[Armin]
I agree with that assessment and am going to close this as something we can live with (or at least have lived with successfully for a long time). AFAICT, this situation doesn't arise in practical code. It is possible to slow down the language by adding a variant of recursion checks to every call to an iterator. But this just makes the language pay a code-complexity cost and a performance cost for something that doesn't really affect real users. People typically choose itertools for their speed (otherwise, plain generators can be clearer). We shouldn't work against the needs of those users. A robust, clean, and performant solution wouldn't be easy but would likely involve some general-purpose stack overflow protection that periodically makes sure there is enough stack remaining for CPython to function correctly. |
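For completeness, the practical workaround implied by "this doesn't arise in practical code" is simply not to nest: a single pass with the predicate gives the same result as any number of stacked filter() wrappers, since filtering repeatedly with the same predicate is idempotent. A minimal sketch:

```python
data = range(5)

# One pass is equivalent to a million stacked filter(bool, ...) wrappers,
# without building a deep chain of C-level iterator calls.
flat = [x for x in data if x]  # same result as list(filter(bool, data))
```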