Message 346723 - Python tracker

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author	int19h
Recipients	aldwinaldwin, fabioz, int19h
Date	2019-06-27.10:06:31
SpamBayes Score	-1.0
Marked as misclassified	Yes
Message-id	<1561629992.85.0.963801625789.issue37416@roundup.psfhosted.org>
In-reply-to

Content
This is a bit tricky to explain... There's no easy way to achieve this effect "normally". It manifests due to the way some Python debuggers (specifically, pydevd and ptvsd - as used by PyCharm, PyDev, and VSCode) implement non-cooperative attaching to a running Python process by PID. A TL;DR take is that those debuggers have to inject a new thread into a running process from the outside, and then run some Python code on that thread. There are OS APIs for such thread injection - e.g. CreateRemoteThread on Windows. There are various tricks that they then have to use to safely acquire GIL and invoke PyEval_InitThreads, but ultimately it comes down to running Python code. That is the point where this can manifest. Basically, as soon as that injected code (i.e. the actual debugger) imports threading, things break. And advanced debuggers do need background threads for some functionality... Here are the technical details - i.e. how thread injection happens exactly, and what kind of code it might run - if you're interested. https://github.com/microsoft/ptvsd/issues/1542 I think that a similar problem can also occur in an embedded Python scenario with multithreading. Consider what happens if the hosted interpreter is initialized from the main thread of the host app - but some Python code is then run from the background thread, and that code happens to be the first in the process to import threading. Then that background thread becomes the "main thread" for threading, with the same results as described above. The high-level problem, I think, is that there's an inconsistency between what Python itself considers "main thread" (i.e. main_thread in ceval.c, as set by PyEval_InitThreads), and what threading module considers "main thread" (i.e. _main_thread in threading.py). Logically, these should be in sync. If PyEval_InitThreads is the source of truth, then the existing thread injection technique will "just work" as implemented already in all the aforementioned debuggers. It makes sure to invoke PyEval_InitThreads via Py_AddPendingCall, rather than directly from the background thread, precisely so that the interpreter doesn't get confused. Furthermore, on 3.7+, PyEval_InitThreads is already automatically invoked by Py_Initialize, and hence when used by python.exe, will mark the actual first thread in the process as the main thread. So, using it a the source of truth would guarantee that attach by thread injection works correctly in all non-embedded Python scenarios. Apps hosting Python would still need to ensure that they always call Py_Initialize on what they want to be the main thread, as they already have to do; but they wouldn't need to worry about "import threading" anymore.

This is a bit tricky to explain... There's no easy way to achieve this effect "normally". It manifests due to the way some Python debuggers (specifically, pydevd and ptvsd - as used by PyCharm, PyDev, and VSCode) implement non-cooperative attaching to a running Python process by PID.

A TL;DR take is that those debuggers have to inject a new thread into a running process from the outside, and then run some Python code on that thread. There are OS APIs for such thread injection - e.g. CreateRemoteThread on Windows. There are various tricks that they then have to use to safely acquire GIL and invoke PyEval_InitThreads, but ultimately it comes down to running Python code.

That is the point where this can manifest. Basically, as soon as that injected code (i.e. the actual debugger) imports threading, things break. And advanced debuggers do need background threads for some functionality...

Here are the technical details - i.e. how thread injection happens exactly, and what kind of code it might run - if you're interested.
https://github.com/microsoft/ptvsd/issues/1542

I think that a similar problem can also occur in an embedded Python scenario with multithreading. Consider what happens if the hosted interpreter is initialized from the main thread of the host app - but some Python code is then run from the background thread, and that code happens to be the first in the process to import threading. Then that background thread becomes the "main thread" for threading, with the same results as described above.

The high-level problem, I think, is that there's an inconsistency between what Python itself considers "main thread" (i.e. main_thread in ceval.c, as set by PyEval_InitThreads), and what threading module considers "main thread" (i.e. _main_thread in threading.py). Logically, these should be in sync.

If PyEval_InitThreads is the source of truth, then the existing thread injection technique will "just work" as implemented already in all the aforementioned debuggers. It makes sure to invoke PyEval_InitThreads via Py_AddPendingCall, rather than directly from the background thread, precisely so that the interpreter doesn't get confused.

Furthermore, on 3.7+, PyEval_InitThreads is already automatically invoked by Py_Initialize, and hence when used by python.exe, will mark the actual first thread in the process as the main thread. So, using it a the source of truth would guarantee that attach by thread injection works correctly in all non-embedded Python scenarios.

Apps hosting Python would still need to ensure that they always call Py_Initialize on what they want to be the main thread, as they already have to do; but they wouldn't need to worry about "import threading" anymore.

History
Date	User	Action	Args
2019-06-27 10:06:32	int19h	set	recipients: + int19h, fabioz, aldwinaldwin
2019-06-27 10:06:32	int19h	set	messageid: <1561629992.85.0.963801625789.issue37416@roundup.psfhosted.org>
2019-06-27 10:06:32	int19h	link	issue37416 messages
2019-06-27 10:06:31	int19h	create