classification
Title: trip_signal() gets NULL tstate on Windows on CTRL+C
Type: crash
Stage: resolved
Components: Interpreter Core
Versions: Python 3.9
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: Alexander Riccio, neonene, vstinner
Priority: normal
Keywords: patch

Created on 2020-03-27 06:07 by Alexander Riccio, last changed 2020-04-08 21:36 by vstinner. This issue is now closed.

Pull Requests
URL Status Linked
PR 19441 merged vstinner, 2020-04-08 20:58
Messages (7)
msg365134 - Author: Alexander Riccio (Alexander Riccio) * Date: 2020-03-27 06:07
While trying to make sense of some static analysis warnings for the Windows console IO module, I Ctrl+C'd in the middle of an intentionally absurd __repr__ output, and on proceeding in the debugger (which treated it as an exception), I immediately hit the assertion right here:

    /* Get the Python thread state using PyGILState API, since
       _PyThreadState_GET() returns NULL if the GIL is released.
       For example, signal.raise_signal() releases the GIL. */
    PyThreadState *tstate = PyGILState_GetThisThreadState();
    assert(tstate != NULL);

...With the stacktrace: 
 	ucrtbased.dll!issue_debug_notification(const wchar_t * const message) Line 28	C++
 	ucrtbased.dll!__acrt_report_runtime_error(const wchar_t * message) Line 154	C++
 	ucrtbased.dll!abort() Line 51	C++
 	ucrtbased.dll!common_assert_to_stderr_direct(const wchar_t * const expression, const wchar_t * const file_name, const unsigned int line_number) Line 161	C++
 	ucrtbased.dll!common_assert_to_stderr<wchar_t>(const wchar_t * const expression, const wchar_t * const file_name, const unsigned int line_number) Line 175	C++
 	ucrtbased.dll!common_assert<wchar_t>(const wchar_t * const expression, const wchar_t * const file_name, const unsigned int line_number, void * const return_address) Line 420	C++
 	ucrtbased.dll!_wassert(const wchar_t * expression, const wchar_t * file_name, unsigned int line_number) Line 443	C++
>	python39_d.dll!trip_signal(int sig_num) Line 266	C
 	python39_d.dll!signal_handler(int sig_num) Line 342	C
 	ucrtbased.dll!ctrlevent_capture(const unsigned long ctrl_type) Line 206	C++
 	KernelBase.dll!_CtrlRoutine@4()	Unknown
 	kernel32.dll!@BaseThreadInitThunk@12()	Unknown
 	ntdll.dll!__RtlUserThreadStart()	Unknown
 	ntdll.dll!__RtlUserThreadStart@8()	Unknown


...I'm not entirely sure why this happened, but I can tell a few things. _PyRuntime.gilstate.autoInterpreterState is NOT null; in fact, the gilstate object is as displayed in my watch window:

-		_PyRuntime.gilstate	{check_enabled=1 tstate_current={_value=0 } getframe=0x79e3a570 {python39_d.dll!threadstate_getframe(_ts *)} ...}	_gilstate_runtime_state
		check_enabled	1	int
+		tstate_current	{_value=0 }	_Py_atomic_address
		getframe	0x79e3a570 {python39_d.dll!threadstate_getframe(_ts *)}	_frame *(*)(_ts *)
-		autoInterpreterState	0x00e5eff8 {next=0x00000000 <NULL> tstate_head=0x00e601c0 {prev=0x00000000 <NULL> next=0x00000000 <NULL> ...} ...}	_is *
+		next	0x00000000 <NULL>	_is *
+		tstate_head	0x00e601c0 {prev=0x00000000 <NULL> next=0x00000000 <NULL> interp=0x00e5eff8 {next=0x00000000 <NULL> ...} ...}	_ts *
+		runtime	0x7a0e2118 {python39_d.dll!pyruntimestate _PyRuntime} {preinitializing=0 preinitialized=1 core_initialized=...}	pyruntimestate *
		id	0	__int64
		id_refcount	-1	__int64
		requires_idref	0	int
		id_mutex	0x00000000	void *
		finalizing	0	int
+		ceval	{tracing_possible=0 eval_breaker={_value=0 } pending={finishing=0 lock=0x00e59390 calls_to_do={_value=...} ...} }	_ceval_state
+		gc	{trash_delete_later=0x00000000 <NULL> trash_delete_nesting=0 enabled=1 ...}	_gc_runtime_state
+		modules	0x00bf1228 {ob_refcnt=3 ob_type=0x7a0b1178 {python39_d.dll!_typeobject PyDict_Type} {ob_base={ob_base=...} ...} }	_object *
+		modules_by_index	0x00750058 {ob_refcnt=1 ob_type=0x7a0b8210 {python39_d.dll!_typeobject PyList_Type} {ob_base={ob_base=...} ...} }	_object *
+		sysdict	0x00bf1298 {ob_refcnt=2 ob_type=0x7a0b1178 {python39_d.dll!_typeobject PyDict_Type} {ob_base={ob_base=...} ...} }	_object *
+		builtins	0x00bf1f48 {ob_refcnt=88 ob_type=0x7a0b1178 {python39_d.dll!_typeobject PyDict_Type} {ob_base={ob_base=...} ...} }	_object *
+		importlib	0x00c0df60 {ob_refcnt=28 ob_type=0x7a0b92d0 {python39_d.dll!_typeobject PyModule_Type} {ob_base={ob_base=...} ...} }	_object *
		num_threads	0	long
		pythread_stacksize	0	unsigned int
+		codec_search_path	0x00c4a260 {ob_refcnt=1 ob_type=0x7a0b8210 {python39_d.dll!_typeobject PyList_Type} {ob_base={ob_base=...} ...} }	_object *
+		codec_search_cache	0x00c1f0d8 {ob_refcnt=1 ob_type=0x7a0b1178 {python39_d.dll!_typeobject PyDict_Type} {ob_base={ob_base=...} ...} }	_object *
+		codec_error_registry	0x00c14f10 {ob_refcnt=1 ob_type=0x7a0b1178 {python39_d.dll!_typeobject PyDict_Type} {ob_base={ob_base=...} ...} }	_object *
		codecs_initialized	1	int
+		fs_codec	{encoding=0x00e5aa40 "utf-8" utf8=1 errors=0x00e89ea8 "surrogatepass" ...}	<unnamed-tag>
+		config	{_config_init=2 isolated=0 use_environment=1 ...}	PyConfig
+		dict	0x00000000 <NULL>	_object *
+		builtins_copy	0x00c00a08 {ob_refcnt=1 ob_type=0x7a0b1178 {python39_d.dll!_typeobject PyDict_Type} {ob_base={ob_base=...} ...} }	_object *
+		import_func	0x00bfd900 {ob_refcnt=4 ob_type=0x7a0b90d0 {python39_d.dll!_typeobject PyCFunction_Type} {ob_base={ob_base=...} ...} }	_object *
		eval_frame	0x79a52577 {python39_d.dll!__PyEval_EvalFrameDefault}	_object *(*)(_ts *, _frame *, int)
		co_extra_user_count	0	int
+		co_extra_freefuncs	0x00e5f308 {0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, ...}	void(*)(void *)[255]
		pyexitfunc	0x79af7320 {python39_d.dll!atexit_callfuncs(_object *)}	void(*)(_object *)
+		pyexitmodule	0x00f5d690 {ob_refcnt=16 ob_type=0x7a0b92d0 {python39_d.dll!_typeobject PyModule_Type} {ob_base={ob_base=...} ...} }	_object *
		tstate_next_unique_id	1	unsigned __int64
+		warnings	{filters=0x0078edc8 {ob_refcnt=3 ob_type=0x7a0b8210 {python39_d.dll!_typeobject PyList_Type} {ob_base=...} } ...}	_warnings_runtime_state
+		audit_hooks	0x00000000 <NULL>	_object *
+		parser	{listnode={level=0 atbol=0 } }	<unnamed-tag>
+		small_ints	0x00e5f734 {0x00755868 {ob_base={ob_base={ob_refcnt=2 ob_type=0x7a0b8738 {python39_d.dll!_typeobject PyLong_Type} {...} } ...} ...}, ...}	_longobject *[262]
+		autoTSSkey	{_is_initialized=1 _key=5 }	_Py_tss_t


This looks to me like there's some kind of race condition in the thread local storage. Later, when TlsGetValue is called in PyThread_tss_get with a key of 5, the error code is lost, so I don't even know the exact reported error, and apparently 0 (nullptr) is a valid return value anyway *shrug*. If I'm correctly decoding the address of the TLS slot in the TEB (I think *(unsigned int*)(void*)((@fs+0xe10)+(20)) is the correct address for key=5, at 4 bytes per slot?), then there is actually a null tstate there. Not sure why.
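The offset arithmetic in that question can be checked with a few lines of C. This is only a sketch: it assumes a 32-bit process in which the TEB's inline TLS slot array starts at offset 0xE10 (the value used in the expression above) and each slot is one 4-byte pointer. `tls_slot_offset` and `print_tls_slot_offset` are hypothetical helpers, not Windows or CPython functions. Keys are zero-based indexes, so key 5 lands 20 bytes into the array.

```c
#include <assert.h>
#include <stdio.h>

/* Assumed layout for a 32-bit process: the TEB's inline TLS slot array
   starts at offset 0xE10 and each slot holds one 4-byte pointer.
   Both constants mirror the expression used in the report. */
#define TEB32_TLS_SLOTS_OFFSET 0xE10u
#define TEB32_POINTER_SIZE     4u

/* Hypothetical helper: byte offset of TLS slot `key` from the TEB base. */
unsigned int
tls_slot_offset(unsigned int key)
{
    return TEB32_TLS_SLOTS_OFFSET + key * TEB32_POINTER_SIZE;
}

/* Hypothetical helper: print the offset for one key,
   for example autoTSSkey._key == 5 from the watch window. */
void
print_tls_slot_offset(unsigned int key)
{
    printf("key %u -> TEB+0x%X\n", key, tls_slot_offset(key));
}
```

Calling `print_tls_slot_offset(5)` reports the slot that `(@fs+0xe10)+(20)` points at, so under these assumptions the decoding in the report looks consistent with key=5.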

This is in a relatively recent (<1wk old) branch of the code, with some non-behavioral tweaks to add annotations in while I'm bug hunting, so it shouldn't matter. I'm not sure if I can reproduce this, but it happened, so there's a bug somewhere.


What else? The signal number that tripped this was 2 (SIGINT), which makes sense.

There are other threads active, so maybe that's why? Was the TLS never initialized for that thread? Here's the dump from the Visual Studio threads window; 22316 is the active thread.

Not Flagged		10360	0	Main Thread	Main Thread	python39_d.dll!_PyOS_WindowsConsoleReadline
 	 	 	 	 	 	ntdll.dll!_NtDeviceIoControlFile@40()
 	 	 	 	 	 	KernelBase.dll!ConsoleCallServerGeneric()
 	 	 	 	 	 	KernelBase.dll!_ReadConsoleInternal@24()
 	 	 	 	 	 	KernelBase.dll!_ReadConsoleW@20()
 	 	 	 	 	 	python39_d.dll!_PyOS_WindowsConsoleReadline(void * hStdIn) Line 120
 	 	 	 	 	 	python39_d.dll!PyOS_StdioReadline(_iobuf * sys_stdin, _iobuf * sys_stdout, const char * prompt) Line 253
 	 	 	 	 	 	python39_d.dll!PyOS_Readline(_iobuf * sys_stdin, _iobuf * sys_stdout, const char * prompt) Line 358
 	 	 	 	 	 	python39_d.dll!tok_nextc(tok_state * tok) Line 856
 	 	 	 	 	 	python39_d.dll!tok_get(tok_state * tok, const char * * p_start, const char * * p_end) Line 1166
 	 	 	 	 	 	python39_d.dll!PyTokenizer_Get(tok_state * tok, const char * * p_start, const char * * p_end) Line 1813
 	 	 	 	 	 	python39_d.dll!parsetok(tok_state * tok, grammar * g, int start, perrdetail * err_ret, int * flags) Line 253
 	 	 	 	 	 	python39_d.dll!PyParser_ParseFileObject(_iobuf * fp, _object * filename, const char * enc, grammar * g, int start, const char * ps1, const char * ps2, perrdetail * err_ret, int * flags) Line 188
 	 	 	 	 	 	python39_d.dll!PyParser_ASTFromFileObject(_iobuf * fp, _object * filename, const char * enc, int start, const char * ps1, const char * ps2, PyCompilerFlags * flags, int * errcode, _arena * arena) Line 1388
 	 	 	 	 	 	python39_d.dll!PyRun_InteractiveOneObjectEx(_iobuf * fp, _object * filename, PyCompilerFlags * flags) Line 240
 	 	 	 	 	 	python39_d.dll!PyRun_InteractiveLoopFlags(_iobuf * fp, const char * filename_str, PyCompilerFlags * flags) Line 122
 	 	 	 	 	 	python39_d.dll!PyRun_AnyFileExFlags(_iobuf * fp, const char * filename, int closeit, PyCompilerFlags * flags) Line 81
 	 	 	 	 	 	python39_d.dll!pymain_run_stdin(PyConfig * config, PyCompilerFlags * cf) Line 467
 	 	 	 	 	 	python39_d.dll!pymain_run_python(int * exitcode) Line 556
 	 	 	 	 	 	python39_d.dll!Py_RunMain() Line 632
 	 	 	 	 	 	python39_d.dll!pymain_main(_PyArgv * args) Line 663
 	 	 	 	 	 	python39_d.dll!Py_Main(int argc, wchar_t * * argv) Line 674
 	 	 	 	 	 	python_d.exe!wmain(int argc, wchar_t * * argv) Line 10
 	 	 	 	 	 	python_d.exe!invoke_main() Line 90
 	 	 	 	 	 	python_d.exe!__scrt_common_main_seh() Line 288
 	 	 	 	 	 	python_d.exe!__scrt_common_main() Line 331
 	 	 	 	 	 	python_d.exe!wmainCRTStartup() Line 17
 	 	 	 	 	 	kernel32.dll!@BaseThreadInitThunk@12()
 	 	 	 	 	 	ntdll.dll!__RtlUserThreadStart()
 	 	 	 	 	 	ntdll.dll!__RtlUserThreadStart@8()

Not Flagged		14944	0	Worker Thread	ntdll.dll!_TppWorkerThread@4()	ntdll.dll!_NtWaitForWorkViaWorkerFactory@20
 	 	 	 	 	 	ntdll.dll!_NtWaitForWorkViaWorkerFactory@20()
 	 	 	 	 	 	ntdll.dll!_TppWorkerThread@4()
 	 	 	 	 	 	kernel32.dll!@BaseThreadInitThunk@12()
 	 	 	 	 	 	ntdll.dll!__RtlUserThreadStart()
 	 	 	 	 	 	ntdll.dll!__RtlUserThreadStart@8()

Not Flagged	>	22316	0	Worker Thread	KernelBase.dll!_CtrlRoutine@4()	ucrtbased.dll!issue_debug_notification
 	 	 	 	 	 	ucrtbased.dll!issue_debug_notification(const wchar_t * const message) Line 28
 	 	 	 	 	 	ucrtbased.dll!__acrt_report_runtime_error(const wchar_t * message) Line 154
 	 	 	 	 	 	ucrtbased.dll!abort() Line 51
 	 	 	 	 	 	ucrtbased.dll!common_assert_to_stderr_direct(const wchar_t * const expression, const wchar_t * const file_name, const unsigned int line_number) Line 161
 	 	 	 	 	 	ucrtbased.dll!common_assert_to_stderr<wchar_t>(const wchar_t * const expression, const wchar_t * const file_name, const unsigned int line_number) Line 175
 	 	 	 	 	 	ucrtbased.dll!common_assert<wchar_t>(const wchar_t * const expression, const wchar_t * const file_name, const unsigned int line_number, void * const return_address) Line 420
 	 	 	 	 	 	ucrtbased.dll!_wassert(const wchar_t * expression, const wchar_t * file_name, unsigned int line_number) Line 443
 	 	 	 	 	 	python39_d.dll!trip_signal(int sig_num) Line 266
 	 	 	 	 	 	python39_d.dll!signal_handler(int sig_num) Line 342
 	 	 	 	 	 	ucrtbased.dll!ctrlevent_capture(const unsigned long ctrl_type) Line 206
 	 	 	 	 	 	KernelBase.dll!_CtrlRoutine@4()
 	 	 	 	 	 	kernel32.dll!@BaseThreadInitThunk@12()
 	 	 	 	 	 	ntdll.dll!__RtlUserThreadStart()
 	 	 	 	 	 	ntdll.dll!__RtlUserThreadStart@8()

Not Flagged		10180	0	Worker Thread	ntdll.dll!_TppWorkerThread@4()	ntdll.dll!_NtWaitForWorkViaWorkerFactory@20
 	 	 	 	 	 	ntdll.dll!_NtWaitForWorkViaWorkerFactory@20()
 	 	 	 	 	 	ntdll.dll!_TppWorkerThread@4()
 	 	 	 	 	 	kernel32.dll!@BaseThreadInitThunk@12()
 	 	 	 	 	 	ntdll.dll!__RtlUserThreadStart()
 	 	 	 	 	 	ntdll.dll!__RtlUserThreadStart@8()

Not Flagged		28940	0	Worker Thread	ntdll.dll!_TppWorkerThread@4()	ntdll.dll!_NtWaitForWorkViaWorkerFactory@20
 	 	 	 	 	 	ntdll.dll!_NtWaitForWorkViaWorkerFactory@20()
 	 	 	 	 	 	ntdll.dll!_TppWorkerThread@4()
 	 	 	 	 	 	kernel32.dll!@BaseThreadInitThunk@12()
 	 	 	 	 	 	ntdll.dll!__RtlUserThreadStart()
 	 	 	 	 	 	ntdll.dll!__RtlUserThreadStart@8()

Not Flagged		9396	0	Worker Thread	ntdll.dll!_TppWorkerThread@4()	ntdll.dll!_NtWaitForWorkViaWorkerFactory@20
 	 	 	 	 	 	ntdll.dll!_NtWaitForWorkViaWorkerFactory@20()
 	 	 	 	 	 	ntdll.dll!_TppWorkerThread@4()
 	 	 	 	 	 	kernel32.dll!@BaseThreadInitThunk@12()
 	 	 	 	 	 	ntdll.dll!__RtlUserThreadStart()
 	 	 	 	 	 	ntdll.dll!__RtlUserThreadStart@8()

Not Flagged		28960	0	Worker Thread	ntdll.dll!_TppWorkerThread@4()	ntdll.dll!_NtWaitForWorkViaWorkerFactory@20
 	 	 	 	 	 	ntdll.dll!_NtWaitForWorkViaWorkerFactory@20()
 	 	 	 	 	 	ntdll.dll!_TppWorkerThread@4()
 	 	 	 	 	 	kernel32.dll!@BaseThreadInitThunk@12()
 	 	 	 	 	 	ntdll.dll!__RtlUserThreadStart()
 	 	 	 	 	 	ntdll.dll!__RtlUserThreadStart@8()



Anyways, I hope that's a useful report for an obscure bug.
msg365135 - Author: Alexander Riccio (Alexander Riccio) * Date: 2020-03-27 06:08
Lmao the name mangling comes up as a mailto. That's interesting.
msg365136 - Author: Alexander Riccio (Alexander Riccio) * Date: 2020-03-27 06:13
Hmmm, happens every time I interrupt while attached. Is there some obvious gotcha in the docs that I'm missing?
msg365879 - Author: neonene (neonene) * Date: 2020-04-06 21:04
On Windows, PyGILState_GetThisThreadState() returns NULL when a ^C interrupt occurs. The NULL comes from the TlsGetValue() Windows API call, and I don't think the OS's behavior is wrong.
In trip_signal(), the crash can be avoided by skipping PyEval_SignalReceived() if tstate is invalid. But I'm not sure the skip itself is OK.
msg365995 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-04-08 17:43
Oh oh. This issue is quite annoying for my work on subinterpreters.

I introduced this bug when I moved pending calls from _PyRuntimeState to PyInterpreterState in bpo-39984. _PyEval_AddPendingCall() now requires tstate to add a function to pending calls of the proper interpreter.

The problem on Windows is that each CTRL+C is handled in a different thread. Here is a modified Python 3.8 which dumps the thread identifier ("tid") at startup and when trip_signal() is triggered by CTRL+C:

vstinner@WIN C:\vstinner\python\3.8>python
Running Release|x64 interpreter...
pymain_main: tid=1788
Python 3.8.1+ (heads/3.8-dirty:19be85c765, Apr  8 2020, 19:35:20) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> ^C
trip_signal: tid=6996 tstate=0000000000000000
KeyboardInterrupt

>>> ^C
trip_signal: tid=2384 tstate=0000000000000000
KeyboardInterrupt

>>> ^C
trip_signal: tid=32 tstate=0000000000000000
KeyboardInterrupt

When trip_signal() is called, PyGILState_GetThisThreadState() returns NULL.
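Why a fresh thread reads NULL can be illustrated with a toy model of one thread-specific-storage key. This is a sketch, not the Windows TLS API or CPython's implementation: `model_tss_key`, `model_tss_set`, `model_tss_get` and `model_handler_sees_null` are hypothetical names, and threads are represented by plain array indexes (0 for the main thread) rather than real OS threads.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_THREADS 4

/* Toy stand-in for one thread-specific-storage key: one slot per thread.
   Every slot starts out NULL, like a key a thread has never written to. */
typedef struct {
    void *slot[MODEL_MAX_THREADS];
} model_tss_key;

/* Store `value` in the slot owned by thread `thread_index`. */
void
model_tss_set(model_tss_key *key, int thread_index, void *value)
{
    key->slot[thread_index] = value;
}

/* Read the slot owned by thread `thread_index`. Values stored by other
   threads are invisible here; that is the point of per-thread storage. */
void *
model_tss_get(const model_tss_key *key, int thread_index)
{
    return key->slot[thread_index];
}

/* Returns 1 if the main thread (index 0) sees its own value while a
   "handler" thread (index 1) that never stored anything reads NULL. */
int
model_handler_sees_null(void)
{
    static int main_tstate;   /* placeholder object for the main thread state */
    model_tss_key key = {{NULL}};

    model_tss_set(&key, 0, &main_tstate);   /* main thread registers its state */

    return model_tss_get(&key, 0) == &main_tstate
        && model_tss_get(&key, 1) == NULL;  /* new CTRL+C thread: empty slot */
}
```

In this model the NULL is not an error on the storage side: the handler thread simply never stored a thread state under that key, which matches the `tstate=0000000000000000` lines in the transcript.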
msg366011 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-04-08 21:35
New changeset b54a99d6432de93de85be2b42a63774f8b4581a0 by Victor Stinner in branch 'master':
bpo-40082: trip_signal() uses the main interpreter (GH-19441)
https://github.com/python/cpython/commit/b54a99d6432de93de85be2b42a63774f8b4581a0
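The idea named in the changeset title, using the main interpreter instead of the calling thread's state, can be sketched with a simplified model. This is not the actual patch: every type and function below (`model_interp`, `model_tstate`, `model_runtime`, `model_trip_signal_via_tstate`, `model_trip_signal_via_main`) is a hypothetical stand-in, and the single `signal_pending` flag is an assumption standing in for whatever the real code uses to notify the interpreter.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins; the real CPython structures are much larger. */
typedef struct {
    int signal_pending;          /* models "a signal arrived" for this interpreter */
} model_interp;

typedef struct {
    model_interp *interp;        /* interpreter this thread state belongs to */
} model_tstate;

typedef struct {
    model_interp *main_interp;   /* assumed valid once the runtime is initialized */
} model_runtime;

/* Before: the interpreter is reached through the calling thread's state,
   so a NULL tstate (a fresh CTRL+C thread) cannot be served.
   Returns 0 on success, -1 where the reported build hit
   assert(tstate != NULL). */
int
model_trip_signal_via_tstate(model_tstate *tstate)
{
    if (tstate == NULL) {
        return -1;
    }
    tstate->interp->signal_pending = 1;
    return 0;
}

/* After: the main interpreter is reached through the runtime, so it no
   longer matters which thread the handler happens to run on. */
int
model_trip_signal_via_main(model_runtime *runtime)
{
    runtime->main_interp->signal_pending = 1;
    return 0;
}
```

With a NULL thread state the first variant has nothing to work with, while the second still marks the main interpreter, which is the behavior the CTRL+C threads on Windows need.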
msg366012 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-04-08 21:36
Thanks for the bug report, Alexander Riccio. I fixed the bug in master. Python 3.8 is not affected.
History
Date User Action Args
2020-04-08 21:36:15 vstinner set status: open -> closed
resolution: fixed
messages: + msg366012
stage: patch review -> resolved
2020-04-08 21:35:11 vstinner set messages: + msg366011
2020-04-08 20:58:08 vstinner set keywords: + patch
stage: patch review
pull_requests: + pull_request18795
2020-04-08 17:43:53 vstinner set title: Assertion failure in trip_signal -> trip_signal() gets NULL tstate on Windows on CTRL+C
2020-04-08 17:43:34 vstinner set messages: + msg365995
2020-04-06 21:10:23 vstinner set nosy: + vstinner
2020-04-06 21:04:38 neonene set nosy: + neonene
messages: + msg365879
2020-03-27 06:13:03 Alexander Riccio set messages: + msg365136
2020-03-27 06:08:22 Alexander Riccio set messages: + msg365135
2020-03-27 06:07:12 Alexander Riccio create