classification
Title: Clear state of threads earlier in Python shutdown
Type: Stage: resolved
Components: Versions: Python 3.4
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: Arfrever, neologix, pitrou, python-dev, serhiy.storchaka, vajrasky, vstinner
Priority: normal Keywords: patch

Created on 2013-10-31 23:39 by vstinner, last changed 2014-03-17 06:31 by python-dev. This issue is now closed.

Files
File name Uploaded Description Edit
finalize_threads.patch vstinner, 2013-10-31 23:39 review
finalize_threads-2.patch vstinner, 2013-11-04 20:56 review
finalize_threads-3.patch vstinner, 2013-11-12 02:13 review
fix_typo_19466.patch vajrasky, 2013-11-12 16:17 review
Messages (22)
msg201860 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2013-10-31 23:39
Each Python thread holds references to objects, in its current frame for example. At Python shutdown, clearing threads state happens very late: the import machinery is already dead, types are finalized, etc. If a thread has an object with a destructor, bad things may happen.

For example, when open files are destroyed, a warning is emitted. The problem is that you cannot emits warnings because the warnings module has been unloaded, and so you miss warnings. See the issue #19442 for this specific case.

It is possible to clear the Python threads earlier since Python 3.2, because Python threads will now exit cleanly when they try to acquire the GIL: see PyEval_RestoreThread(). The value of the tstate pointer is used in PyEval_RestoreThread(), but the pointer is not dereferenced (the content of a Python state is not read). So it is even possible to free the memory of the threads state (not only to clear the threads state).

Attached patch implements destroy the all threads except the current thread, and clear the state of the current thread.

The patch calls also the garbage collector before flushing stdout and stderr, and disable signal handlers just before PyImport_Cleanup(). The main idea is to reorder functions like this:

- clear state of all threads to release strong references -> may call destructores
- force a garbage collector to release more references -> may call destructores
- flush stdout and stderr -> write pending warnings and any other buffered messages
- disable signal handler -> only at the end so previous steps can still be interrupted by CTRL+c

The side effect of the patch is that the destructor of destroyed objects are now called, especially for daemon threads. If you try the warn_shutdown.py script attached to #19442, you now get the warning with the patch! As a result, test_threading.test_4_daemon_threads() is also modified by my patch to ignore warnings :-)

If I understood correctly, the patch only has an impact on daemon threads, the behaviour of classic threads is unchanged.
msg202155 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-11-04 19:07
Is it possible to write a test for this behavior?
msg202156 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-11-04 19:12
I'm afraid clearing thread states is a bit too brutal. What if some destructor relies on contents of the thread states (e.g. thread locals)?
msg202164 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2013-11-04 20:22
2013/11/4 Antoine Pitrou <report@bugs.python.org>:
> I'm afraid clearing thread states is a bit too brutal. What if some destructor relies on contents of the thread states (e.g. thread locals)?

When Py_Finalize() is called, only one Python thread hold the GIL.
After _Py_Finalizing=tstate is set, no other thread can hold the GIL.
If another Python tries to lock the GIL, it is "killed" by
PyEval_RestoreThread().

Is that correct? If yes, in which thread would the destructor be
called? Would it read Python thread locals without holding the GIL?
msg202165 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2013-11-04 20:24
2013/11/4 Serhiy Storchaka <report@bugs.python.org>:
> Is it possible to write a test for this behavior?

It is possible to test it manually using warn_shutdown.py attached to
#19442. The warnings module may be used to test the change.
msg202166 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-11-04 20:28
I am referring to the following part of your patch:

+    /* Clear the state of the current thread */
+    PyThreadState_Clear(tstate);

You are clearing the thread state of the currently executing thread, which doesn't sound right.
msg202169 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2013-11-04 20:56
> You are clearing the thread state of the currently executing
> thread, which doesn't sound right.

Oh, I didn't realize that PyThreadState_Clear() clears also Python thread locals.

Here is a new patch without PyThreadState_Clear(tstate) and with two unit tests:

- ensure that Python thread locals are not destroyed before destructors are called
- ensure that object destructors are called before Python thread states are destroyed
msg202667 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2013-11-12 02:13
neologix made a review on rietveld:
http://bugs.python.org/review/19466/#ps9818

There was an issue in finalize_threads-2.patch: the test didn't pass in release mode. I fixed it by adding -Wd to the command line option.

I also replaced threading.Lock with threading.Event in the unit test to synchronize the two threads.

=> finalize_threads-3.patch
msg202699 - (view) Author: Roundup Robot (python-dev) Date: 2013-11-12 15:45
New changeset c2a13acd5e2b by Victor Stinner in branch 'default':
Close #19466: Clear the frames of daemon threads earlier during the Python
http://hg.python.org/cpython/rev/c2a13acd5e2b
msg202704 - (view) Author: Vajrasky Kok (vajrasky) * Date: 2013-11-12 16:17
Typo:

s/immediatly/immediately.

I attached the patch.
msg202705 - (view) Author: Roundup Robot (python-dev) Date: 2013-11-12 16:19
New changeset 10a8e676b87b by Victor Stinner in branch 'default':
Issue #19466: Fix typo. Patch written by Vajrasky Kok.
http://hg.python.org/cpython/rev/10a8e676b87b
msg202912 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2013-11-15 00:47
The fix raised a new issue: #19565.

I also saw a crash on Windows which is probably related:

http://buildbot.python.org/all/builders/x86 Windows Server 2003 [SB] 3.x/builds/1717/steps/test/logs/stdio 

======================================================================
FAIL: test_4_daemon_threads (test.test_threading.ThreadJoinOnShutdown)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "E:\Data\buildslave\cpython\3.x.snakebite-win2k3r2sp2-x86\build\lib\test\test_threading.py", line 783, in test_4_daemon_threads
    rc, out, err = assert_python_ok('-c', script)
  File "E:\Data\buildslave\cpython\3.x.snakebite-win2k3r2sp2-x86\build\lib\test\script_helper.py", line 69, in assert_python_ok
    return _assert_python(True, *args, **env_vars)
  File "E:\Data\buildslave\cpython\3.x.snakebite-win2k3r2sp2-x86\build\lib\test\script_helper.py", line 55, in _assert_python
    "stderr follows:\n%s" % (rc, err.decode('ascii', 'ignore')))
AssertionError: Process return code is 3221225477, stderr follows:


----------------------------------------------------------------------
msg206009 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2013-12-13 01:47
I'm unable to reproduce the test_4_daemon_threads failure, can anyone try on Windows?
msg206028 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2013-12-13 08:30
Hum...
Correct me if I'm wrong, but destroying the thread state of daemon threads
while they're running is really a bad idea in fact: for example, if
warnings are now emitted for unclosed file objects, this means that the
file object, and all associated buffers, are destroyed.
But even though the daemon thread doesn't hold the GIL, he might still be
using this object.

For example, if we have:
daemon thread:
readinto:
    release_gil
    read(fd, fileobj->buffer, fileobj->size)
    acquire_gil

and the main thread exits, then fileobj will be deallocated.

It's actually fairly easy to write a short crasher launching a deamon
thread reading from e.g. /dev/zero in loop, with the main thread exiting
right in the middle.
msg206032 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2013-12-13 09:35
"It's actually fairly easy to write a short crasher launching a deamon
thread reading from e.g. /dev/zero in loop, with the main thread exiting
right in the middle."

I'm unable to write such crasher, can you please give an example? Are you able to crash Python on Linux or on Python 3.3?

Does the threading test makes sense?
msg210993 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-02-11 17:52
I didn't see test_threading failing anymore recently, so I'm closing the issue. Thanks.
msg210998 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2014-02-11 18:55
> I didn't see test_threading failing anymore recently, so I'm closing the
issue. Thanks.

Hm...

Could someone review my message http://bugs.python.org/msg206028?

Because AFAICT, this will lead to random crashes with daemon threads.
msg211011 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-02-11 20:33
> Could someone review my message http://bugs.python.org/msg206028?

I replied with questions and you didn't reply :-)
msg211072 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-02-12 09:55
A new bug was found because of changes of this issue: #20526. IMO it is safer to revert changes of this issue in Python 3.4, it's too late to analyze such tricky bug. See #20526 for the discussion.
msg211149 - (view) Author: Roundup Robot (python-dev) Date: 2014-02-13 11:52
New changeset 1166b3321012 by Victor Stinner in branch 'default':
Issue #20526, #19466: Revert changes of issue #19466 which introduces a
http://hg.python.org/cpython/rev/1166b3321012
msg211154 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-02-13 13:35
I reverted the changes because it introduced regressions.
msg213823 - (view) Author: Roundup Robot (python-dev) Date: 2014-03-17 06:31
New changeset 9ce58a73b6b5 by Victor Stinner in branch '3.4':
Issue #20526, #19466: Revert changes of issue #19466 which introduces a
http://hg.python.org/cpython/rev/9ce58a73b6b5
History
Date User Action Args
2014-03-17 06:31:03python-devsetmessages: + msg213823
2014-02-13 13:35:11vstinnersetstatus: open -> closed
resolution: fixed
messages: + msg211154
2014-02-13 11:52:29python-devsetmessages: + msg211149
2014-02-12 09:55:27vstinnersetstatus: closed -> open
resolution: fixed -> (no value)
messages: + msg211072
2014-02-11 20:33:45vstinnersetmessages: + msg211011
2014-02-11 18:55:56neologixsetmessages: + msg210998
2014-02-11 17:52:39vstinnersetstatus: open -> closed
resolution: fixed
messages: + msg210993
2013-12-13 09:35:20vstinnersetmessages: + msg206032
2013-12-13 08:30:38neologixsetmessages: + msg206028
2013-12-13 01:47:42vstinnersetmessages: + msg206009
2013-11-15 00:47:29vstinnersetstatus: closed -> open
resolution: fixed -> (no value)
messages: + msg202912
2013-11-12 16:19:00python-devsetmessages: + msg202705
2013-11-12 16:17:01vajraskysetfiles: + fix_typo_19466.patch
nosy: + vajrasky
messages: + msg202704

2013-11-12 15:45:41python-devsetstatus: open -> closed

nosy: + python-dev
messages: + msg202699

resolution: fixed
stage: resolved
2013-11-12 02:13:13vstinnersetfiles: + finalize_threads-3.patch

messages: + msg202667
2013-11-04 20:56:45vstinnersetfiles: + finalize_threads-2.patch

messages: + msg202169
2013-11-04 20:28:21pitrousetnosy: + neologix
messages: + msg202166
2013-11-04 20:24:01vstinnersetmessages: + msg202165
2013-11-04 20:22:12vstinnersetmessages: + msg202164
2013-11-04 19:12:17pitrousetmessages: + msg202156
2013-11-04 19:07:31serhiy.storchakasetmessages: + msg202155
2013-11-04 10:47:29vstinnersetnosy: + serhiy.storchaka
2013-11-01 00:38:57Arfreversetnosy: + Arfrever
2013-10-31 23:39:46vstinnercreate