Title: Hang on interpreter shutdown if daemon thread prints to stdout
Type: behavior Stage: resolved
Components: Interpreter Core Versions: Python 3.5, Python 3.4
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: marienz, neologix, pitrou, python-dev, tim.peters
Priority: normal Keywords: patch

Created on 2015-01-24 13:52 by marienz, last changed 2015-04-13 17:53 by pitrou. This issue is now closed.

File name Uploaded Description Edit marienz, 2015-01-24 13:52
bufio_lock_finalizing.patch pitrou, 2015-03-08 18:51
bufio_lock_finalizing_fatal_error.patch pitrou, 2015-03-08 19:25
bufio_lock_finalizing_fatal_error2.patch pitrou, 2015-03-18 22:16
Messages (10)
msg234615 - (view) Author: Marien Zwart (marienz) * Date: 2015-01-24 13:52
A script spawning a single daemon thread calling print() in a loop (like the attached) will usually hang on shutdown in Python 3.4.2 and hg rev 8d802fb6ae32. Attaching gdb at that point shows the following:

(gdb) thread apply all bt

Thread 1 (Thread 0x7fd927d58700 (LWP 30274)):
#0  sem_wait () at ../sysdeps/unix/sysv/linux/x86_64/sem_wait.S:85
#1  0x00000000005282fe in PyThread_acquire_lock_timed (lock=0x1c5ea30, microseconds=microseconds@entry=-1, 
    intr_flag=intr_flag@entry=0) at Python/thread_pthread.h:352
#2  0x0000000000528414 in PyThread_acquire_lock (lock=<optimized out>, waitflag=waitflag@entry=1)
    at Python/thread_pthread.h:556
#3  0x0000000000567e4c in _enter_buffered_busy (self=0x7fd927bc2b48) at ./Modules/_io/bufferedio.c:327
#4  buffered_flush (self=0x7fd927bc2b48, args=<optimized out>) at ./Modules/_io/bufferedio.c:874
#5  0x000000000042822a in PyObject_Call (func=0x7fd9277b69d8, arg=<optimized out>, kw=<optimized out>)
    at Objects/abstract.c:2086
#6  0x00000000004290e4 in call_function_tail (args=0x7fd927b8d048, callable=0x7fd9277b69d8) at Objects/abstract.c:2124
#7  callmethod (is_size_t=1, va=0x7fff5c6cf6b0, format=0x0, func=0x7fd9277b69d8) at Objects/abstract.c:2193
#8  _PyObject_CallMethodId_SizeT (o=<optimized out>, name=<optimized out>, format=0x0) at Objects/abstract.c:2279
#9  0x000000000042822a in PyObject_Call (func=0x7fd9277b6990, arg=<optimized out>, kw=<optimized out>)
    at Objects/abstract.c:2086
#10 0x0000000000428cc4 in call_function_tail (args=0x7fd927b8d048, callable=0x7fd9277b6990) at Objects/abstract.c:2124
#11 callmethod (is_size_t=0, va=0x7fff5c6cf7e0, format=0x5b9924 "", func=0x7fd9277b6990) at Objects/abstract.c:2193
#12 _PyObject_CallMethodId (o=o@entry=0x7fd927b5d3a8, name=name@entry=0x862b00 <PyId_flush>, 
    format=format@entry=0x5b9924 "") at Objects/abstract.c:2238
#13 0x000000000050a521 in flush_std_files () at Python/pylifecycle.c:488
#14 0x000000000050a5aa in Py_Finalize () at Python/pylifecycle.c:550
#15 0x000000000041fc92 in Py_Main (argc=-1, argv=0x1) at Modules/main.c:787
#16 0x000000000041be3c in main (argc=2, argv=<optimized out>) at ./Programs/python.c:69

The daemon thread has exited, and the main thread hangs trying to flush stdout.

I haven't fully tracked down what happens here, but I think it's this:

- daemon thread calls ENTER_BUFFERED on stdout
- daemon thread drops the GIL before writing to stdout
- main thread grabs the GIL and starts exiting
- main thread sets _Py_Finalizing, signaling daemon threads to exit
- main thread calls flush_std_files and drops the GIL
- daemon thread grabs the GIL and immediately exits, without reaching LEAVE_BUFFERED
- main thread deadlocks trying to ENTER_BUFFERED the same file

If that is what happens, I don't really see how to fix it (it's an example of daemon threads not releasing their resources, which the documentation warns about). But it's obviously unfortunate if merely writing to stdout/err is such a resource.
msg237551 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2015-03-08 18:42
Your analysis sounds right to me.
msg237552 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2015-03-08 18:51
A possible solution is to tweak the IO lock acquire on shutdown: only wait for a grace period instead of waiting indefinitely. Attached patch demonstrates the solution (and solves the deadlock issue in your script).
msg237553 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2015-03-08 19:05
(note that trying to print a warning or raise an error only makes things worse if the stream subject to deadlock is stderr)
msg237554 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2015-03-08 19:25
Attached patch is stricter and dumps a fatal error after the grace period.
msg237605 - (view) Author: Marien Zwart (marienz) * Date: 2015-03-09 08:47
I'm afraid I don't have real-world code I can confirm is fixed, since I reported this on behalf of someone on irc in #python.

I think I'd prefer the Py_FatalError version of the patch. It's definitely possible to see writes to stdout/stderr at this time. If I read the first version right, it'll succeed the ENTER_BUFFERED without actually grabbing the lock, which seems problematic. But failing ENTER_BUFFERED is probably also more problematic, as it will probably lead to another exception that Python'll try to write to stderr, which fails the same way... If stdout/stderr are in a broken state during shutdown, Python is probably better off calling Py_FatalError rather than intermittently discarding messages.
msg237615 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2015-03-09 09:18
Thank you for the feedback. Yes, I now also think the fatal error is the preferrable way. I'll see if I can add a reliable test.
msg238477 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2015-03-18 22:16
Here is a patch with tests.
msg240681 - (view) Author: Roundup Robot (python-dev) Date: 2015-04-13 17:49
New changeset 2c53a5302058 by Antoine Pitrou in branch '3.4':
Issue #23309: Avoid a deadlock at shutdown if a daemon thread is aborted

New changeset 8a3da91737cf by Antoine Pitrou in branch 'default':
Issue #23309: Avoid a deadlock at shutdown if a daemon thread is aborted
msg240684 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2015-04-13 17:53
Fix pushed! closing.
Date User Action Args
2015-04-13 17:53:25pitrousetstatus: open -> closed
resolution: fixed
messages: + msg240684

stage: patch review -> resolved
2015-04-13 17:49:46python-devsetnosy: + python-dev
messages: + msg240681
2015-03-18 22:16:28pitrousetfiles: + bufio_lock_finalizing_fatal_error2.patch

messages: + msg238477
2015-03-09 09:18:34pitrousetmessages: + msg237615
2015-03-09 08:47:36marienzsetmessages: + msg237605
2015-03-08 19:25:41pitrousetfiles: + bufio_lock_finalizing_fatal_error.patch

messages: + msg237554
2015-03-08 19:05:51pitrousetmessages: + msg237553
2015-03-08 18:51:08pitrousetfiles: + bufio_lock_finalizing.patch

nosy: + neologix
messages: + msg237552

keywords: + patch
stage: patch review
2015-03-08 18:42:14pitrousetnosy: + tim.peters
messages: + msg237551
2015-03-08 16:06:42BreamoreBoysetnosy: + pitrou
2015-01-24 13:52:40marienzcreate