classification
Title: Tkinter hangs if using multiple threads and event handlers
Type: crash Stage: resolved
Components: Tkinter Versions: Python 3.8, Python 3.7, Python 3.6
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: Nosy List: Ivan.Pozdeev, serhiy.storchaka, terry.reedy
Priority: normal Keywords:

Created on 2018-05-02 17:51 by Ivan.Pozdeev, last changed 2018-05-06 21:20 by Ivan.Pozdeev. This issue is now closed.

Files
File name Uploaded Description
TkinterHanders3.py Ivan.Pozdeev, 2018-05-02 17:51
trace.zip Ivan.Pozdeev, 2018-05-02 17:52 trace log
trace.py Ivan.Pozdeev, 2018-05-02 17:52 modified Lib/trace.py to show thread IDs
TkinterHanders32.py Ivan.Pozdeev, 2018-05-04 02:34 fixed script
Messages (11)
msg316082 - (view) Author: Ivan Pozdeev (Ivan.Pozdeev) * Date: 2018-05-02 17:51
With threaded Tkinter, TkinterHanders3.py from https://bugs.python.org/issue33257 (attached) hangs.

Tracing with thread_debug and a modified trace.py (to show TIDs, attached) shows that worker threads are waiting for the Tcl lock while the main thread that holds it keeps waiting for some other lock with a strange timeout:

19000: PyThread_acquire_lock_timed(00000000001B0F80, 0) called
19000: PyThread_acquire_lock(00000000001B0F80, 0) -> 0
19000: PyThread_acquire_lock_timed(00000000001B0F80, -1000000) called

Tested on 3.6 head, win7 x64, debug build.
msg316083 - (view) Author: Ivan Pozdeev (Ivan.Pozdeev) * Date: 2018-05-02 18:07
> worker threads are waiting for the Tcl lock

Pardon. They are waiting for Tkapp_ThreadSend()s into the main thread to return. The effect is still the same.
msg316090 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2018-05-02 23:27
So it seems threads and Tkinter events don't mix. This doesn't surprise me much. (Similar issues can occur when mixing threads and asyncio if you don't follow the documentation's advice about how to send events across threads.)

Perhaps event_generate() needs to be more careful with locking?

Do you have a suggestion for what to do short of dropping Tkinter support?
msg316102 - (view) Author: Ivan Pozdeev (Ivan.Pozdeev) * Date: 2018-05-03 03:05
> Do you have a suggestion for what to do short of dropping Tkinter support?

Didn't really look into this.
At first glance, from the trace log, the main thread seems to grab a lock at some initial point, and then tries to grab it again when running an event handler. So making the lock reentrant may help.
msg316104 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2018-05-03 03:30
I guess nobody gives a damn.
msg316132 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2018-05-03 19:43
From 1994 to 2017, _tkinter.c has received a steady flow of multiple revisions each year, for 333 total, by maybe 20 people including Guido.  This makes it one of the more actively maintained files and indicates the opposite of indifference and not caring.

I tested event generation from threads on my Win 10 machine.  With installed 2.7, with non-thread tcl 8.5, the process hangs as described, after 1 key event is sent and received.  So we should document "don't do that".

With installed 64-bit 3.6 with thread-compiled tcl 8.6, I see something completely different.  The process runs for 5 seconds until the stop call.  The two threads alternate sending key events, which are all received in the main thread.  Ditto for built 32-bit debug 3.6 and 3.8.

The only problem is that the first t.join() hangs because of a thread deadlock bug.  t.join() blocks until t.run exits.  t.run does not exit until the last event_generate, with running=False, returns.  But that blocks until dummy_handler runs.  Dummy_handler does not run when the main thread is blocked by t.join.
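
The deadlock described above can be reproduced without Tk. What follows is only a minimal sketch, assuming a plain `queue.Queue` as a stand-in for the interpreter's event queue; `call_in_main` and `pump` are hypothetical names rather than Tkinter APIs, and the timeouts exist only so that the demonstration terminates instead of hanging.

```python
import queue
import threading

# Stand-in for the interpreter's event queue (an assumption, not Tkinter itself).
calls = queue.Queue()


def call_in_main(func):
    """Post func to the 'main thread' and block until it has been run.

    Returns True if the call was serviced, False if the wait timed out.
    The real cross-thread call waits without a timeout.
    """
    done = threading.Event()
    calls.put((func, done))
    return done.wait(timeout=1.0)


def pump():
    """Run every queued call, as one pass of the event loop would."""
    while True:
        try:
            func, done = calls.get_nowait()
        except queue.Empty:
            return
        func()
        done.set()


def worker(results):
    # Plays the role of the last event_generate() issued by the worker.
    results.append(call_in_main(lambda: None))


results = []
t = threading.Thread(target=worker, args=(results,))
t.start()

t.join(timeout=0.2)   # the main thread blocks in join() instead of pumping
stuck = t.is_alive()  # the worker is still waiting for the main thread

pump()                # servicing the queue lets the worker return
t.join()
print(stuck, results)
```

With an untimed `join()` and an untimed wait, neither side could ever make progress, which matches the hang seen in the script.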

I fixed this by not generating events when running is False.  The revised program exits properly.

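    def run(self):
        tid = self.ident
        while True:
            time.sleep(0.02)
            c = random.choice(string.ascii_letters)
            print(running, "%d: sending '%s'" % (tid, c), file=sys.stderr)
            if running:
                # Blocks until the handler has run in the main thread.
                self.target.event_generate(c)
            else:
                # Stop requested: exit without generating another event.
                break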

I suppose there is a teeny possibility that 'running' could be flipped between the test and the call.  Can that be prevented with a lock?

Another possibility is for stop() to change conditions so that 'self.target.event_generate(c)' fails with an exception, and change if/else to for/except.  My first try sort of works in IDLE but not the console.
msg316151 - (view) Author: Ivan Pozdeev (Ivan.Pozdeev) * Date: 2018-05-04 01:09
> Another possibility is for stop() to change conditions so that 'self.target.event_generate(c)' fails with an exception

Could you elaborate? Since there're no docs on event_generate(), I can't look up how to make it "fail with an exception" without actually posting an event.


> The only problem is that the first t.join() hangs because of a thread deadlock bug.  t.join() blocks until t.run exits.  t.run does not exit until the last event_generate, with running=False, returns.  But that blocks until dummy_handler runs.
> I suppose there is a teeny possibility that 'running' could be flipped between the test and the call.  Can that be prevented with a lock?

The idea is to let the worker threads finish their work, not terminate them forcibly.
So the real problem is that stop() blocks the event loop.
It should rather run asynchronously, wait for threads, then trigger `self.root.destroy()` in the main thread... somehow.
msg316152 - (view) Author: Ivan Pozdeev (Ivan.Pozdeev) * Date: 2018-05-04 02:34
Attached a fixed script.

`Tk.after()` works from a worker thread, while `Tk.destroy()` doesn't.

That's because Tkinter implements Tcl calls (_tkinter.c:Tkapp_Call) from another thread by posting an event to the interpreter's queue (Tcl_ThreadQueueEvent) and waiting for result. So a call normally works, but would hang if the interpreter's event loop is not running.

`destroy()`'s Python part (Lib\tkinter\__init__.py:2055) stops the event loop, then makes more Tcl calls -- which hang for the aforementioned reason if made from another thread.
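
The shutdown order this implies can be sketched as follows. This is an illustration under stated assumptions, not the attached script: `shutdown_when_done` and `stop_workers` are hypothetical names, and `root` is taken to be any object with Tk-style `after()` and `destroy()` methods. The worker threads are joined from a separate cleanup thread so the event loop keeps running, and only `after()` is called from outside the main thread.

```python
import threading


def shutdown_when_done(root, workers, stop_workers):
    """Stop the workers, wait for them off the main thread, then destroy root.

    root         -- object with Tk-style after() and destroy() (assumption)
    workers      -- iterable of started threading.Thread objects
    stop_workers -- hypothetical callable that tells the workers to finish,
                    for example the set() method of a threading.Event
    """
    def cleanup():
        stop_workers()
        for t in workers:
            # Safe to block here: this is not the thread running mainloop(),
            # so calls the workers make into the main thread still complete.
            t.join()
        # after() is an ordinary call that is serviced while the loop runs;
        # destroy() itself then executes in the main thread.
        root.after(0, root.destroy)

    t_cleanup = threading.Thread(target=cleanup, name="cleanup")
    t_cleanup.start()
    return t_cleanup
```

A caller would typically schedule this from the main thread, for example `root.after(5000, lambda: shutdown_when_done(root, threads, stop_flag.set))`, and join the returned thread once `mainloop()` has exited. It does not cover the case where the user closes the window by hand.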
msg316225 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2018-05-06 05:56
When I closed the main thread by clicking [x], thus destroying root, both event threads raised instead of hanging.  So my experiment involved calling root.destroy instead of setting running to False.  The better result when running under IDLE might be because IDLE's run process executes tcl.call('update') about 20 times per second.  Even if the exception idea could be made to work, it seems like a kludge.

Waiting on the event threads from a separate non-gui thread and leaving the main thread fully functional and responsive until the gui threads die seems much cleaner.  Perhaps this should be recommended as a standard way to shut down the main thread when there might be active gui threads.  Thank you for following through with this to get a solution we both like.
msg316242 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2018-05-06 19:20
Without thread support, event generation from multiple threads fails immediately.  I tried an experiment with callback scheduling.  It seems to work -- almost.

thread_event.py runs on 2.7 with non-threaded tcl.  It modifies TkinterHanders32.py by replacing
            self.target.event_generate(c)
with
            self.target.after(1, lambda t=self.target: t.event_generate(c))
to schedule the event generation in the main thread.
It also imports either tkinter or Tkinter, and runs for 10 seconds
        self.root.after(10000,self.stop)
for a more rigorous test.

However, when I add 2 0s to the delay, to make it 1000 seconds, the main thread and gui crash sometime sooner (100 seconds, say), leaving the worker threads sending indefinitely.  One time there was a traceback:

Traceback (most recent call last):
  File "F:\dev\tem\thread_event.py", line 55, in <module>
    Main().go()
  File "F:\dev\tem\thread_event.py", line 35, in go
    self.t_cleanup.join()
AttributeError: 'Main' object has no attribute 't_cleanup'

A second time, nothing appeared.

I suspect that, without proper locking, an .after call was eventually interrupted and the data structure holding pending scheduled callbacks was corrupted. Mainloop then exits before t_cleanup has been created.
msg316243 - (view) Author: Ivan Pozdeev (Ivan.Pozdeev) * Date: 2018-05-06 21:20
> Without thread support, event generation from multiple threads fails immediately.

This ticket is for threaded Tcl only, so this is off topic.

In nonthreaded Tcl, this script crashes rather than freezes, for an entirely different reason that I already explained in https://bugs.python.org/issue33257 .

This ticket is solved if you ask me.
The only remaining matter is that there's no documentation:

* on Tkinter threading model: https://docs.python.org/3/library/tk.html claims full thread safety which is untrue.
* on best practices with Tkinter: as you could see, all the more or less obvious solutions are flawed. (That includes my solution: the program doesn't terminate gracefully if you close the window by hand.)

I'm going to cover at least the first item as part of executing Guido's suggestion to "add a warning to the docs".
History
Date | User | Action | Args
2018-05-06 21:20:10 | Ivan.Pozdeev | set | messages: + msg316243
2018-05-06 19:20:11 | terry.reedy | set | messages: + msg316242
2018-05-06 05:56:53 | terry.reedy | set | messages: + msg316225
2018-05-04 03:44:32 | gvanrossum | set | nosy: - gvanrossum
2018-05-04 02:34:40 | Ivan.Pozdeev | set | files: + TkinterHanders32.py; messages: + msg316152
2018-05-04 01:09:30 | Ivan.Pozdeev | set | resolution: wont fix -> not a bug; messages: + msg316151
2018-05-03 19:43:26 | terry.reedy | set | nosy: + terry.reedy, serhiy.storchaka; messages: + msg316132
2018-05-03 03:30:05 | gvanrossum | set | status: open -> closed; resolution: wont fix; messages: + msg316104; stage: resolved
2018-05-03 03:05:58 | Ivan.Pozdeev | set | messages: + msg316102
2018-05-02 23:27:56 | gvanrossum | set | nosy: + gvanrossum; messages: + msg316090
2018-05-02 18:07:47 | Ivan.Pozdeev | set | messages: + msg316083
2018-05-02 17:52:45 | Ivan.Pozdeev | set | files: + trace.py
2018-05-02 17:52:07 | Ivan.Pozdeev | set | files: + trace.zip
2018-05-02 17:51:49 | Ivan.Pozdeev | create |