New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Race conditions in Tkinter with non-threaded Tcl #77438
Comments
(Marked only 2.7 as affected but this would affect any branch if built with nonthreaded Tcl.) When running the attached TkinterCrash2-2.py repeatedly with 2.7.14 and 2.7 head, win7 x64, two kinds of errors pop up randomly:
Exception in thread Thread-14:
Traceback (most recent call last):
File "C:\Users\Sasha\Documents\cpython\lib\threading.py", line 801, in __boots
trap_inner
self.run()
File "../tkt/TkinterCrash2-2.py", line 50, in run
self.deliverToqueue((self.target, z, y))
File "../tkt/TkinterCrash2-2.py", line 133, in arrival_122
new_yz[1])
File "C:\Users\Sasha\Documents\cpython\lib\lib-tk\Tkinter.py", line 2328, in create_line
return self._create('line', args, kw)
File "C:\Users\Sasha\Documents\cpython\lib\lib-tk\Tkinter.py", line 2310, in _create
*(args + self._options(cnf, kw))))
ValueError: invalid literal for int() with base 10: 'None' Running the same code with the minimal required changes (attached as TkinterCrash3-2-2.py) under 3.6 (same platform) goes without any errors. Diagnostics showed:
I've fixed this, will file the PR shortly. Holding the lock when calling Tkapp_CallDeallocArgs in Tkapp_Call eliminated the crashes and freezes, locking objects' creation in the same fn decreased the number of exceptions greatly, and locking creation in SetVar, too, eliminated them completely. I did not check if objects created with AsObj in SetVar need to be disposed of like in Tkapp_Call. Also have no idea how to autotest the fix. If someone does, I'm all ears. |
My best idea for a test as of now is to run the example ~20 times in a loop (or continuously for a comparable amount of time & threads) and catch and register any exceptions in all threads as per https://stackoverflow.com/questions/18349614/check-if-a-python-thread-threw-an-exception . |
I believe I have seen this exact error message within the last few months, and I found one example on SO. https://stackoverflow.com/questions/21209124/weird-error-that-comes-up-during-python-program-run |
@terry.reedy Apparently, this bug has gone unnoticed for years (to be precise, since b5bfb9f in 2002), |
Wait a second... I think I noticed another similar bug. In Tkapp_CallProc. Same case, Tkapp_CallArgs and Tkapp_CallDeallocArgs are called when not holding both locks. The attached example doesn't use Python event handlers, so it didn't manifest itself in my tests. |
Attached reproducing example for event handlers. If launching more than one EventThread, it abort()'s immediately. |
Finally dug up the reason but unsure how to fix it. Here's the trace. The format is `$TID: $FUNCTION [some of the args]' 0x1A34: Tkapp_Call(_object *, _object *) A second PythonCmd is started when the 1st one is still in progress, then the 1st one returns, the Tcl interpreter tries to unwind the topmost stack frame, and it's the wrong one. Denying other PythonCmd is out of question (a nested cmd deadlocks). Waiting for them to finish, either (would wait for potentially unlimited time). Look like the only way is to rearrange Tcl stack frames so that the right one is on top. (Since Tkinter's "interpreter calls" (sequences of Tcl calls done while holding the lock) are supposed to be independent from one another, it doesn't really matter which order the frames are in.) I seriously doubt that's possible with just the public interface though. |
Scratch that. Tkinter allows to execute entire scripts where the order of the frames is important. |
Here are the possible courses of action to fix this. All require design changes, thus can be shot down on a whim. So I'll ask for advice and approval at python-dev first, to be sure that my efforts won't be scrapped. First, some terminology I'll be using:
Now, the fix options proper:
Rejected options:
|
So this issue doesn't occur when linked with a thread-aware Tcl/Tk, right? Maybe we should just make sure that's the only configuration we ensure? |
This would break compatibility, including some usage patterns (see the |
I reproduced the error on installed 2.7.14, running tk 8.5.18. I ran TkinterCrash3-2-2.py 15 times on 64-bit Win10, installed x64 3.6.5, with tk 8.6.8, and experienced no crashes. I ran it another 10 times with 32-bit repository 3.8 debug and no crashes. My conclusion: this is at least partly a tcl/tk issue fixed in current tcl/tk. "this would affect any branch if built with nonthreaded Tcl." If python is delivered (on Windows and Mac) with 'threaded tcl' and rebuilding with 'nonthreaded tcl' causes a problem with tkinter, then one should not do that. In practice, essentially no one does so. Serhiy, do you know what type of tcl/tk is distributed on Linux? |
As I reported on the pydev thread "bpo-33257: seeking advice & approval on the course of action", Firefox cannot find the string 'thread' in the tkinter chapter of the official doc. Nor in the tkinter.ttk chapter. IDLE does not find it in tkinter.__init__. So the claim of 'thread-safe' seems not to be 'official'. As I understand 'crash', arrival_122, which calls Canvas.create_line, get called in each thread by Track.run. On the other pydev thread, Mac expert Ronald Oussorn said "at least on macOS calling GUI methods in Apple’s libraries from secondary threads is unsafe unless those methods are explicitly documented as thread-safe." Maybe we should officially say that tkinter/tk/os gui calls from secondary threads is chancey, even while we try to make it less so. Ivan, as I also said there, it is not clear to me, given your subsequent comments, what you consider to be the status of the PR. |
I wrote in the initial message that this bug only happens with nonthreaded Tcl, regardless of Python version and Tcl version. I've built Py2 with threaded Tcl 8.5.19.0 just to make sure, and indeed, it went away. |
Rerunning 2.7, I sometimes get the following almost immediately without clicking [launch]. Exception in thread Thread-1:
Traceback (most recent call last):
File "C:\Programs\Python27\lib\threading.py", line 801, in __bootstrap_inner
self.run()
File "f:/python/a/tem.py", line 50, in run
self.deliverToqueue((self.target, z, y))
File "f:/python/a/tem.py", line 133, in arrival_122
new_yz[1])
File "C:\Programs\Python27\lib\lib-tk\Tkinter.py", line 2322, in create_line
return self._create('line', args, kw)
File "C:\Programs\Python27\lib\lib-tk\Tkinter.py", line 2310, in _create
*(args + self._options(cnf, kw))))
TclError: bad option "97380032LrunLocal": must be addtag, bbox, bind, canvasx, canvasy, cget, configure, coords, create, dchars, delete, dtag, find, focus, gettags, icursor, index, insert, itemcget, itemconfigure, lower, move, postscript, raise, scale, scan, select, type, xview, or yview An automated test would need to fail more consistently, without needing to click. Running faster would also be good (and perhaps increase failure rate). The the test would be that the file runs in subprocess without an error (perhaps multiple times). There may be a test.support function for this. The TkinterCrash files currently have way too many comments for a non-beginner audience. The t |
The PR fixes the problem exposed by TkinterCrash2-2.py and TkinterCrash3-2-2.py and only lacks an autotest (I asked for any ideas for it and got no response). Then I found another related bug (https://bugs.python.org/issue33257#msg315286 ) and though to fix it as well -- but it turned out to be deeper than expected. I'd be fine to commit the current fix and work on the second bug separately. |
You did not say 'never with threaded tcl'. You also claimed that tkinter is so broken, with either threaded or non-threaded tcl/tk, that tkinter should be trashed. Anyway, I like to test things for myself. If 'never with threaded tcl' is true, then the crashes I observed indicate that CPython's current 2.7 windows installer is installing non-threaded 8.5.18. (I have no idea if or how I could check.) It would seem then, that we should change to threaded 8.5.18 before 2.7.16. (2.7.15 was just released). I am surprised since the Windows installers were in Martin Loewis' charge until sometime after 2.7 was released, and he is the one who claimed that tkinter should be thread-safe. |
Is threaded tcl just a compile switch, as with CPython? |
Thanks for the ideas. I'm on it. Was thinking of just wrapping the logic with unittest machinery but for a subprocess, it's easier to test stderr for exceptions.
https://bugs.python.org/issue33412
You can check this by seeing that the Tcl/Tk DLLs lack the 't' suffix.
|
Yes. In tcl.vcxproj, tk.vcxproj and tix.vcxproj, there are parameters named like TclOpts that are passed as OPTS arg to the lib's makefiles. For Tcl 8.5, you need to pass "threads" to build with threads; for 8.6, "nothreads" to build without. Also, in tcltk.props, the resulting DLL/LIB names are hardcoded with or without the 't' suffix. You'll need to fix that if you wish to switch the Tcl flavor. |
... Finally, since https://bugs.python.org/issue30916, the 3.x Windows build uses a binary external and needs even more patching to build and link against a debug and/or custom Tcl/Tk (too long to describe, can give a patch). |
This issue is effectively a re-opening and continuation of bpo-11077. TkinterCrash2(3)-2.py are altered versions of Serhiy's file of the same name on the old issue. (Ivan, you really should have said all this at the beginning of this issue.) Serhiy, you wrote "If there no other errors in TkinterCrash2-2.py, this issue can be closed." When I ran your file on Win10 with 2.7.15 multiple times, I got messages like one or more of the following about half the runs. Exception in thread Thread-1:
Traceback (most recent call last):
File "C:\Programs\Python27\lib\threading.py", line 801, in __bootstrap_inner
self.run()
File "f:/dev/tem/thread_event27.py", line 50, in run
self.deliverToQueue((self.target, z, y))
File "f:/dev/tem/thread_event27.py", line 133, in arrival_122
new_yz[1])
File "C:\Programs\Python27\lib\lib-tk\Tkinter.py", line 2322, in create_line
return self._create('line', args, kw)
File "C:\Programs\Python27\lib\lib-tk\Tkinter.py", line 2310, in _create
*(args + self._options(cnf, kw))))
TclError: unknown or ambiguous item type "87414752LtickHook" Exception in Tkinter callback This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.
Traceback (most recent call last):
File "C:\Programs\Python27\lib\lib-tk\Tkinter.py", line 1541, in __call__
return self.func(*args)
TypeError: callit() takes no arguments (6 given) |
7 years and counting... |
The 1st PR is ready. |
2nd fix ready. @DevTeam The issue is fixed completely now, the fix can be merged. Used the option "Hold the Tcl lock for the entire duration of a PythonCmd. To allow "downstream" calls, make the Tcl lock reentrant." from https://bugs.python.org/issue33257#msg316087 as it's the cleanest fix. @scott M, you can help here by doing a code review. |
Found yet another race condition -- when creating multiple Tk()s. Attached example (loops indefinitely inside Tcl: some internal list becomes circular for some reason). The obvious fix by wrapping Tcl calls with locks in Tkapp_New doesn't fix it -- must be something else. Since creatng multiple Tk's is an uncommon use case, this can wait. |
Serhiy, if you now think that we should try to fix _tkinter to actually support non-thread tcl builds, please review the two PRs on this issue. |
Serhiy, Given that 8.6 is compiled with thread support by default and that we (and most everybody) keeps the default and that a year ago only 1 *nix distribution was known to only supply tcl/tk 8.5, should be close this? |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: