This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Deadlock in logging
Type: behavior Stage:
Components: Library (Lib) Versions: Python 3.10
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: vinay.sajip Nosy List: DaRoee, ncoghlan, pitrou, rhettinger, vinay.sajip
Priority: normal Keywords:

Created on 2021-04-25 16:58 by DaRoee, last changed 2022-04-11 14:59 by admin.

Files
File name Uploaded Description Edit
reproduction_test.py DaRoee, 2021-04-25 16:58 reproduction test, should run with pytest for proper reproduction
Messages (6)
msg391862 - (view) Author: (DaRoee) * Date: 2021-04-25 16:58
I've recently encountered a weird bug. To make a long story short, I’ve been using an external package to make sure my threads stop working in time, and appearently it’s been using ctypes.pythonapi.PyThreadState_SetAsyncExc.
There is a slight chance, that this code that is literally injecting an exception in the thread, will throw an exception in the handle function of the logger, right after the acquire. This will make the whole process deadlocked (only the thread with the exception will be able to continue logging) once the other threads try to log.

While I totally understand that this is not a casual use case, I think that the logger should be able to handle situations such as this...
The good news is that I’ve created a test (attached) that once you run it with pytest it’ll reproduce constantly (even though it’s a statistical bug), and the solution for this is relatively easy. Once we change the lock to use context manager the whole action is much more atomic, making the test pass constantly.

I’d be happy to help solve it, and replace locks to context manager usage throughout the file (or whatever the maintainers see fit for this).
msg391874 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2021-04-26 02:09
A few thoughts:

* We normally say "all bets are off" when it comes to manipulating Python internals with ctypes.  This issue is just one of many ways to wreak havoc.

* The programming pattern of killing running threads at arbitrary points in their execution is inherently fragile.  I don't think other modules are obliged to try to support it. 

* The pattern of "acquire(); try: something(); finally release();" is common.  Logging is only of many modules in the ecosystem that use this pattern.

* ISTM we can't assume that the lock is a context manager because createLock() is a public method and users can do anything with it.

* That said, we could add our own __enter__ and __exit__ methods to the Handler class.

* Lastly, I have a vague memory that at one point there was a rejected proposal to switch logging to use the with-statement throughout.  We should research that to see if the reasoning is still applicable or not.
msg391886 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2021-04-26 08:53
Feel free to submit a PR to use context managers, though I'm not sure it solves the problem in a guaranteed way.

Nick, what is the official word wrt. interruptibility of context managers?
msg391888 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2021-04-26 08:54
But also, please ask the "external package" that uses PyThreadState_SetAsyncExc to clear up its documentation and mentions the pitfalls.
msg392129 - (view) Author: (DaRoee) * Date: 2021-04-27 20:17
Ha! Already gave them my analysis, and stopped using that package since.
While I certainly think this is not the trivial, basic usage, and that the pattern used for the lock is common (and part of some PEP if I remember correctly), I'm kind of puzzled on that fact that context manager solves it. 
The test will pass constantly once the change is made (even though it looks like RLock __enter__ is just calling the acquire function), and I overall think this is a good change to do in the Handler class, even it's just for self.lock (can we find and understand why the last proposal was rejected?).

I'll submit a PR in a few days, but was wondering on the usage of the lock, and that it doesn't always exists on `self`. Anyway, I assume these are questions that we can continue discussing in the PR if you also think this is a good direction..
msg392158 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2021-04-28 00:32
> I'm kind of puzzled on that fact that context manager solves it. 

It "solves" the problem because of the happenstance rather than because of language guarantees.

The current implementation of the with-statement in CPython is to generate a SETUP_WITH opcode that calls both __enter__() and PyFrame_BlockSetup() within a single opcode while the GIL is held.  And because RLlock.__enter__() is implemented in C there are no pure python steps before the block frame setup.

If we added a pure Python __enter__ and __exit__ to the Handler class (necessary because a user can override createLock), then pure python steps would occur before the frame block setup and the problem would reappear.

If RLock were implemented in pure python, the problem would also reappear.

If the opcodes were changed so that the call to __enter__() were in a different opcode than setting up the block frame, the problem would reappear.


> I'll submit a PR in a few days

Let's wait to hear from Vinay and Nick before deciding to work on a PR.  Since you are no longer using the package, you no longer have the problem to solve.  The programming practice that gave rise to the problem is inherently fragile. The proposed mitigation only helps one module and doesn't solve the problem in general.  As described above, the proposed mitigation is fragile and implementation specific.  Lastly, it overrides an old decision to not use context managers in the logging module for reasons I no longer remember.
History
Date User Action Args
2022-04-11 14:59:44adminsetgithub: 88105
2021-04-28 00:32:18rhettingersetassignee: vinay.sajip
messages: + msg392158
2021-04-27 20:17:00DaRoeesetmessages: + msg392129
2021-04-26 08:54:48pitrousetmessages: + msg391888
2021-04-26 08:53:51pitrousetnosy: + ncoghlan
messages: + msg391886
2021-04-26 02:09:05rhettingersetnosy: + rhettinger, pitrou

messages: + msg391874
versions: - Python 3.6, Python 3.7, Python 3.8, Python 3.9
2021-04-25 22:09:46DaRoeesetnosy: + vinay.sajip
2021-04-25 16:58:46DaRoeecreate