Thread Safe Py_AddPendingCall #48543

kristjanvalur · 2008-11-10T11:36:35Z

BPO	4293
Nosy	@amauryfa, @mdickinson, @pitrou, @kristjanvalur, @tiran
Files	pendingalls.patch pendingalls.patch: version 3 of the patch pendingalls.patch: avoid deadlocks on the main thread only. pendingalls.patch: new complete patch, checks for NULL pending_lock pendingalls.patch: new version that doesn't assume signals on the main thread xmlrpc.patch: Some fixes suggested by Amaury testcapi.patch: test cases for Py_AddPendingCall() testcapi.patch: testing the asynchronous notify, with suggested changes pendingalls.patch: full patch including revised documentation

^{Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.}

Show more details

GitHub fields:

assignee = None
closed_at = <Date 2009-01-20.12:02:07.568>
created_at = <Date 2008-11-10.11:36:34.958>
labels = ['interpreter-core', 'type-crash']
title = 'Thread Safe Py_AddPendingCall'
updated_at = <Date 2009-01-20.12:02:07.567>
user = 'https://github.com/kristjanvalur'

bugs.python.org fields:

activity = <Date 2009-01-20.12:02:07.567>
actor = 'mark.dickinson'
assignee = 'none'
closed = True
closed_date = <Date 2009-01-20.12:02:07.568>
closer = 'mark.dickinson'
components = ['Interpreter Core']
creation = <Date 2008-11-10.11:36:34.958>
creator = 'kristjan.jonsson'
dependencies = []
files = ['11978', '12008', '12031', '12032', '12035', '12594', '12596', '12619', '12624']
hgrepos = []
issue_num = 4293
keywords = ['patch']
message_count = 32.0
messages = ['75691', '75692', '75693', '75700', '75782', '75783', '75872', '75963', '75966', '75970', '75977', '79113', '79119', '79128', '79132', '79142', '79147', '79151', '79152', '79241', '79255', '79269', '79270', '79290', '79504', '79507', '79511', '79658', '79975', '79982', '80091', '80250']
nosy_count = 5.0
nosy_names = ['amaury.forgeotdarc', 'mark.dickinson', 'pitrou', 'kristjan.jonsson', 'christian.heimes']
pr_nums = []
priority = 'normal'
resolution = 'accepted'
stage = 'patch review'
status = 'closed'
superseder = None
type = 'crash'
url = 'https://bugs.python.org/issue4293'
versions = ['Python 3.1', 'Python 2.7']

kristjanvalur · 2008-11-10T11:36:33Z

At CCP We have started using the Py_AddPendingCall() mechanism to
signal python about a completed IO operation.
However, we noticed that the existing mechanism was hoplelessly un-
thread safe. This is bad, since on Windows at least, it is very
convenient to have such callbacks happen on an arbitrary thread from
the system's thread pool.
I submit a thread-safe implementation instead to be used if WITH_THREAD
is defined.
This allows Py_AddPendingCall() to be called from any thread, from any
context, even from a PendingCall callback itself.

tiran · 2008-11-10T11:53:27Z

Your patch sounds like a useful addition. However I can comment on the
details because I haven't dealt with Py_AddPendingCall() yet.

Documentation updates and a unit test (Modules/_testcapimodule.c), that
shows the bug and verifies your patch, are welcome, too.

kristjanvalur · 2008-11-10T11:59:26Z

Good point. I'll make a test using ctypes and _testcapimodule.c to try
to make it as non-platform-specific as possible.

kristjanvalur · 2008-11-10T14:05:08Z

Okay, i add a new patch that includes tests.
I am unsure where I should add documentation for the Py_AddPendingCall
() function. Any suggestions?

amauryfa · 2008-11-12T10:09:05Z

Py_AddPendingCall is used inside signal handlers.
It seems that there is a (tiny) chance that a signal interrupts the
program inside Py_MakePendingCalls, while the pendinglock is held.
Py_AddPendingCall would try to acquire this lock again and deadlock.

kristjanvalur · 2008-11-12T10:13:10Z

Right.
Isn't the best way to avoid it to block signals just while we pop the
queue? I'll see if python has a portable sigaction thing to do that.
Otherwise we'd have to start to mess with a "volatile busy" flag again
and possibly introduce tricky race conditions.

kristjanvalur · 2008-11-14T16:59:45Z

Here is a revised version.
Since there is no portable way to block signals, and no way at all on
windows (apparently) the simplest way is to simply use NOWAIT_LOCK when
adding a new pending call. While this does not guarantee that we are
always able to append a call (we may be unlucky in all our N tries and
there is also no portable way to yield the thread timeslice between
tries) such are also the semantics of the method. Sometimes the buffer
may be full, or we fail to get the lock in time, and a -1 is returned.
It is up to the caller to respond appropriately, perhaps by trying again
later.

kristjanvalur · 2008-11-17T14:52:38Z

Small refinement: Deadlock avoidance is only needed on the main thread
since we only install signal handlers for the main thread.

amauryfa · 2008-11-17T15:54:35Z

Two comments about the patch:

typo in a comment: "...change chat a signal..."
PyThread_acquire_lock should not be called with a NULL lock: on win32
this does not matter (there is a test in thread_nt.h), but Unix
segfaults in thread_pthread.h if you hit Ctrl-C.

kristjanvalur · 2008-11-17T16:19:29Z

I had forgotten that locks aren't initialized until PyEval_InitThreads
() is called. Now we check for this case. Also reinitialize when fork
() is called.
Is there any suggested place where I should add documentation to this?

kristjanvalur · 2008-11-17T19:05:33Z

I turns out that for SIGINT, windows creates a new thread. So, the
assumption that signals are delivered on the main thread are not valid
in general. Therefore, we must initialize the lock pending_lock
independently of PyEval_InitThreads(), and also cannot assume that
signals are deliverd on the main thread.

kristjanvalur · 2009-01-05T09:27:14Z

Ping.
Amaury, any thoughts?

amauryfa · 2009-01-05T10:29:06Z

The things_to_do static variable should be declared as "volatile": it
is read by the main loop without any lock.
(by the way, could you rename it to something like "pendingcalls_to_do"?)
in the old Py_MakePendingCalls function, the "#ifdef WITH_THREAD" part
is not useful now.
the "XXX" comments should probably be rewritten.

A note to people concerned by performance: the additional lock is used
only when there is an actual pending call to process. The main loop only
regularly checks the "things_to_do static" volatile static variable, and
the patch does not change this.

kristjanvalur · 2009-01-05T11:19:51Z

Good points.
Any suggestion where to document the new functionality?

amauryfa · 2009-01-05T12:11:44Z

Well, I could not find these functions documented anywhere :-(
If you want to start to write some, a good place could be in
http://docs.python.org/c-api/init.html (Doc/c-api/init.rst, maybe after
the PyGILState_ functions)

At the very least, a short sentence in whatsnew/2.7.rst would be
appreciated.

kristjanvalur · 2009-01-05T13:44:02Z

Ok, here is a revised, patch.
It now contains rudimentary documentation, which probably needs to be
proofread by the official documentation lords.

pitrou · 2009-01-05T13:59:08Z

Would it be possible to rework your code so as not to define two
versions of Py_AddPendingCall() and Py_MakePendingCalls(), only putting
#ifdef WITH_THREAD sections where it's necessary? Or would it make the
code too quirky?

Also, it would be nice to have a test for this specific feature (e.g.
using a dedicated function in the _testcapi module), but perhaps it
isn't a showstopper either.

kristjanvalur · 2009-01-05T14:21:08Z

Ah, I forgot the test code in my patch. I submit a separate patch to
add that code.
Also, reworking things to have single function definitions is in my
opinion too messy. I started doing things that way, but the code
becomes hard to read and therefore to understand and verify.

pitrou · 2009-01-05T14:56:17Z

A few comments about the test:

it should be a method of TestCAPI; we try to unittest everywhere,
although there is some old code which predates that
the implementation would be stressed better if each callback was added
from a separate thread, rather than adding them all at once; it will
also make the C function in _testcapi simpler (and you can do the sleep
in the Python code in test_capi.py)
if you want the list operation in your callback to be atomic, it
should append() something to the list rather than increment its first
element; then you just have to test for the len() of the list
in _pending_callback(), you must use Py_XDECREF(r), not Py_DECREF,
since r can be NULL

I will look at the ceval part of the patch later :)

kristjanvalur · 2009-01-06T09:31:29Z

Thanks.
What do you mean by making the test a method of TestCAPI? I have found
no such class. Are you talking about the python part or the c part?

Also, atomicity in the callback is not required, as the callback is
never interrupted implicitly. In fact, that would make it quite
useless. I'll add that to the documentation.

As for your other suggestions, they're noted, I'll try having more
threads.

kristjanvalur · 2009-01-06T13:11:14Z

Ok, I have addressed the issues raised about testcapi. A new file has
been submitted.

pitrou · 2009-01-06T16:06:02Z

Ok, the test patch is fine!

Now looking at the main patch, some comments:

you removed "volatile" from pendingfirst and pendinglast, is it really
safe?
what if pending_lock is not initialized when Py_AddPendingCall is
called? I know it's unlikely, but imagine someone calling that function
just after the interpreter has been initialized
the documentation should mention that the GIL must not be held when
Py_AddPendingCall is called, otherwise there can be a deadlock (is it a
change from previous versions by the way?)
you should check the return value of PyThread_allocate_lock() for NULL
is your implementation reentrant? that is, registering a callback from
a callback. There is no need for it to be, but if it aims to be, then it
should perhaps be documented and tested :)

pitrou · 2009-01-06T16:12:48Z

Sorry, forget what I said about the GIL, it looks fine. Perhaps some
comments should be added to clarify that matter, however.

kristjanvalur · 2009-01-06T21:42:16Z

volatile is not required when access is guarded by locks.
pendingcalls_to_do is initialized to 1, so the very first thing that
the interpreter does (after executing one bytecode, I init _PyTicker to
0 now) is to initialize this lock. Creating a lock really must be done
by one well known startup thread only. There is a lot of functions in
the python API that don't work until after python initialization
(without that being explicitly documented), Py_AddPendingCall() is one
of them. Btw, there is no generic "init" function for the ceval.c
module. And PyEval_InitThreads isn't really suitable since we must
have the lock, even if we haven't initialized the threads, to guard us
against reentrancy during signal delivery. For extra safety, I have
added a NULL test in Py_AddPendingCall().
3)Py_AddPendingCall() can be called with our without the GIL.
4)I've added a check in Py_MakePendingCalls(). In PyEval_ReInitThreads
there is no way to do anything sensible in the case of error.
5)Yes it is reentrant. The pending_lock is only held for a short while
while we pop a callback off the queue. However, because even during
that short while a signal can be delivered which causes
Py_AddPendingCall() to be called, the latter function fails if it
doesn't acquire the lock in a reasonable amount of time. This would
cause a signal to be lost, and not a deadlock. This hasn't changed
from the old implementation.

I have also added a flag to make sure that we don't execute pending
calls recursively, (async notifications is so much better) and
explained it better in the docs.

kristjanvalur · 2009-01-09T20:31:52Z

Checked in as revision: 68460

pitrou · 2009-01-09T21:12:19Z

Kristján Valur Jónsson <kristjan@ccpgames.com> added the comment:

Checked in as revision: 68460

Looks like you forgot the unit tests!

kristjanvalur · 2009-01-09T21:36:22Z

I indeed forgot the unittests and docs.
They are now in, in revision: 68461

kristjanvalur · 2009-01-12T09:24:34Z

Added MISC/News entry in revision: 68545

mdickinson · 2009-01-16T20:44:06Z

Kristján,

The r68461 checkin seems to be causing a number of the buildbots to fail,
typically with output like:

test_capi
test test_capi failed -- Traceback (most recent call last):
  File "/home/pybot/buildarea/trunk.klose-debian-ppc/build/Lib/test/test_capi.py", line 57, in 
test_pendingcalls_threaded
    self.pendingcalls_wait(l, n)
  File "/home/pybot/buildarea/trunk.klose-debian-ppc/build/Lib/test/test_capi.py", line 42, in 
pendingcalls_wait
    "timeout waiting for %i callbacks, got %i"%(n, len(l)))
AssertionError: timeout waiting for 32 callbacks, got 23

Please could you look into this?

Also, I don't understand the code:

for i in xrange(1000):
    a = i*i

in pendingcalls_wait(), in test_capi.py. Whatever you're
trying to do here, surely there's a better way?

mdickinson · 2009-01-16T22:31:51Z

How about using a timer instead of the 'count += 1' loop: after starting
the 32 self.pendingcalls_submit threads, set up a threading.Event and
start yet another thread that simply does a time.sleep(5.0) (or whatever)
and then sets that event.

Then your waiting loop could be something like:

while not self.my_event.is_set():
    for i in range(1000):
        a = i*i
self.assertEqual(len(l), n)

There's probably a better way, but that's the best I can come up with this
late on a Friday night...

kristjanvalur · 2009-01-18T10:59:19Z

checked in r68722, let's hope this fixes things.

mdickinson · 2009-01-20T12:02:04Z

That seems to have done the trick. Thanks!

kristjanvalur mannequin added interpreter-core (Objects, Python, Grammar, and Parser dirs) type-crash A hard crash of the interpreter, possibly with a core dump labels Nov 10, 2008

kristjanvalur mannequin closed this as completed Jan 9, 2009

mdickinson reopened this Jan 16, 2009

mdickinson closed this as completed Jan 20, 2009

ezio-melotti transferred this issue from another repository Apr 10, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Thread Safe Py_AddPendingCall #48543

Thread Safe Py_AddPendingCall #48543

kristjanvalur mannequin commented Nov 10, 2008

kristjanvalur mannequin commented Nov 10, 2008

tiran commented Nov 10, 2008

kristjanvalur mannequin commented Nov 10, 2008

kristjanvalur mannequin commented Nov 10, 2008

amauryfa commented Nov 12, 2008

kristjanvalur mannequin commented Nov 12, 2008

kristjanvalur mannequin commented Nov 14, 2008

kristjanvalur mannequin commented Nov 17, 2008

amauryfa commented Nov 17, 2008

kristjanvalur mannequin commented Nov 17, 2008

kristjanvalur mannequin commented Nov 17, 2008

kristjanvalur mannequin commented Jan 5, 2009

amauryfa commented Jan 5, 2009

kristjanvalur mannequin commented Jan 5, 2009

amauryfa commented Jan 5, 2009

kristjanvalur mannequin commented Jan 5, 2009

pitrou commented Jan 5, 2009

kristjanvalur mannequin commented Jan 5, 2009

pitrou commented Jan 5, 2009

kristjanvalur mannequin commented Jan 6, 2009

kristjanvalur mannequin commented Jan 6, 2009

pitrou commented Jan 6, 2009

pitrou commented Jan 6, 2009

kristjanvalur mannequin commented Jan 6, 2009

kristjanvalur mannequin commented Jan 9, 2009

pitrou commented Jan 9, 2009

kristjanvalur mannequin commented Jan 9, 2009

kristjanvalur mannequin commented Jan 12, 2009

mdickinson commented Jan 16, 2009

mdickinson commented Jan 16, 2009

kristjanvalur mannequin commented Jan 18, 2009

mdickinson commented Jan 20, 2009

Thread Safe Py_AddPendingCall #48543

Thread Safe Py_AddPendingCall #48543

Comments

kristjanvalur mannequin commented Nov 10, 2008

kristjanvalur mannequin commented Nov 10, 2008

tiran commented Nov 10, 2008

kristjanvalur mannequin commented Nov 10, 2008

kristjanvalur mannequin commented Nov 10, 2008

amauryfa commented Nov 12, 2008

kristjanvalur mannequin commented Nov 12, 2008

kristjanvalur mannequin commented Nov 14, 2008

kristjanvalur mannequin commented Nov 17, 2008

amauryfa commented Nov 17, 2008

kristjanvalur mannequin commented Nov 17, 2008

kristjanvalur mannequin commented Nov 17, 2008

kristjanvalur mannequin commented Jan 5, 2009

amauryfa commented Jan 5, 2009

kristjanvalur mannequin commented Jan 5, 2009

amauryfa commented Jan 5, 2009

kristjanvalur mannequin commented Jan 5, 2009

pitrou commented Jan 5, 2009

kristjanvalur mannequin commented Jan 5, 2009

pitrou commented Jan 5, 2009

kristjanvalur mannequin commented Jan 6, 2009

kristjanvalur mannequin commented Jan 6, 2009

pitrou commented Jan 6, 2009

pitrou commented Jan 6, 2009

kristjanvalur mannequin commented Jan 6, 2009

kristjanvalur mannequin commented Jan 9, 2009

pitrou commented Jan 9, 2009

kristjanvalur mannequin commented Jan 9, 2009

kristjanvalur mannequin commented Jan 12, 2009

mdickinson commented Jan 16, 2009

mdickinson commented Jan 16, 2009

kristjanvalur mannequin commented Jan 18, 2009

mdickinson commented Jan 20, 2009