classification
Title: Fix emulated lock to be 'fair'
Type: behavior Stage:
Components: Interpreter Core, macOS Versions: Python 3.2
process
Status: closed Resolution: rejected
Dependencies: Superseder:
Assigned To: Nosy List: dabeaz, kristjan.jonsson, loewis, navytux, pitrou, ronaldoussoren, ysj.ray
Priority: normal Keywords: needs review, patch

Created on 2010-04-15 20:00 by kristjan.jonsson, last changed 2019-09-11 16:42 by navytux. This issue is now closed.

Files
File name Uploaded Description Edit
pthread_lock.patch kristjan.jonsson, 2010-04-15 20:00 review
Messages (15)
msg103247 - (view) Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer) Date: 2010-04-15 20:00
On pthreads plaforms, if the posix_sem functions aren't available (when _POSIX_SEMAPHORES isn't defined), the python lock is implemented with a mutex and a condition variable.  This appears to be the case on Mac OS X, for example.
The problem is that this lock does not provide fairness to threads trying to acquire it.  This makes it a very poor candidate to implement the GIL, which is a lock that is held all the time, and all the threads are in constant competition to claim.
Other implementations of the python lock, such as the posix_sem based ones (linux) and Event based (MS Windows) provide fairness through that underlying synchronization primitive.

This patch fixes the "emulated semaphore" by making all competing threads wait their turn in the condition variable, thus enjoying whatever fairness that primitive provides.

I've not tested this patch for compilation since I don't have access to a mac to do that.  But the mechanism has been tested successfully on windows.

See also issue 8299, where this has been discussed at length.
msg103249 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2010-04-15 20:32
> On pthreads plaforms, if the posix_sem functions aren't available
> (when _POSIX_SEMAPHORES isn't defined), the python lock is
> implemented with a mutex and a condition variable.  This appears to
> be the case on Mac OS X, for example. The problem is that this lock
> does not provide fairness to threads trying to acquire it. 

Why do you think condition variables don't provide fairness?
msg103271 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-04-15 23:09
I don't understand why you claim your patched version is fair. As far as I can tell, if you have three threads A, B and C all routinely trying to take this lock, your implementation can perfectly well bounce between A and B without ever giving the lock to C.
msg103313 - (view) Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer) Date: 2010-04-16 11:44
Martin, it isn't the condition variable which is unfair, but how the constructed locking apparatus is unfair that is the problem.  This lock is written by a Tim, whom I assume to be Tim Peters.  I quote his comment:  
"In general, if the bit can be acquired instantly, it is, else the pair is used to block the thread until the bit is cleared.     9 May 1994 tim@ksr.com"

Herein lies the problem.  This is the behaviour of "greedy" or "unfair" mutexes, not that of "fair" semaphores.  The lock is made 'free' and the just signaled thread has to _race_ to acquire it.

Since the exact mechanics seem to be unclair to many, let me just step you through the series of events.
1) A has the lock, B is waiting for it.  the bit is "set".
2) A releases the lock:  Clears the bit, signals the condition variable.
3) A tries to immediately reacquire the lock.  Sees the bit cleared, sets it, and claims the lock.
4) B wakes up from the condition variable.  Sees the bit "set" and goes back to sleep.  It has lost the race to A.

If the lock were based on a semaphore, the events would be different:
1) A has the semaphore, B is waiting for it, sem.value == 0
2) A releases (signals) the semaphore.  B is made runnable. sem.value stays at 0
3) A tries to immediately reacquire the lock.  Finds the sem.value at 0 and blocks.
4) B wakes up and executes, now the owner of the lock. sem.value stays at 0.

This particular patch implements the latter behaviour by explicitly entering a "handoff" period.  If you want, I can submit a different patch which emulates a semaphore perfectly.  perhaps that is easier to understand, since semaphores are very familiar to people.

The key difference between Tim's lock is that the semaphore "hands off" ownership to a particular waiting thread.  The semaphore doesn't enter a state of "free" so that thread have to race to lock it.  It is this race which is unfair and which the just-releasing lock almost always wins.

If you are asking "why would we want an unfair lock, then?", see issue 8299 where I point out some links that discuss unfair locks and their role in combating lock convoys.


Antoine, if we have A, B, and C, all competing for the lock, at any one point, only one of the three has the lock.  Say, A.  The others are waiting on the condition variable.
Condition variables are generally implemented in a "fair" manner, so that all threads that "wait" on it are woken up in a roughly FIFO manner.  If not exactly, then at least in the long run.  All of the threads have to enter the condition variable's wait queue to acquire the lock.  Because of this, the lock is handed off to the threads in roughly the same order that they enter the condition variable´s wait state.

If, in your example, C is waiting on the condition variable, then A and B, whenever they give up the lock, they will hand it off a single thread which is woken up, typcally the one at the head of the condition variable's internal queue.  If the condition variable is implemented properly, there is no way that A and B can just flip-flop without C never being the thread to be woken up next.

As so often, the proof is in the pudding.  Try your ccprof.py script with the do_yield turned off.
You can also implement an explicitly FIFO condition variable, as I did in issue 8299, if you don't trust the condition variable's own internal mechanism to treat its waiting threads fairly.
msg103532 - (view) Author: Ronald Oussoren (ronaldoussoren) * (Python committer) Date: 2010-04-18 21:04
W.r.t. "This appears to be the case on Mac OS X": OSX 10.4 and later seem to support posix semaphores, the program below prints "yes":

#include <unistd.h>

int main()
{
#ifdef _POSIX_SEMAPHORES
        printf("yes\n");
#else
        printf("no\n");
#endif
}
msg103536 - (view) Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer) Date: 2010-04-18 22:01
Is it possible that unistd.h isn't included by Python on mac builds? perhaps the config script is broken and HAVE_UNISTD_H doesn't get defined.  I'll have a look at the generated pyconfig.h file on my colleague's machine tomorrow.
msg103574 - (view) Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer) Date: 2010-04-19 10:46
Also, _POSIX_SEMAPHORES must be defined to be greater than 200112L.  If it isn't, then it isn't supported.
msg103580 - (view) Author: ysj.ray (ysj.ray) Date: 2010-04-19 11:40
Hi, 

krisvale:
> Since the exact mechanics seem to be unclair to many, let me just step you through the series of events.
> 1) A has the lock, B is waiting for it.  the bit is "set".
> 2) A releases the lock:  Clears the bit, signals the condition variable.
> 3) A tries to immediately reacquire the lock.  Sees the bit cleared, sets it, and claims the lock.
> 4) B wakes up from the condition variable.  Sees the bit "set" and goes back to sleep.  It has lost the race to A.

I don't quite understand the four steps you explained. After the time of step 2, B is going to waken up and acquire the lock, and at the same time A returns from release function and is going to reacquire the lock. Who is scheduled first after A signals the condition variable is not predictable. So why does A always acquire the lock?

Thanks!
msg103582 - (view) Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer) Date: 2010-04-19 11:48
In 2), B is indeed signaled and the OS makes it "runnable".  But it doesn´t run immediately. A is still running.  There is no need for A to stop running until it runs out of timeslice.  Meanwhile the OS has to put B on a separate core (which makes this problem more significant on multicore), putting it at the end of the runnable queue where it has to percolate to the top for it to actually start execution some time later.

In effect, B has has to 'race' with A (and any other threads) to get the lock and since A is already running, it is likely to win the race.

Having threads 'race' to get synchronization primitives is a valid technique to reduce lock convoying problems, but it also can cause thread starvation, and is not approppriate when you want to ensure fair use of the resource.  To ensure fairness, lock "handoff" must be used.
msg103820 - (view) Author: David Beazley (dabeaz) Date: 2010-04-21 11:26
I know that multicore processors are all the rage right now, but one thing that concerns me about this patch is its effect on single-core systems.  If you apply this on a single-CPU, are threads just going to sit there and thrash as they rapidly context switch? (Something that does not occur now).

Also, I've done a few experiments and on a single-core Windows-XP machine, the GIL does not appear to have any kind of fairness to it (as previously claimed here).   Yet, if I run the same experiments on a dual-core PC, the GIL is suddenly fair.  So, somewhere in that lock implementation, it seems to adapt to the environment.  Do we have to try an emulate that behavior in Unix?   If so, how do you do it without it turning into a huge coding mess? 

I'll just mention that the extra context-switching introduced by fair-locking has a rather pronounced effect on performance that should be considered even on multicore.  I posted some benchmarks in Issue 8299 for Linux and OS-X.  In those benchmarks, the introduction of fair GIL locking makes CPU-bound threads run about 2-5 times slower than before on Linux and OS-X.
msg111360 - (view) Author: Ronald Oussoren (ronaldoussoren) * (Python committer) Date: 2010-07-23 16:38
It turns out that posix semaphores aren't supported on OSX.

The patch doesn't apply cleanly anymore, and that is not just because of whitespace issues (the patch contains tabs while the tree no longer does).  

The chunk that affects 'PyThread_acquire_lock_timed' doesn't apply even with 'patch -l' (which ignores whitespace).

I'll try to update the patch, but I'm not that well versed in pthread so that might take a while.
msg215512 - (view) Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer) Date: 2014-04-04 11:57
Closing this issue.
It is largely superseded.  For our Python 2.7 branches, we have a custom "GIL" lock which can have different inherent semantics from the common "Lock".  In particular, we can implement a "fair" PyGIL_Handoff() function to be used to yield the GIL to a waiting thread.
msg351831 - (view) Author: Kirill Smelkov (navytux) * Date: 2019-09-11 11:36
At least condition variable signalling has to be moved to be done under mutex for correctness: https://bugs.python.org/issue38106.
msg351947 - (view) Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer) Date: 2019-09-11 15:34
super, good catch!
msg351973 - (view) Author: Kirill Smelkov (navytux) * Date: 2019-09-11 16:42
Thanks for feedback.
History
Date User Action Args
2019-09-11 16:42:28navytuxsetmessages: + msg351973
2019-09-11 15:34:12kristjan.jonssonsetmessages: + msg351947
2019-09-11 11:36:58navytuxsetnosy: + navytux
messages: + msg351831
2014-04-04 11:57:51kristjan.jonssonsetresolution: rejected
2014-04-04 11:57:36kristjan.jonssonsetstatus: open -> closed

messages: + msg215512
2013-07-06 08:50:56ronaldoussorensetassignee: ronaldoussoren ->
2010-07-23 16:38:01ronaldoussorensetkeywords: patch, patch, needs review

messages: + msg111360
2010-04-21 11:26:46dabeazsetnosy: + dabeaz
messages: + msg103820
2010-04-19 11:48:49kristjan.jonssonsetkeywords: patch, patch, needs review

messages: + msg103582
2010-04-19 11:40:34ysj.raysetnosy: + ysj.ray
messages: + msg103580
2010-04-19 10:46:11kristjan.jonssonsetkeywords: patch, patch, needs review

messages: + msg103574
2010-04-18 22:01:16kristjan.jonssonsetkeywords: patch, patch, needs review

messages: + msg103536
2010-04-18 21:04:46ronaldoussorensetkeywords: patch, patch, needs review

messages: + msg103532
2010-04-16 11:44:04kristjan.jonssonsetkeywords: patch, patch, needs review

messages: + msg103313
2010-04-15 23:09:29pitrousetpriority: normal

type: behavior
versions: + Python 3.2, - Python 2.7
keywords: patch, patch, needs review
nosy: + pitrou

messages: + msg103271
2010-04-15 20:32:58loewissetnosy: + loewis
messages: + msg103249
2010-04-15 20:00:44kristjan.jonssoncreate