classification
Title: Add threading.Barrier
Type: enhancement Stage: resolved
Components: Library (Lib) Versions: Python 3.2
process
Status: closed Resolution: accepted
Dependencies: 10218 Superseder:
Assigned To: Nosy List: georg.brandl, giampaolo.rodola, jyasskin, kristjan.jonsson
Priority: normal Keywords: patch

Created on 2010-05-20 17:07 by kristjan.jonsson, last changed 2011-05-26 16:46 by stutzbach. This issue is now closed.

Files
File name Uploaded Description
barrier.patch kristjan.jonsson, 2010-05-20 17:07
barrier3.patch kristjan.jonsson, 2010-10-27 01:06 Patch for py3k
barrier4.patch kristjan.jonsson, 2010-10-28 08:47
barrier4.patch kristjan.jonsson, 2010-10-28 09:28
Messages (15)
msg106167 - (view) Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer) Date: 2010-05-20 17:07
The "barrier" synchronization primitive is often very useful.  It is simpler to use than an Event, for example, when waiting for threads to start up, or to finish.
The patch contains a simple barrier implementation based on a Condition variable, for your perusal.

See http://en.wikipedia.org/wiki/Barrier_(computer_science) for info.

This particular implementation contains an important feature:  The ability to adjust the 'count' of the barrier.  This is useful in case a thread dies for some reason, to avoid a deadlock for the other threads.

There is still no documentation, since this is only a proposal, but there is a unittest.
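For readers who have not used a barrier, the start-up case described above can be sketched as follows. This uses `threading.Barrier` as it later shipped in Python 3.2, not the API of the patch attached to this message, which at this point still differed. The names `N_WORKERS`, `started`, and `worker` are made up for the illustration.

```python
import threading

N_WORKERS = 3
# One extra party for the main thread, which waits for the workers to start.
started = threading.Barrier(N_WORKERS + 1)
results = []
results_lock = threading.Lock()

def worker(n):
    # Block until every worker and the main thread have arrived.
    started.wait()
    with results_lock:
        results.append(n * n)

threads = [threading.Thread(target=worker, args=(i,)) for i in range(N_WORKERS)]
for t in threads:
    t.start()

# Returns only once all workers have reached their own wait() call.
started.wait()
for t in threads:
    t.join()

print(sorted(results))
```

Doing the same with an Event would need either one Event per worker or a separate counter, which is the bookkeeping the barrier hides.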
msg106375 - (view) Author: Jeffrey Yasskin (jyasskin) * (Python committer) Date: 2010-05-24 17:30
You should probably mention that pthread_barrier and java.util.concurrent.CyclicBarrier are prior art for this. I'm thinking about them when looking at the API to see whether your differences make sense.

"enter" seems to be the wrong term for this, since there's no matching "exit" call. "wait" or "block" seem better.

Both pthread_barrier and CyclicBarrier provide a way to identify a unique thread from each group. pthread_barrier_wait returns true in exactly one thread, while CyclicBarrier runs a callback while all other threads are still paused. I'd be inclined to use CyclicBarrier's interface here, although then you have to define what happens when the action raises an exception.
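As a rough illustration of the CyclicBarrier-style callback, here is how it looks with the `action` parameter of `threading.Barrier` as eventually released in Python 3.2. The patch under review at this point did not have it, and `on_trip`, `worker`, and `PARTIES` are names invented for the sketch.

```python
import threading

PARTIES = 4
action_calls = []

def on_trip():
    # Run by exactly one of the waiting threads, before any of them is released.
    action_calls.append(threading.current_thread().name)

barrier = threading.Barrier(PARTIES, action=on_trip)

def worker():
    barrier.wait()

threads = [threading.Thread(target=worker, name=f"w{i}") for i in range(PARTIES)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(action_calls))  # the action ran once for the whole group
```

In the released module, the question of an action that raises is settled by putting the barrier into the broken state.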

_release should notify_all after its loop.

adjust_count makes me nervous. Is there a context manager interface that would make this cleaner/safer?
msg106446 - (view) Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer) Date: 2010-05-25 16:45
I'll provide a new version shortly, targeted for the py3k branch.
msg119454 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2010-10-23 17:36
Ping -- is this something you want in 3.2, Kristjan?
msg119500 - (view) Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer) Date: 2010-10-24 08:58
Hi, I had forgotten about this.
I went back to the drawing board and had almost completed a new version.  Looking at the Java barrier shows how one can go overboard with stuff.  My thought with the barrier is to provide a simple synchronization primitive that works well, for example in the unittests, without trying too hard to over-design it in terms of failure modes.

Anyway, I'll get my act together.  Same with RWLock, that is almost ready.
msg119667 - (view) Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer) Date: 2010-10-27 01:06
Okay, here is a new submission.
I've redesigned it to be more reminiscent of the Java version, by allowing the barrier to have a "Broken" state and raising a BrokenBarrierError.
I've also redesigned the mechanism from a simple perpetually increasing index of "entered" and "released" into a proper two-state machine which is either "filling" or "draining".
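The filling/draining idea can be sketched roughly like this. It is a toy written for this illustration, not the code in barrier3.patch: the class name `TwoPhaseBarrier` and its attributes are hypothetical, and the broken state and timeouts are left out entirely.

```python
import threading

class TwoPhaseBarrier:
    """Toy barrier: 'filling' while threads arrive, 'draining' while they leave."""

    def __init__(self, parties):
        self._cond = threading.Condition()
        self._parties = parties
        self._count = 0
        self._state = "filling"

    def wait(self):
        with self._cond:
            # Threads of the next cycle must not enter while the
            # previous cycle is still draining.
            while self._state == "draining":
                self._cond.wait()
            index = self._count
            self._count += 1
            if self._count == self._parties:
                # Last arrival: switch to draining and wake everyone.
                self._state = "draining"
                self._cond.notify_all()
            else:
                while self._state == "filling":
                    self._cond.wait()
            # Leaving: the last thread out reopens the barrier.
            self._count -= 1
            if self._count == 0:
                self._state = "filling"
                self._cond.notify_all()
            return index

barrier = TwoPhaseBarrier(3)
seen = []
seen_lock = threading.Lock()

def worker():
    # Reuse the same barrier for several cycles.
    for _ in range(5):
        i = barrier.wait()
        with seen_lock:
            seen.append(i)

threads = [threading.Thread(target=worker) for _ in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Compared with ever-increasing "entered" and "released" counters, the explicit state makes reuse easy to reason about: a fast thread that loops around cannot slip into a cycle that has not finished emptying.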

There is also a rather comprehensive set of tests.

What is missing is documentation, something I shall add if this gets a positive response.

Note how, in the tests, I sometimes create a "barrier2" object to facilitate external synchronization.  This demonstrates the simplicity of using this primitive.

Another note:  In order to implement "timeout" behaviour, I changed Condition.wait() to return True in case it returns due to a timeout occurring.  I folded this into this patch, but if such a change is not accepted, or we want it separately, then I'll have to remove the timeout functionality from the Barrier.  I don't want to have complicated logic in there to measure time.  Also, I do think that locking primitives that time out should be able to provide an indication of that fact to their callers, so Condition.wait() really should do that.
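A small sketch of how the timeout shows up to callers in released Python (3.2 and later). Note that the polarity that shipped is the reverse of what this message describes: `Condition.wait()` returns False when the timeout expires and True otherwise. A timed-out `Barrier.wait()` breaks the barrier and raises `BrokenBarrierError`. The variable names are illustrative only.

```python
import threading

cond = threading.Condition()
with cond:
    # Nobody ever notifies, so this returns only because the timeout expires.
    notified = cond.wait(timeout=0.05)

barrier = threading.Barrier(2)
try:
    # Only one of the two parties ever arrives, so the wait times out.
    barrier.wait(timeout=0.05)
    timed_out = False
except threading.BrokenBarrierError:
    timed_out = True

print(notified, timed_out, barrier.broken)
```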
msg119756 - (view) Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer) Date: 2010-10-28 06:26
ping?
msg119759 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2010-10-28 06:43
The tests pass for me, and the patch looks good except for a stray change to Condition objects.
msg119760 - (view) Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer) Date: 2010-10-28 06:54
Right.  The condition object change is necessary to have timeout work.  I can remove that feature and slate it for another day, then add a separate patch for a Condition.wait() return value.  All of the other APIs are able to let the caller know whether a timeout occurred or not, and I think Condition.wait() should do the same.

Actually, I can fudge the timeout with time.clock(), which is good enough.

I'll write up some docs.
msg119761 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2010-10-28 06:57
Well, that change would be fine by me, it was just not explained anywhere in the patch.  So if it's going to be documented (with versionchanged etc.), just leave it in.
msg119765 - (view) Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer) Date: 2010-10-28 08:47
Here is an updated patch.  It contains documentation.
reStructuredText isn't my forte, and I don't know how to verify that it is correct, so please review it for me.
msg119766 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2010-10-28 09:00
Two comments:

* The return value of wait() isn't documented well.  What is the significance of the returned index, i.e. what distinguishes it from a randomly selected one in range(parties)?

* get_parties() and is_broken() should be properties (parties, broken), to be consistent with other threading APIs (Thread.name etc). get_waiting() does "real" work (can it block?) and should remain a method.

Don't worry about markup errors, I review doc changes routinely after they are committed.

msg119768 - (view) Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer) Date: 2010-10-28 09:28
Right.
I've provided more text for the return value and provided an example.
I've changed all three to properties; the locking wasn't really required for waiting().
I added some extra tests for the properties.
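For reference, a short sketch of the return index and the three properties, using the names found in the released `threading` module (`parties`, `n_waiting`, `broken`). The patch under discussion may have spelled them differently, and `worker`, `indices`, and `chosen` are made up for the example.

```python
import threading

barrier = threading.Barrier(3)
indices = []
chosen = []
lock = threading.Lock()

def worker():
    # Each thread of a cycle gets a distinct integer in range(barrier.parties).
    i = barrier.wait()
    with lock:
        indices.append(i)
        if i == 0:
            # Exactly one thread sees index 0, so it can do once-only work.
            chosen.append(threading.current_thread().name)

threads = [threading.Thread(target=worker) for _ in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(indices), barrier.parties, barrier.n_waiting, barrier.broken)
```

The index carries no ordering guarantee that callers should rely on. Its use is that it is unique within the group, which gives a cheap way to pick one thread for housekeeping.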
msg119769 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2010-10-28 09:36
Looks good to me now, I think you can commit it.
msg119770 - (view) Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer) Date: 2010-10-28 09:45
Committed as revision 85878
History
Date User Action Args
2011-05-26 16:46:31 stutzbach set stage: resolved
2010-10-28 09:45:05 kristjan.jonsson set status: open -> closed; resolution: accepted; messages: + msg119770
2010-10-28 09:36:21 georg.brandl set messages: + msg119769
2010-10-28 09:28:13 kristjan.jonsson set files: + barrier4.patch; messages: + msg119768
2010-10-28 09:00:50 georg.brandl set messages: + msg119766
2010-10-28 08:47:53 kristjan.jonsson set files: + barrier4.patch; dependencies: + Add a return value to threading.Condition.wait(); messages: + msg119765
2010-10-28 06:57:24 georg.brandl set messages: + msg119761
2010-10-28 06:54:49 kristjan.jonsson set messages: + msg119760
2010-10-28 06:43:40 georg.brandl set messages: + msg119759
2010-10-28 06:26:23 kristjan.jonsson set messages: + msg119756
2010-10-27 01:06:22 kristjan.jonsson set files: + barrier3.patch; messages: + msg119667
2010-10-24 08:58:25 kristjan.jonsson set messages: + msg119500
2010-10-23 17:36:26 georg.brandl set nosy: + georg.brandl; messages: + msg119454
2010-05-25 16:45:52 kristjan.jonsson set keywords: patch, patch; messages: + msg106446
2010-05-24 23:07:07 giampaolo.rodola set keywords: patch, patch; nosy: + giampaolo.rodola
2010-05-24 17:30:54 jyasskin set keywords: patch, patch; messages: + msg106375
2010-05-24 16:44:16 pitrou set keywords: patch, patch; nosy: + jyasskin
2010-05-20 17:08:33 brian.curtin set keywords: patch, patch; versions: + Python 3.2, - Python 2.7
2010-05-20 17:07:28 kristjan.jonsson create