msg183811
Author: Antoine Pitrou (pitrou)
Date: 2013-03-09 11:42
It should be possible to optimize threading.Event.wait() to not acquire the condition when the event is already set. Patch attached.
Without patch:
$ ./python -m timeit -s "import threading; e = threading.Event(); e.set()" "e.wait()"
1000000 loops, best of 3: 0.466 usec per loop
With patch:
$ ./python -m timeit -s "import threading; e = threading.Event(); e.set()" "e.wait()"
10000000 loops, best of 3: 0.19 usec per loop
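The patch itself is attached to the tracker rather than inlined here; judging from the snippet quoted later in msg184972, the fast path presumably looked something like this (a sketch on a minimal Event, not the actual patch):

```python
import threading

class Event:
    """Minimal sketch of threading.Event with the proposed fast path."""

    def __init__(self):
        self._cond = threading.Condition(threading.Lock())
        self._flag = False

    def set(self):
        with self._cond:
            self._flag = True
            self._cond.notify_all()

    def clear(self):
        with self._cond:
            self._flag = False

    def is_set(self):
        return self._flag

    def wait(self, timeout=None):
        # Fast path: if the flag is already set, return without touching
        # the condition's lock at all -- this is what saves the ~0.3 usec.
        if self._flag:
            return True
        # Slow path: acquire the condition and wait.  (msg183871 points
        # out the window between the check above and this acquire.)
        with self._cond:
            return self._cond.wait(timeout)
```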

msg183871
Author: Mark Dickinson (mark.dickinson)
Date: 2013-03-10 11:14
I think there's a race condition in the patched version: 'self._flag' could be set in between the 'if not signaled:' check and the 'self._cond.acquire()', and in that case you'll end up waiting for an event that's already occurred. Do you need to recheck 'self._flag' after acquiring the condition?

msg183872
Author: Mark Dickinson (mark.dickinson)
Date: 2013-03-10 11:26
How about the attached simpler patch?
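That patch is likewise attached rather than inlined; given the suggestion in msg183871 to consult the flag only after acquiring the condition, its wait() plausibly took a shape like this (a sketch, not the attached file):

```python
import threading

class Event:
    """Sketch of the race-free variant: the flag is only read while
    the condition's lock is held, so no check/acquire window exists."""

    def __init__(self):
        self._cond = threading.Condition(threading.Lock())
        self._flag = False

    def set(self):
        with self._cond:
            self._flag = True
            self._cond.notify_all()

    def wait(self, timeout=None):
        with self._cond:
            signaled = self._flag
            if not signaled:
                # Condition.wait() returns False if the timeout elapsed
                # without a notification (Python 3.2+ behaviour).
                signaled = self._cond.wait(timeout)
            return signaled
```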

msg183873
Author: Antoine Pitrou (pitrou)
Date: 2013-03-10 11:37
> How about the attached simpler patch?
Looks better indeed! I had completely overlooked the possible race
condition.

msg184972
Author: Charles-François Natali (neologix)
Date: 2013-03-22 15:19
Something bothers me:
"""
def wait(self, timeout=None):
if self._flag:
return True
self._cond.acquire()
"""
The _flag is checked without any lock held: although it won't be a
problem with CPython, a standard memory model (e.g. Java's)
doesn't guarantee that reading _flag outside of the lock will return
the value most recently written to it (because of caching/hoisting, or
store buffers/invalidate queues at the CPU level).
So in short, if wait() is called by a thread shortly after another
thread clear()ed it, the former thread might very well read _flag ==
True (while the latter just set it to False) and return erroneously.
Now, it's probably being pedantic, especially because we lack a memory
model, but that bothers me.
Also, I'm not sure this is really a hot path.

msg184973
Author: Richard Oudkerk (sbt)
Date: 2013-03-22 15:26
On 22/03/2013 3:19pm, Charles-François Natali wrote:
> The _flag is checked without any lock held: although it won't be a
> problem with CPython, a standard memory model (e.g. Java's)
> doesn't guarantee that reading _flag outside of the lock will return
> the value most recently written to it (because of caching/hoisting, or
> store buffers/invalidate queues at the CPU level).
>
> So in short, if wait() is called by a thread shortly after another
> thread clear()ed it, the former thread might very well read _flag ==
> True (while the latter just set it to False) and return erroneously.
I was under the impression that dict access (and therefore attribute
access for "simple" objects) was guaranteed to be atomic even in
alternative implementations like Jython and IronPython.
Is this not a language guarantee?

msg184975
Author: Charles-François Natali (neologix)
Date: 2013-03-22 15:31
> I was under the impression that dict access (and therefore attribute
> access for "simple" objects) was guaranteed to be atomic even in
> alternative implementations like Jython and IronPython.
>
> Is this not a language guarantee?
AFAICT there's no such guarantee.
But even if dict access/assignment were guaranteed to be atomic, this
wouldn't solve the visibility issue.

msg184977
Author: Antoine Pitrou (pitrou)
Date: 2013-03-22 15:48
> So in short, if wait() is called by a thread shortly after another
> thread clear()ed it, the former thread might very well read _flag ==
> True (while the latter just set it to False) and return erroneously.
Would that be erroneous? It can already happen because of thread switches:
e.g. wait() reads True from the flag, then releases the internal lock;
another thread gets scheduled in, takes the lock and resets the event;
the original thread is resumed and the calling function sees True as
the wait() return value even though the event was reset in-between.
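The distinction can be seen with the stdlib Event even in a single thread: wait()'s return value reports that the event was set at the moment the flag was sampled, not that it is still set when the caller acts on the result:

```python
import threading

e = threading.Event()
e.set()
ok = e.wait()       # True: the flag was set when wait() looked at it
# A thread switch could occur here, letting another thread clear the
# event; we simulate that in-line with a direct clear().
e.clear()
# The caller still holds ok == True even though the event is now cleared.
print(ok, e.is_set())
```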
> Now, it's probably being pedantic, especially because we lack a
> memory model, but that bothers me.
As for the memory model, Jeffrey had written a PEP draft years ago:
http://code.google.com/p/unladen-swallow/wiki/MemoryModel
Unfortunately I can't find the corresponding reference in the python-dev
archives again.
That said, I agree the performance improvement is probably not important
here. It was just drive-by optimization on my part :-)

msg184996
Author: Mark Dickinson (mark.dickinson)
Date: 2013-03-22 18:41
Charles-François: would your objections also apply to the current implementation of 'is_set'?
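For context, is_set() in Lib/threading.py is essentially an unlocked read of the same attribute, so the visibility question applies to it as well (simplified excerpt-style sketch, not the verbatim source):

```python
import threading

class Event:
    """Stripped-down illustration of the unlocked is_set() read."""

    def __init__(self):
        self._cond = threading.Condition(threading.Lock())
        self._flag = False

    def set(self):
        with self._cond:
            self._flag = True
            self._cond.notify_all()

    def is_set(self):
        # No lock taken: this relies on the GIL (or on the guarantees of
        # the underlying dict implementation) for visibility of _flag.
        return self._flag
```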

msg185000
Author: Richard Oudkerk (sbt)
Date: 2013-03-22 19:33
On 22/03/2013 3:31pm, Charles-François Natali wrote:
>> I was under the impression that dict access (and therefore attribute
>> access for "simple" objects) was guaranteed to be atomic even in
>> alternative implementations like Jython and IronPython.
>>
>> Is this not a language guarantee?
>
> AFAICT there's no such guarantee.
> But even if dict access/assignment was guaranteed to be atomic, this
> wouldn't solve the visibility issue.
"Atomic" was maybe the wrong word.
For implementations with a GIL the GIL ensures there are no visibility
issues.
According to Antoine's link, Jython uses ConcurrentHashMap for its
dicts, and IronPython uses a locked map. So in those cases also there
are no visibility issues.

msg185003
Author: Charles-François Natali (neologix)
Date: 2013-03-22 19:45
> Would that be erroneous? It can already happen because of thread switches:
That's not really the same thing. In that scenario wait() returns True,
which is the right thing to do since the Event had been set at some point.
Here, it would make it possible for wait() to return True immediately
after another thread clear()ed the Event, which would just be wrong.
> Charles-François: would your objections also apply to the current implementation of 'is_set'?
Yes (in OpenJDK's CountDownLatch implementation, a similar method reads
a volatile variable to guarantee memory visibility).
> For implementations with a GIL the GIL ensures there are no visibility
> issues.
Agreed.
> According to Antoine's link, Jython uses ConcurrentHashMap for its
> dicts, and IronPython uses a locked map. So in those cases also there
> are no visibility issues.
Agreed.
But the question remains whether that's something which is part of the
memory model: I'm skeptical about this, since it would pretty much
imply memory barriers everywhere and disable a whole class of
caching/hoisting, which would just kill performance.

msg185429
Author: Mark Dickinson (mark.dickinson)
Date: 2013-03-28 10:42
[Antoine]
> That said, I agree the performance improvement is probably not important
> here. It was just drive-by optimization on my part :-)
Shall we close this as rejected?

msg185439
Author: Antoine Pitrou (pitrou)
Date: 2013-03-28 12:10
Agreed.

Date | User | Action | Args
2022-04-11 14:57:42 | admin | set | github: 61591
2013-03-28 12:10:37 | pitrou | set | status: open -> closed; resolution: rejected; messages: + msg185439
2013-03-28 10:42:45 | mark.dickinson | set | messages: + msg185429
2013-03-22 19:45:47 | neologix | set | messages: + msg185003
2013-03-22 19:33:07 | sbt | set | messages: + msg185000
2013-03-22 18:41:34 | mark.dickinson | set | messages: + msg184996
2013-03-22 15:48:26 | pitrou | set | nosy: + jyasskin; messages: + msg184977
2013-03-22 15:31:53 | neologix | set | messages: + msg184975
2013-03-22 15:26:11 | sbt | set | messages: + msg184973
2013-03-22 15:19:40 | neologix | set | messages: + msg184972
2013-03-10 15:19:49 | jcea | set | nosy: + jcea
2013-03-10 11:37:57 | pitrou | set | messages: + msg183873
2013-03-10 11:26:07 | mark.dickinson | set | files: + event_wait_metd.patch; messages: + msg183872
2013-03-10 11:14:31 | mark.dickinson | set | messages: + msg183871
2013-03-10 11:12:20 | mark.dickinson | set | nosy: + mark.dickinson
2013-03-09 11:42:27 | pitrou | create |