classification
Title: RLock's are SLOW
Type: performance Stage: patch review
Components: Library (Lib) Versions: Python 3.2
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: pitrou Nosy List: Rhamphoryncus, giampaolo.rodola, gregory.p.smith, haypo, hgibson50, jcea, kevinwatters, pitrou, sserrano
Priority: normal Keywords: patch

Created on 2008-05-29 15:28 by jcea, last changed 2009-11-10 18:52 by pitrou. This issue is now closed.

Files
File name Uploaded Description Edit
rlock-v2.patch haypo, 2008-08-20 14:06 Fixed C implementation of RLock
rlock2.patch pitrou, 2009-11-07 01:07
rlock3.patch pitrou, 2009-11-07 17:47
rlock4.patch pitrou, 2009-11-07 20:12
Messages (20)
msg67497 - (view) Author: Jesús Cea Avión (jcea) * (Python committer) Date: 2008-05-29 15:28
threading.RLock acquire/release is very slow. A order of magnitude
higher than no reentrant threading.Lock:

"""
def RLockSpeed() :
  import time, threading
  t=time.time()
  result={}
  for i in xrange(1000000) :
    pass
  result["empty loop"]=time.time()-t
  l=threading.Lock()
  t=time.time()
  for i in xrange(1000000) :
    l.acquire()
    l.release()
  result["Lock"]=time.time()-t
  l=threading.RLock()
  t=time.time()
  for i in xrange(1000000) :
    l.acquire()
    l.release()
  result["RLock"]=time.time()-t
  return result

if __name__=="__main__" :
  print RLockSpeed()
"""

Result:
{'empty loop': 0.074212074279785156, 'RLock': 10.144084215164185,
'Lock': 1.2489800453186035}
msg67703 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-06-04 21:33
You should investigate and try to diagnose where the speed difference
comes from. ISTM the RLock class is implemented in Python while the Lock
class is simply an alias to the builtin native lock type, which could
explain most of the discrepancy.
msg68512 - (view) Author: sebastian serrano (sserrano) Date: 2008-06-21 16:40
Running with python -O the timing gets a little closer between Lock and
RLock. This code won't be easy to improve in performance. 
The heaviest call is current_thread(), used at lines:
117:    me = current_thread()
137:    if self.__owner is not current_thread():

and only consist on:
788: def current_thread():
789:     try:
790:         return _active[_get_ident()]
791:     except KeyError:
792:         ##print "current_thread(): no current thread for", _get_ident()
793:         return _DummyThread()

Simple profiler dump:
$../python-trunk/python -O rlock.py 
time Lock 0.720541000366
time RLock 4.90231084824
         400004 function calls in 0.982 CPU seconds

   Ordered by: internal time, call count

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   100000    0.304    0.000    0.390    0.000 threading.py:116(acquire)
   100000    0.278    0.000    0.360    0.000 threading.py:136(release)
        1    0.232    0.232    0.982    0.982 rlock.py:27(testRLock)
   200000    0.168    0.000    0.168    0.000
threading.py:788(current_thread)
        1    0.000    0.000    0.000    0.000 threading.py:103(__init__)
        1    0.000    0.000    0.000    0.000 threading.py:98(RLock)
        1    0.000    0.000    0.000    0.000 threading.py:76(__init__)
        0    0.000             0.000          profile:0(profiler)
msg68536 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-06-21 19:22
Le samedi 21 juin 2008 à 16:40 +0000, sebastian serrano a écrit :
> sebastian serrano <sebastian@devsar.com> added the comment:
> 
> Running with python -O the timing gets a little closer between Lock and
> RLock. This code won't be easy to improve in performance. 
> The heaviest call is current_thread(), used at lines:
> 117:    me = current_thread()
> 137:    if self.__owner is not current_thread():

One could always try to rewrite RLock by replacing calls to
threading.current_thread() with thread.get_ident().

However, given the profile table you have appended, it will only save at
most 30% of the time. If someone needs a more important speed-up, he
should reimplement the RLock type in C (and contribute it back :-)).
msg71540 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2008-08-20 13:49
As suggested by pitrou, I wrote an implementation of RLock in C. 
Changes to Python version:
 - no debug message: i leave the message in #if 0 ... #endif
 - acquire() method argument "blocking" is not a keyword

Notes:
 - RLock has no docstring!
 - In the Python version, RLock._release_save() replaces owner and 
counter attributes before release the lock. But releasing the lock may 
fails and no the object is in an inconsistent state
msg71544 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2008-08-20 14:06
Oops, I forgot to update PyInit__Thread() with my new time:
 - Add PyType_Ready()
 - Register RLockType to threading dict

Here is the new patch.
msg71546 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-08-20 14:17
Wow, that was quick. Did you try to replace threading.RLock with your
implementation, and run the tests?

By the way:
> - acquire() method argument "blocking" is not a keyword
> - In the Python version, RLock._release_save() replaces owner and 
> counter attributes before release the lock. But releasing the lock may 
> fails and no the object is in an inconsistent state

Removing the debugging statements is fine, but apart from that the C
implementation should mimick the current behaviour. Even if this
behaviour has potential pitfalls.

Speaking of which, it would be nice if RLock was subclassable. I don't
know whether any software relies on this though.
msg71564 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2008-08-20 18:28
I doubt subclassability of RLock matters but who knows, people do code
things.

Regardless, using a C version wrapped in a simple python container class
that calls the underlying C implementation's methods should be
sufficient to allow sub-classing.

Given the final 2.6 beta is scheduled for today, this won't make it into
2.6/3.0 so we've got some time to polish up what we want.
msg71568 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-08-20 19:00
Gregory, would you have an advice on #3618?
msg74555 - (view) Author: Hugh Gibson (hgibson50) Date: 2008-10-09 05:39
> I doubt subclassability of RLock matters but who knows, people do code
> things.

I've recently done this to implement potential deadlock detection. I 
keep a record of the sequences of acquired locks, find unique 
sequences, then check for conflicts between each sequence. There's not 
much overhead and it highlighted some potential deadlocks where lock A 
and B were acquired AB in one route through code and BA in another 
route. The algorithm is a simplified version of that used in Linux - 
see http://www.mjmwired.net/kernel/Documentation/lockdep-design.txt

Hugh
msg95007 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009-11-07 01:07
Here is a new patch against py3k. It passes all tests and is generally
10-15x faster than the pure Python version.
msg95012 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2009-11-07 07:48
Reviewers: ,

http://codereview.appspot.com/150055/diff/1/4
File Modules/_threadmodule.c (right):

http://codereview.appspot.com/150055/diff/1/4#newcode221
Modules/_threadmodule.c:221: return PyBool_FromLong((long) r);
This explicit (long) cast is unnecessary.

http://codereview.appspot.com/150055/diff/1/4#newcode246
Modules/_threadmodule.c:246: PyThread_release_lock(self->rlock_lock);
reset self->rlock_owner to 0 before releasing the lock.

Description:
code review for http://bugs.python.org/issue3001

Please review this at http://codereview.appspot.com/150055

Affected files:
   M     Lib/test/test_threading.py
   M     Lib/threading.py
   M     Modules/_threadmodule.c
msg95014 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2009-11-07 08:57
rlock_acquire_doc: "(...) return None once the lock is acquired".
That's wrong, acquire() always return a boolean (True or False).

rlock_release(): Is the assert(self->rlock_count > 0); really required?
You're checking its value few lines before.

rlock_acquire_restore(): raise an error if the lock acquire failed,
whereas rlock_acquire() doesn't. Why not raising an error in
rlock_acquire() (especially if blocking is True)? Or if the error can
never occur, remove the error checking in rlock_acquire_restore().

rlock_acquire_restore(): (maybe) set owner to 0 if count is 0.

rlock_release_save(): may also reset self->rlock_owner to 0, as
rlock_release().

rlock_new(): why not reusing type instead of Py_TYPE(self) in
"Py_TYPE(self)->tp_free(self)"?

__exit__: rlock_release() is defined as __exit__() with METH_VARARGS,
but it has no argument. Can it be a problem?

Anything, thanks for the faster RLock!

If your patch is commited, you can reconsider #3618 (possible deadlock
in python IO implementation).
msg95018 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009-11-07 15:06
Thanks for the review. I will make the suggested modifications.

http://codereview.appspot.com/150055/diff/1/4
File Modules/_threadmodule.c (right):

http://codereview.appspot.com/150055/diff/1/4#newcode221
Modules/_threadmodule.c:221: return PyBool_FromLong((long) r);
On 2009/11/07 07:48:05, gregory.p.smith wrote:
> This explicit (long) cast is unnecessary.

Right.

http://codereview.appspot.com/150055/diff/1/4#newcode246
Modules/_threadmodule.c:246: PyThread_release_lock(self->rlock_lock);
On 2009/11/07 07:48:05, gregory.p.smith wrote:
> reset self->rlock_owner to 0 before releasing the lock.

We always check rlock_count before rlock_owner anyway but, yes, I could
reset rlock_owner out of safety.

http://codereview.appspot.com/150055
msg95019 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009-11-07 15:17
> rlock_acquire_doc: "(...) return None once the lock is acquired".
> That's wrong, acquire() always return a boolean (True or False).

You're right, I should fix that docstring.

> rlock_release(): Is the assert(self->rlock_count > 0); really required?
> You're checking its value few lines before.

Right again :) I forgot this was unsigned.

> rlock_acquire_restore(): raise an error if the lock acquire failed,
> whereas rlock_acquire() doesn't. Why not raising an error in
> rlock_acquire() (especially if blocking is True)?

For rlock_acquire(), I mimicked what the Python code (_PyRLock.acquire)
does: if locking fails, it returns False instead. It is part of the API.

(and I agree this is not necessarily right, because failing to lock if
blocking is True would probably indicate a low-level system error, but
the purpose of the patch is not to change the API)

But you're right that the Python version of rlock_acquire_restore()
doesn't check the return code either, so I may remove this check from
the C code, although the rest of the method clearly assumes the lock has
been taken.

> rlock_acquire_restore(): (maybe) set owner to 0 if count is 0.
> 
> rlock_release_save(): may also reset self->rlock_owner to 0, as
> rlock_release().

As I said to Gregory, the current code doesn't rely on rlock_owner to be
reset but, yes, we could still add it out of safety.

> rlock_new(): why not reusing type instead of Py_TYPE(self) in
> "Py_TYPE(self)->tp_free(self)"?

Good point.

> __exit__: rlock_release() is defined as __exit__() with METH_VARARGS,
> but it has no argument. Can it be a problem?

I just mimicked the corresponding declarations in the non-recursive lock
object. Apparently it's safe on all architectures Python has been
running on for years, although I agree it looks strange.

> If your patch is commited, you can reconsider #3618 (possible deadlock
> in python IO implementation).

Indeed.

Thanks !
msg95022 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009-11-07 17:47
Here is an updated patch. I addressed all review comments, except the
one about acquire_restore() checking the return result of acquire(),
because I think it's really a weakness in the Python implementation.
msg95023 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2009-11-07 19:44
Can you make the C implementation's repr() show something similar to the 
Python implementation?
msg95024 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009-11-07 20:12
Yes, here is a new patch adding tp_repr.
msg95042 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2009-11-08 14:10
rlock4.patch looks correct and pass test_threading.py tests.
msg95127 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009-11-10 18:52
I've committed the latest patch in r76189. Thanks for the reviews, everyone.
History
Date User Action Args
2009-11-10 18:52:30pitrousetstatus: open -> closed
resolution: fixed
messages: + msg95127
2009-11-08 14:10:34hayposetmessages: + msg95042
2009-11-07 20:12:18pitrousetfiles: + rlock4.patch

messages: + msg95024
2009-11-07 19:44:51gregory.p.smithsetmessages: + msg95023
2009-11-07 17:47:29pitrousetfiles: + rlock3.patch

messages: + msg95022
2009-11-07 15:17:14pitrousetmessages: + msg95019
2009-11-07 15:06:57pitrousetmessages: + msg95018
2009-11-07 08:57:22hayposetmessages: + msg95014
2009-11-07 07:48:07gregory.p.smithsetmessages: + msg95012
2009-11-07 01:13:01pitrousetstage: needs patch -> patch review
2009-11-07 01:07:52pitrousetfiles: + rlock2.patch

messages: + msg95007
2009-11-06 00:57:42pitrousetassignee: pitrou
versions: + Python 3.2, - Python 3.1, Python 2.7
nosy: gregory.p.smith, jcea, Rhamphoryncus, pitrou, hgibson50, haypo, giampaolo.rodola, kevinwatters, sserrano
components: - Interpreter Core
stage: patch review -> needs patch
2009-05-16 19:35:27ajaksu2setstage: patch review
2008-12-02 18:02:37kevinwatterssetnosy: + kevinwatters
2008-10-09 05:39:10hgibson50setnosy: + hgibson50
messages: + msg74555
2008-09-22 17:08:53hayposetfiles: - rlock.patch
2008-08-20 19:00:35pitrousetmessages: + msg71568
versions: - Python 2.6, Python 3.0
2008-08-20 18:28:38gregory.p.smithsetmessages: + msg71564
versions: + Python 3.1, Python 2.7
2008-08-20 14:17:05pitrousetmessages: + msg71546
2008-08-20 14:06:15hayposetfiles: + rlock-v2.patch
messages: + msg71544
2008-08-20 13:49:40hayposetfiles: + rlock.patch
nosy: + haypo
messages: + msg71540
keywords: + patch
2008-07-07 05:01:53gregory.p.smithsetpriority: normal
nosy: + gregory.p.smith
components: + Interpreter Core, Library (Lib)
2008-06-21 19:22:15pitrousetmessages: + msg68536
2008-06-21 16:40:52sserranosetnosy: + sserrano
messages: + msg68512
2008-06-04 21:33:34pitrousetnosy: + pitrou
messages: + msg67703
2008-06-03 22:54:11giampaolo.rodolasetnosy: + giampaolo.rodola
2008-05-29 20:06:20Rhamphoryncussetnosy: + Rhamphoryncus
2008-05-29 15:28:10jceacreate