classification
Title: Lost updates with multiprocessing.Value
Type:  Stage: resolved
Components: Library (Lib)  Versions: Python 3.3
Status: closed  Resolution: fixed
Dependencies:  Superseder:
Assigned To:  Nosy List: lechten, python-dev, sbt
Priority: normal  Keywords:

Created on 2013-01-19 19:13 by lechten, last changed 2013-11-17 17:10 by sbt. This issue is now closed.

Files
File name Uploaded Description Edit
test_multiprocessing.py lechten, 2013-01-19 19:13 Test case to produce lost updates.
Messages (6)
msg180251 - (view) Author: Jens Lechtenboerger (lechten) Date: 2013-01-19 19:13
Maybe I'm misreading the documentation of multiprocessing.Value and
multiprocessing.sharedctypes.Value.
I thought that access to the value field of Value instances was protected by
locks to avoid lost updates.

Specifically, for multiprocessing.Value(typecode_or_type, *args[, lock]) and
multiprocessing.sharedctypes.Value(typecode_or_type, *args[, lock]) the
documentation states:
> By default the return value is actually a synchronized wrapper for the
> object. [...]
> If lock is True (the default) then a new lock object is created to
> synchronize access to the value. If lock is a Lock or RLock object then that
> will be used to synchronize access to the value. If lock is False then
> access to the returned object will not be automatically protected by a lock,
> so it will not necessarily be “process-safe”.

(By the way, I'm not sure why both multiprocessing.Value and
multiprocessing.sharedctypes.Value are documented.  They appear to be the same
thing.)

The following tests (also attached as file) show that lost updates may occur
if several instances of multiprocessing.Process increment the same Value that
is passed as args parameter.

import logging
import multiprocessing
import multiprocessing.sharedctypes

def do_inc(integer):
    """Increment integer.value for multiprocessing.Value integer."""
    integer.value += 1

def do_test(notasks):
    """Create notasks processes, each incrementing the same Value.

    As the Value is initialized to 0, its final value is expected to be
    notasks.
    """
    tasks = list()
    integer = multiprocessing.sharedctypes.Value("i", 0)
    for run in range(notasks):
        proc = multiprocessing.Process(target=do_inc, args=(integer,))
        proc.start()
        tasks.append(proc)
    for proc in tasks:
        proc.join()
    if integer.value != notasks:
        logging.error(
            "Unexpected value: %d (expected: %d)", integer.value, notasks)

if __name__ == "__main__":
    do_test(100)


Sample invocations and results:

Note that on a single CPU machine the error is not reported for every
execution but only for about every third run.
$ python --version
Python 2.6.5
$ uname -a
Linux ubuntu-desktop 2.6.32.11+drm33.2 #2 Fri Jun 18 20:30:49 CEST 2010 i686 GNU/Linux
$ python test_multiprocessing.py
ERROR:root:Unexpected value: 99 (expected: 100)

On a quadcore, the error occurs almost every time.
$ uname -a
Linux PC 2.6.35.13 #4 SMP Tue Dec 20 15:22:02 CET 2011 x86_64 GNU/Linux
$ ~/local/Python-2.7.3/python test_multiprocessing.py
ERROR:root:Unexpected value: 95 (expected: 100)
$ ~/local/Python-3.3.0/python test_multiprocessing.py
ERROR:root:Unexpected value: 86 (expected: 100)
msg180269 - (view) Author: Richard Oudkerk (sbt) * (Python committer) Date: 2013-01-19 22:18
> I thought that access to the value field of Value instances was 
> protected by locks to avoid lost updates.

Loads and stores are both atomic.  But "+=" is made up of two operations, a load followed by a store, and the lock is dropped between the two.

The same lack of atomicity applies when using "+=" to modify an attribute of a normal Python object in a multithreaded program.
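The two operations are visible directly in the bytecode (a quick illustration added for clarity, not part of the original message):

```python
import dis

# "integer.value += 1" compiles to a LOAD_ATTR followed by a
# STORE_ATTR; the Value's lock is released between the two, so
# another process can store its own value in the gap.
ops = [ins.opname for ins in dis.Bytecode("integer.value += 1")]
print(ops)
assert "LOAD_ATTR" in ops and "STORE_ATTR" in ops
```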

If you want an atomic increment you could try

    def do_inc(integer):
        with integer.get_lock():
            integer.value += 1
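A complete runnable version of that fix, reusing the names from the attached test case (a sketch for illustration, not the committed change), could look like:

```python
import multiprocessing

def do_inc(integer):
    """Atomically increment integer.value."""
    # get_lock() returns the lock that guards this Value; holding it
    # across the load and the store makes "+=" effectively atomic.
    with integer.get_lock():
        integer.value += 1

def do_test(notasks):
    """Spawn notasks processes, each incrementing the same Value."""
    integer = multiprocessing.Value("i", 0)
    tasks = [multiprocessing.Process(target=do_inc, args=(integer,))
             for _ in range(notasks)]
    for proc in tasks:
        proc.start()
    for proc in tasks:
        proc.join()
    return integer.value

if __name__ == "__main__":
    print(do_test(50))  # 50 with the lock held; without it, updates may be lost
```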
msg180282 - (view) Author: Jens Lechtenboerger (lechten) Date: 2013-01-20 08:54
> Loads and stores are both atomic.  But "+=" is made up of two operations, a load followed by a store, and the lock is dropped between the two.

I see.  Then this is a documentation bug.  The examples in the documentation use such non-thread-safe assignments (combined with the statement "These shared objects will be process and thread safe.").
msg180286 - (view) Author: Richard Oudkerk (sbt) * (Python committer) Date: 2013-01-20 10:57
> I see.  Then this is a documentation bug.  The examples in the 
> documentation use such non-thread-safe assignments (combined with the 
> statement "These shared objects will be process and thread safe.").

Are you talking about this documentation:

  If lock is True (the default) then a new lock object is created to 
  synchronize access to the value. If lock is a Lock or RLock object then 
  that will be used to synchronize access to the value. If lock is False 
  then access to the returned object will not be automatically protected 
  by a lock, so it will not necessarily be “process-safe”.

It only says that accesses are synchronized.  The problem is that you were assuming that "+=" involves a single access -- but that is not how python works.

Where in the examples is there "non-process-safe" access?  (Note that waiting for the only process which modifies a value to terminate using join() will prevent races.)
msg180343 - (view) Author: Jens Lechtenboerger (lechten) Date: 2013-01-21 14:50
> It only says that accesses are synchronized.  The problem is that you were assuming that "+=" involves a single access -- but that is not how python works.

Yes, I understand that by now (actually since your first comment).

> Where in the examples is there "non-process-safe" access?  (Note that waiting for the only process which modifies a value to terminate using join() will prevent races.)

In section "The multiprocessing.sharedctypes module" the assignments in the first example (function modify()) are unsafe.  None of them is protected by a lock as suggested in your first comment.  Strictly speaking, no lock is necessary in the example as there are no race conditions (the processes work in an alternating fashion without concurrency).

I certainly did not see that the example (for a *shared* memory data structure) is based on the implicit assumption of a single writer.  In contrast, I assumed that some "magic" should offer process-safe usage of "+=", which made me file this bug.  That assumption has turned out to be wrong.  To prevent others from being misled in the same way, I suggest either protecting those operations with locks (and commenting on their effect) or stating the implicit assumption explicitly.

Maybe add the following after "Below is an example where a number of ctypes objects are modified by a child process:"
Note that assignments such as n.value **= 2 are not executed atomically but involve two operations, a load followed by a store.  Each of these operations is protected by the Value's lock, which is dropped in between.  Thus, in scenarios with concurrent modifying processes you may want to explicitly acquire the Value's lock to ensure atomic execution (avoiding race conditions and lost updates), e.g.:
    with n.get_lock():
        n.value **= 2
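That suggested wording could be accompanied by a minimal runnable sketch (names chosen for illustration, following the documentation example):

```python
import multiprocessing

def square(n):
    # Hold the Value's lock so the load and the store of "**=" cannot
    # be interleaved with an update from another process.
    with n.get_lock():
        n.value **= 2

if __name__ == "__main__":
    n = multiprocessing.Value("d", 3.0)
    proc = multiprocessing.Process(target=square, args=(n,))
    proc.start()
    proc.join()
    print(n.value)  # 9.0
```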
msg203202 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2013-11-17 17:04
New changeset 7aabbe919f55 by Richard Oudkerk in branch '2.7':
Issue 16998: Clarify that += on a shared value is not atomic.
http://hg.python.org/cpython/rev/7aabbe919f55

New changeset 11cafbe6519f by Richard Oudkerk in branch '3.3':
Issue 16998: Clarify that += on a shared value is not atomic.
http://hg.python.org/cpython/rev/11cafbe6519f
History
Date User Action Args
2013-11-17 17:10:06  sbt  set  status: open -> closed
                               type: behavior ->
                               resolution: fixed
                               stage: resolved
2013-11-17 17:04:48  python-dev  set  nosy: + python-dev
                                      messages: + msg203202
2013-01-21 14:50:38  lechten  set  messages: + msg180343
2013-01-20 10:57:28  sbt  set  messages: + msg180286
2013-01-20 08:54:47  lechten  set  messages: + msg180282
2013-01-19 22:18:08  sbt  set  messages: + msg180269
2013-01-19 21:00:37  pitrou  set  nosy: + sbt
2013-01-19 19:13:57  lechten  create