Issue3399
Created on 2008-07-17 22:22 by mark.dickinson, last changed 2022-04-11 14:56 by admin. This issue is now closed.
Files
File name | Uploaded | Description
---|---|---
issue3399.patch | mark.dickinson, 2008-08-01 15:37 | Possible fix
issue3399_2.patch | mark.dickinson, 2008-08-01 16:33 | Updated patch
add_semicolons.diff | jnoller, 2008-08-01 17:58 |
Messages (19) | |||
---|---|---|---|
msg69917 - (view) | Author: Mark Dickinson (mark.dickinson) * | Date: 2008-07-17 22:22 | |
As of revision 65077 of the trunk, I'm getting errors in test_multiprocessing that seem to point to memory corruption in object allocation/deallocation. The failures are intermittent, and of a similar nature to the errors I was seeing previously, outlined in issue 3088.

The platform is OS X 10.5.4 (not a fresh install---it was an upgrade from OS X 10.4, in case this makes any difference), running on a MacBook Pro. I'm running a freshly checked out debug build of the trunk.

Here's what I did:

(1) make a fresh svn+ssh checkout of the trunk
(2) ./configure --with-pydebug && make
(3) ./python.exe Lib/test/test_multiprocessing.py
(4) repeat step (3) until something nasty happens.

The results vary from run to run, and 80-90% of the runs of test_multiprocessing pass. Here are 3 of the failures I've seen, occurring on three separate runs of test_multiprocessing.

Failure 1:
test_notify_all (__main__.WithManagerTestCondition) ... Assertion failed: (pool->ref.count > 0), function PyObject_Free, file Objects/obmalloc.c, line 1100.

Failure 2:
test_imap_unordered (__main__.WithManagerTestPool) ... python.exe(32381,0xb0513000) malloc: *** error for object 0xdbdbdbdb: pointer being reallocated was not allocated
*** set a breakpoint in malloc_error_break to debug
python.exe(32381,0xb0513000) malloc: *** error for object 0xdbdbdbdb: Non-aligned pointer being freed
*** set a breakpoint in malloc_error_break to debug
Fatal Python error: UNREF invalid object
ERROR

Failure 3:
test_imap_unordered (__main__.WithManagerTestPool) ... Fatal Python error: UNREF invalid object
ERROR

I have very little (i.e. no) experience of debugging this kind of failure, and little understanding of how the multiprocessing module works. But I can and will follow instructions and suggestions about how to debug this.

Stupid question: it appears from reading the comments in that file that obmalloc.c is (intentionally) not thread-safe. Could this have anything to do with the failures above? |
|||
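Step (4) above, "repeat until something nasty happens", can be automated. The following is a minimal sketch, not part of the original report: the helper name `count_failures` and the `timeout` parameter are hypothetical, and it assumes a modern Python 3 with `subprocess.run`. It treats any non-zero exit status, and any run that exceeds the timeout, as a failure.

```python
import subprocess
import sys


def count_failures(cmd, runs, timeout=None):
    """Run cmd `runs` times and return the 0-based indices of failed runs."""
    failed = []
    for i in range(runs):
        try:
            # Output is discarded here; redirect it to a file if you need
            # the assertion messages or tracebacks for later inspection.
            result = subprocess.run(
                cmd,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                timeout=timeout,
            )
            ok = result.returncode == 0
        except subprocess.TimeoutExpired:
            # A hung run counts as a failure; subprocess.run kills the child.
            ok = False
        if not ok:
            failed.append(i)
    return failed


# Self-check with a trivial child process that always exits with status 1.
always_fails = [sys.executable, "-c", "raise SystemExit(1)"]
print(count_failures(always_fails, runs=3))  # prints [0, 1, 2]
```

For the report above, the command would be something like `["./python.exe", "Lib/test/test_multiprocessing.py"]`, run from the top of the checkout, with a generous timeout so that hangs are recorded rather than blocking the loop.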
msg69921 - (view) | Author: Mark Dickinson (mark.dickinson) * | Date: 2008-07-17 22:46 | |
And one more: Failure 4: test_make_pool (__main__.WithManagerTestPool) ... Assertion failed: (bp != NULL), function PyObject_Malloc, file Objects/obmalloc.c, line 746. |
|||
msg69924 - (view) | Author: Mark Dickinson (mark.dickinson) * | Date: 2008-07-17 23:13 | |
And another: Failure 5: test_notify (__main__.WithManagerTestCondition) ... Assertion failed: (usable_arenas->freepools == NULL), function PyObject_Malloc, file Objects/obmalloc.c, line 809. ERROR |
|||
msg69925 - (view) | Author: Jesse Noller (jnoller) * | Date: 2008-07-17 23:13 | |
On Jul 17, 2008, at 6:22 PM, Mark Dickinson <report@bugs.python.org> wrote:

> New submission from Mark Dickinson <dickinsm@gmail.com>:
>
> As of revision 65077 of the trunk, I'm getting errors in
> test_multiprocessing that seem to point to memory corruption in object
> allocation/deallocation. The failures are intermittent, and of a
> similar nature to the errors I was seeing previously, outlined in
> issue 3088.
>
> The platform is OS X 10.5.4 (not a fresh install---it was an upgrade
> from OS X 10.4, in case this makes any difference), running on a
> MacBook Pro. I'm running a freshly checked out debug build of the trunk.
>
> Here's what I did:
>
> (1) make a fresh svn+ssh checkout of the trunk
> (2) ./configure --with-pydebug && make
> (3) ./python.exe Lib/test/test_multiprocessing.py
> (4) repeat step (3) until something nasty happens.
>
> The results vary from run to run, and 80-90% of the runs of
> test_multiprocessing pass. Here are 3 of the failures I've seen,
> occurring on three separate runs of test_multiprocessing.

I am/was going to help you with this when you emailed me your last email. I'm disturbed that none of my machines, or the buildbots for that matter, are seeing this. Can you post the output from:

echo $LD_LIBRARY_PATH
which gcc
gcc -v |
|||
msg69926 - (view) | Author: Mark Dickinson (mark.dickinson) * | Date: 2008-07-17 23:17 | |
LD_LIBRARY_PATH isn't set. gcc is the system gcc from Apple:

Macintosh-3:trunk dickinsm$ echo $LD_LIBRARY_PATH

Macintosh-3:trunk dickinsm$ which gcc
/usr/bin/gcc
Macintosh-3:trunk dickinsm$ gcc -v
Using built-in specs.
Target: i686-apple-darwin9
Configured with: /var/tmp/gcc/gcc-5484~1/src/configure --disable-checking -enable-werror --prefix=/usr --mandir=/share/man --enable-languages=c,objc,c++,obj-c++ --program-transform-name=/^[cg][^.-]*$/s/$/-4.0/ --with-gxx-include-dir=/include/c++/4.0.0 --with-slibdir=/usr/lib --build=i686-apple-darwin9 --with-arch=apple --with-tune=generic --host=i686-apple-darwin9 --target=i686-apple-darwin9
Thread model: posix
gcc version 4.0.1 (Apple Inc. build 5484) |
|||
msg69930 - (view) | Author: Jesse Noller (jnoller) * | Date: 2008-07-18 00:13 | |
Can you try removing the --with-pydebug flag from configure and running that way? |
|||
msg69941 - (view) | Author: Mark Dickinson (mark.dickinson) * | Date: 2008-07-18 06:37 | |
Okay: I just tried the following:

(1) clean svn checkout
(2) ./configure && make
(3) 100 runs of test_multiprocessing, via the shell command:

for ((i=0;i<100;i+=1)); do ./python.exe Lib/test/test_multiprocessing.py; sleep 1; done

I got 4 failed runs out of those 100 runs (details below); 2 hangs in test_notify_all, a KeyError in test_remote, and a failure of test_number_of_objects.

Failed run 1
------------
test_notify_all (__main__.WithManagerTestCondition) ... Process Process-48:
Traceback (most recent call last):
  File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/process.py", line 232, in _bootstrap
    self.run()
  File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/process.py", line 88, in run
    self._target(*self._args, **self._kwargs)
  File "Lib/test/test_multiprocessing.py", line 600, in f
    cond.acquire()
  File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/managers.py", line 946, in acquire
    return self._callmethod('acquire', (blocking,))
  File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/managers.py", line 718, in _callmethod
    self._connect()
  File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/managers.py", line 705, in _connect
    conn = self._Client(self._token.address, authkey=self._authkey)
  File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/connection.py", line 133, in Client
    c = SocketClient(address)
  File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/connection.py", line 254, in SocketClient
    s.connect(address)
  File "<string>", line 1, in connect
error: [Errno 61] Connection refused
^CProcess PoolWorker-5:4: Traceback (most recent call last): File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/process.py", line 232, in _bootstrap Process PoolWorker-5:3: Traceback (most recent call last): self.run() File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/process.py", line 88, in run self._target(*self._args, **self._kwargs) File 
"/Users/dickinsm/python_source/trunk/Lib/multiprocessing/pool.py", line 57, in worker File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/process.py", line 232, in _bootstrap task = get() File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/queues.py", line 337, in get self.run() File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/process.py", line 88, in run self._target(*self._args, **self._kwargs) File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/pool.py", line 57, in worker racquire() KeyboardInterrupt task = get() File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/queues.py", line 337, in get racquire() KeyboardInterrupt Process PoolWorker-5:1: Traceback (most recent call last): File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/process.py", line 232, in _bootstrap Process Process-50: Process Process-49: Traceback (most recent call last): self.run() File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/process.py", line 88, in run self._target(*self._args, **self._kwargs) File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/process.py", line 232, in _bootstrap File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/pool.py", line 57, in worker task = get() File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/queues.py", line 339, in get self.run() File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/process.py", line 88, in run self._target(*self._args, **self._kwargs) File "Lib/test/test_multiprocessing.py", line 602, in f return recv() KeyboardInterrupt cond.wait(timeout) File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/managers.py", line 959, in wait return self._callmethod('wait', (timeout,)) Traceback (most recent call last): File "Lib/test/test_multiprocessing.py", line 1786, in <module> File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/managers.py", line 722, in _callmethod kind, result = conn.recv() KeyboardInterrupt main() 
File "Lib/test/test_multiprocessing.py", line 1783, in main test_main(unittest.TextTestRunner(verbosity=2).run) File "Lib/test/test_multiprocessing.py", line 1773, in test_main run(suite) File "/Users/dickinsm/python_source/trunk/Lib/unittest.py", line 750, in run Process PoolWorker-5:2: test(result) File "/Users/dickinsm/python_source/trunk/Lib/unittest.py", line 461, in __call__ return self.run(*args, **kwds) File "/Users/dickinsm/python_source/trunk/Lib/unittest.py", line 457, in run test(result) File "/Users/dickinsm/python_source/trunk/Lib/unittest.py", line 461, in __call__ return self.run(*args, **kwds) File "/Users/dickinsm/python_source/trunk/Lib/unittest.py", line 457, in run test(result) File "/Users/dickinsm/python_source/trunk/Lib/unittest.py", line 300, in __call__ return self.run(*args, **kwds) File "/Users/dickinsm/python_source/trunk/Lib/unittest.py", line 279, in run testMethod() File "Lib/test/test_multiprocessing.py", line 701, in test_notify_all sleeping.acquire() File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/managers.py", line 946, in acquire return self._callmethod('acquire', (blocking,)) File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/managers.py", line 722, in _callmethod kind, result = conn.recv() KeyboardInterrupt Traceback (most recent call last): File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/process.py", line 232, in _bootstrap self.run() File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/process.py", line 88, in run self._target(*self._args, **self._kwargs) File "Lib/test/test_multiprocessing.py", line 602, in f cond.wait(timeout) File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/managers.py", line 959, in wait return self._callmethod('wait', (timeout,)) File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/managers.py", line 722, in _callmethod kind, result = conn.recv() KeyboardInterrupt Traceback (most recent call last): Failed run 2 ------------ 
test_task_done (__main__.WithManagerTestQueue) ... ok test_remote (__main__.WithManagerTestRemoteManager) ... ERROR test_bounded_semaphore (__main__.WithManagerTestSemaphore) ... ok test_semaphore (__main__.WithManagerTestSemaphore) ... ok test_timeout (__main__.WithManagerTestSemaphore) ... ok test_getobj_getlock (__main__.WithManagerTestValue) ... ok test_rawvalue (__main__.WithManagerTestValue) ... ok test_value (__main__.WithManagerTestValue) ... ok test_number_of_objects (__main__.WithManagerTestZZZNumberOfObjects) ... ok ====================================================================== ERROR: test_remote (__main__.WithManagerTestRemoteManager) ---------------------------------------------------------------------- Traceback (most recent call last): File "Lib/test/test_multiprocessing.py", line 1157, in test_remote queue = manager2.get_queue() File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/managers.py", line 635, in temp authkey=self._authkey, exposed=exp File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/managers.py", line 887, in AutoProxy incref=incref) File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/managers.py", line 696, in __init__ self._incref() File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/managers.py", line 743, in _incref dispatch(conn, None, 'incref', (self._id,)) File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/managers.py", line 79, in dispatch raise convert_to_error(kind, result) RemoteError: ------------------------------------------------------------------------ --- Traceback (most recent call last): File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/managers.py", line 181, in handle_request result = func(c, *args, **kwds) File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/managers.py", line 397, in incref self.id_to_refcount[ident] += 1 KeyError: '5bf968' ------------------------------------------------------------------------ --- 
---------------------------------------------------------------------- Ran 121 tests in 9.230s FAILED (errors=1) Failed run 3 ------------ test_number_of_objects (__main__.WithManagerTestZZZNumberOfObjects) ... 680490: refcount=1 <threading._Semaphore object at 0x680490> 680bd0: refcount=1 <multiprocessing.pool.Pool object at 0x680bd0> FAIL ====================================================================== FAIL: test_number_of_objects (__main__.WithManagerTestZZZNumberOfObjects) ---------------------------------------------------------------------- Traceback (most recent call last): File "Lib/test/test_multiprocessing.py", line 1042, in test_number_of_objects self.assertEqual(refs, EXPECTED_NUMBER) AssertionError: 2 != 1 ---------------------------------------------------------------------- Ran 121 tests in 9.228s FAILED (failures=1) Failed run 4 ------------ test_notify_all (__main__.WithManagerTestCondition) ... Process Process- 50: Traceback (most recent call last): File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/process.py", line 232, in _bootstrap self.run() File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/process.py", line 88, in run self._target(*self._args, **self._kwargs) File "Lib/test/test_multiprocessing.py", line 600, in f cond.acquire() File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/managers.py", line 946, in acquire return self._callmethod('acquire', (blocking,)) File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/managers.py", line 718, in _callmethod self._connect() File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/managers.py", line 705, in _connect conn = self._Client(self._token.address, authkey=self._authkey) File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/connection.py", line 133, in Client c = SocketClient(address) File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/connection.py", line 254, in SocketClient s.connect(address) File "<string>", line 1, 
in connect error: [Errno 61] Connection refused ^CProcess PoolWorker-5:4: Traceback (most recent call last): File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/process.py", line 232, in _bootstrap Process PoolWorker-5:3: Traceback (most recent call last): self.run() File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/process.py", line 88, in run self._target(*self._args, **self._kwargs) File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/pool.py", line 57, in worker File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/process.py", line 232, in _bootstrap task = get() File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/queues.py", line 337, in get racquire() self.run() KeyboardInterrupt File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/process.py", line 88, in run self._target(*self._args, **self._kwargs) File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/pool.py", line 57, in worker task = get() File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/queues.py", line 337, in get racquire() KeyboardInterrupt Process PoolWorker-5:1: Traceback (most recent call last): File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/process.py", line 232, in _bootstrap Process Process-48: self.run() File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/process.py", line 88, in run self._target(*self._args, **self._kwargs) Traceback (most recent call last): File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/pool.py", line 57, in worker task = get() File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/queues.py", line 339, in get File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/process.py", line 232, in _bootstrap return recv() KeyboardInterrupt self.run() File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/process.py", line 88, in run self._target(*self._args, **self._kwargs) File "Lib/test/test_multiprocessing.py", line 602, in f 
cond.wait(timeout) File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/managers.py", line 959, in wait Traceback (most recent call last): File "Lib/test/test_multiprocessing.py", line 1786, in <module> return self._callmethod('wait', (timeout,)) File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/managers.py", line 722, in _callmethod kind, result = conn.recv() KeyboardInterrupt main() File "Lib/test/test_multiprocessing.py", line 1783, in main test_main(unittest.TextTestRunner(verbosity=2).run) File "Lib/test/test_multiprocessing.py", line 1773, in test_main Process PoolWorker-5:2: run(suite) File "/Users/dickinsm/python_source/trunk/Lib/unittest.py", line 750, in run Traceback (most recent call last): test(result) File "/Users/dickinsm/python_source/trunk/Lib/unittest.py", line 461, in __call__ File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/process.py", line 232, in _bootstrap return self.run(*args, **kwds) File "/Users/dickinsm/python_source/trunk/Lib/unittest.py", line 457, in run test(result) File "/Users/dickinsm/python_source/trunk/Lib/unittest.py", line 461, in __call__ self.run() return self.run(*args, **kwds) File "/Users/dickinsm/python_source/trunk/Lib/unittest.py", line 457, in run File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/process.py", line 88, in run self._target(*self._args, **self._kwargs) File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/pool.py", line 57, in worker test(result) File "/Users/dickinsm/python_source/trunk/Lib/unittest.py", line 300, in __call__ return self.run(*args, **kwds) File "/Users/dickinsm/python_source/trunk/Lib/unittest.py", line 279, in run task = get() testMethod() File "Lib/test/test_multiprocessing.py", line 701, in test_notify_all File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/queues.py", line 337, in get sleeping.acquire() File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/managers.py", line 946, in acquire racquire() 
KeyboardInterrupt return self._callmethod('acquire', (blocking,)) File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/managers.py", line 722, in _callmethod kind, result = conn.recv() KeyboardInterrupt Process Process-49: Traceback (most recent call last): File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/process.py", line 232, in _bootstrap self.run() File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/process.py", line 88, in run self._target(*self._args, **self._kwargs) File "Lib/test/test_multiprocessing.py", line 602, in f cond.wait(timeout) File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/managers.py", line 959, in wait return self._callmethod('wait', (timeout,)) File "/Users/dickinsm/python_source/trunk/Lib/multiprocessing/managers.py", line 722, in _callmethod kind, result = conn.recv() KeyboardInterrupt |
|||
msg69942 - (view) | Author: Mark Dickinson (mark.dickinson) * | Date: 2008-07-18 06:40 | |
I should add to the previous message that this was revision 65090, and that it was a non-debug build. |
|||
msg69956 - (view) | Author: Mark Dickinson (mark.dickinson) * | Date: 2008-07-18 15:08 | |
It looks like this isn't just me. See the buildbot output at:

http://www.python.org/dev/buildbot/all/x86%20osx.5%20trunk/builds/33/step-test/0

which shows:

test_multiprocessing
Assertion failed: (bp != NULL), function PyObject_Malloc, file Objects/obmalloc.c, line 746.
test test_multiprocessing failed -- errors occurred; run in verbose mode for details |
|||
msg70023 - (view) | Author: Jesse Noller (jnoller) * | Date: 2008-07-19 13:24 | |
Ok, so for the moment, let's set aside the connection refused messages: that may be a case of not cleaning up a socket correctly (which is still bad, but not memory corruption).

Of note from the buildbot failure:

Assertion failed: (bp != NULL), function PyObject_Malloc, file Objects/obmalloc.c, line 746.
test test_multiprocessing failed -- errors occurred; run in verbose mode for details

I don't know enough about obmalloc.c to say whether this is caused by it not being thread-safe.

Here's another failure (from my own buildbot to boot):

test_multiprocessing
/Users/buildbot/buildarea/trunk.noller-osx86/build/Lib/multiprocessing/__init__.py:82: ImportWarning: Not importing directory '/Users/buildbot/buildarea/trunk.noller-osx86/build/Modules/_multiprocessing': missing __init__.py
  import _multiprocessing
Fatal Python error: Objects/tupleobject.c:169 object at 0x539d538 has negative ref count -606348326
make: *** [buildbottest] Abort trap
program finished with exit code 2 |
|||
msg70551 - (view) | Author: Mark Dickinson (mark.dickinson) * | Date: 2008-08-01 13:50 | |
I finally found some more time to look at this. I cut down the test-suite to try to find a minimal failing example. I can fairly reliably make a debug build of the trunk crash using the following nine lines:

    import multiprocessing.managers
    def sqr(x): return x*x
    manager = multiprocessing.managers.SyncManager()
    manager.start()
    pool = manager.Pool(4)
    it = pool.imap_unordered(sqr, range(10000))
    assert sorted(it) == map(sqr, range(10000))
    pool.terminate()
    manager.shutdown()

Typical output is:

    Fatal Python error: UNREF invalid object

(followed by traceback) or:

    Assertion failed: (bp != NULL), function PyObject_Malloc, file Objects/obmalloc.c, line 755.

or:

    Debug memory block at address p=0x247778:
        26 bytes originally requested
        The 4 pad bytes at p-4 are not all FORBIDDENBYTE (0xfb):
            at p-4: 0xdb *** OUCH
            at p-3: 0xdb *** OUCH
            at p-2: 0xdb *** OUCH
            at p-1: 0xdb *** OUCH
        Because memory is corrupted at the start, the count of bytes requested
           may be bogus, and checking the trailing pad bytes may segfault.
        The 4 pad bytes at tail=0x247792 are not all FORBIDDENBYTE (0xfb):
            at tail+0: 0x35 *** OUCH
            at tail+1: 0x00 *** OUCH
            at tail+2: 0xfb
            at tail+3: 0xfb
        The block was made by call #4227530756 to debug malloc/realloc.
        Data at p: 00 00 00 00 00 00 00 00 ... 00 00 08 00 00 00 b0 72
    Fatal Python error: bad leading pad byte |
|||
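The "bad leading pad byte" report above comes from the debug allocator's guard bytes. The sketch below is a simplified Python simulation of that idea, not CPython's actual code: the helper names are hypothetical, the size and serial-number fields of the real layout are omitted, and the values of CLEANBYTE and DEADBYTE are quoted from memory of the obmalloc.c of that era (only FORBIDDENBYTE, 0xfb, is confirmed by the output above). Under those assumptions, 0xdb in the pad positions is what a block looks like after it has been freed, which would be consistent with one thread still using memory that another has released.

```python
FORBIDDENBYTE = 0xFB  # guard bytes on both sides of a live block (from the output above)
CLEANBYTE = 0xCB      # assumed fill pattern for freshly allocated data
DEADBYTE = 0xDB       # assumed fill pattern written over a block when it is freed
PAD = 4               # pad width reported by the 32-bit build above


def debug_alloc(nbytes):
    """Return a bytearray laid out as [pad][data][pad]."""
    pad = bytearray([FORBIDDENBYTE] * PAD)
    return pad + bytearray([CLEANBYTE] * nbytes) + pad


def debug_free(block):
    """Overwrite the whole block, pads included, with DEADBYTE."""
    for i in range(len(block)):
        block[i] = DEADBYTE


def check_block(block):
    """Return (region, offset, value) tuples for every damaged pad byte."""
    problems = []
    for i in range(PAD):
        if block[i] != FORBIDDENBYTE:
            problems.append(("leading", i - PAD, block[i]))
    tail = len(block) - PAD
    for i in range(PAD):
        if block[tail + i] != FORBIDDENBYTE:
            problems.append(("trailing", i, block[tail + i]))
    return problems


block = debug_alloc(26)
print(check_block(block))  # a live, undamaged block reports no problems
debug_free(block)
for region, offset, value in check_block(block):
    print("%s pad at offset %+d: 0x%02x" % (region, offset, value))
```

In the quoted output the trailing pad is only partly damaged (0x35, 0x00, then two intact 0xfb bytes), which this simulation does not reproduce; it would fit the freed memory having been partly reused by another allocation.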
msg70554 - (view) | Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * | Date: 2008-08-01 14:20 | |
> Assertion failed: (bp != NULL), function PyObject_Malloc, file
> Objects/obmalloc.c, line 755.

This one gives one probable cause of the problem:

- in Modules/_multiprocessing/connection.h, connection_send_obj() releases the GIL around a call to conn_send_string().
- in Modules/_multiprocessing/socket_connection.c, conn_send_string() uses PyMem_Malloc()

This is wrong (the GIL must be held when using the PyMem_* and PyObject_* functions), and is probably the cause of the failed assertion. |
|||
msg70556 - (view) | Author: Mark Dickinson (mark.dickinson) * | Date: 2008-08-01 14:31 | |
> This is wrong (the GIL must be held when using the PyMem_* and
> PyObject_* functions), and is probably the cause of the failed assertion.

This sounds quite likely. I just managed (using the low-tech method of setting a static variable on entry and clearing it on exit) to confirm that PyObject_Malloc in obmalloc.c is being accessed simultaneously by multiple threads when test_multiprocessing is run. |
|||
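The low-tech check described above (set a flag on entry, clear it on exit, and complain if the flag is already set on entry) was done in C inside obmalloc.c. As a rough illustration of the same idea, here is a Python sketch; the class `OverlapDetector` and the function `demo` are hypothetical names, and events are used only to force the two threads to overlap deterministically.

```python
import threading


class OverlapDetector:
    """Record whether two threads were ever inside the guarded region at once."""

    def __init__(self):
        self._inside = False       # the "static variable": set on entry, cleared on exit
        self.overlap_seen = False

    def __enter__(self):
        # Deliberately unsynchronized check-then-set, like the C flag it imitates.
        if self._inside:
            self.overlap_seen = True
        self._inside = True
        return self

    def __exit__(self, *exc_info):
        self._inside = False
        return False


def demo():
    """Force two threads into the guarded region together; return the verdict."""
    detector = OverlapDetector()
    first_inside = threading.Event()
    second_done = threading.Event()

    def first():
        with detector:
            first_inside.set()
            # Stay inside until the other thread has entered and left.
            second_done.wait(timeout=5)

    def second():
        first_inside.wait(timeout=5)
        with detector:
            pass
        second_done.set()

    threads = [threading.Thread(target=first), threading.Thread(target=second)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return detector.overlap_seen
```

A detector like this can miss overlaps, because the flag itself is not protected, but a single hit is enough to show that the region is being entered concurrently.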
msg70560 - (view) | Author: Mark Dickinson (mark.dickinson) * | Date: 2008-08-01 15:37 | |
Here's a patch that fixes the problem for me. It releases the GIL around the calls to _conn_sendall within conn_send_string, instead of releasing the GIL for the whole call to conn_send_string. |
|||
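The shape of the fix, holding the lock while allocating and dropping it only around the blocking send, can be sketched in Python. This is an analogy rather than the patch itself: `interpreter_lock` is a hypothetical stand-in for the GIL, building the message stands in for the PyMem_Malloc() call, and the 4-byte length header is an assumption made for illustration.

```python
import socket
import struct
import threading

interpreter_lock = threading.Lock()  # hypothetical stand-in for the GIL


def send_string(sock, payload):
    """Frame payload with a length header and send it.

    The caller must hold interpreter_lock on entry; it is held again on return.
    """
    # Buffer construction (the analogue of PyMem_Malloc) happens under the lock.
    message = struct.pack("!I", len(payload)) + payload
    interpreter_lock.release()
    try:
        # Only the blocking I/O runs without the lock.
        sock.sendall(message)
    finally:
        interpreter_lock.acquire()


def recv_exact(sock, n):
    """Read exactly n bytes from sock, or raise EOFError if it closes early."""
    chunks = []
    while n > 0:
        chunk = sock.recv(n)
        if not chunk:
            raise EOFError("socket closed early")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def roundtrip(payload):
    """Send payload over a local socket pair and read it back."""
    a, b = socket.socketpair()
    try:
        with interpreter_lock:
            send_string(a, payload)
        (length,) = struct.unpack("!I", recv_exact(b, 4))
        return recv_exact(b, length)
    finally:
        a.close()
        b.close()
```

The buggy arrangement corresponds to releasing the lock before calling `send_string`, so that the buffer is built while other threads may be allocating too.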
msg70564 - (view) | Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * | Date: 2008-08-01 15:57 | |
To be complete, the patch should also deal with conn_recv_string() which has the same problem. And please do not forget the win32 implementation, in pipe_connection.c. |
|||
msg70569 - (view) | Author: Mark Dickinson (mark.dickinson) * | Date: 2008-08-01 16:33 | |
Thanks, Amaury! How's this? I have no access to a Windows machine, so this patch is untested on Windows. |
|||
msg70577 - (view) | Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * | Date: 2008-08-01 17:54 | |
Mark,

There are 3 semicolons missing in your patch, in pipe_connection.c, just after the calls to WriteFile and ReadFile. After this, it compiles and runs correctly. Tests pass.

Note that on Windows, your "nine lines" cannot work as is, because the processes are spawned, not forked: the sqr() function is not copied. And if you save the lines in a script file, it will be imported by every subprocess, and every subprocess will start its own manager... and memory explodes.

    import multiprocessing.managers
    def sqr(x): return x*x
    if __name__ == '__main__':
        manager = multiprocessing.managers.SyncManager()
        manager.start()
        pool = manager.Pool(4)
        it = pool.imap_unordered(sqr, range(1000))
        assert sorted(it) == [sqr(x) for x in range(1000)]
        pool.terminate()
        manager.shutdown() |
|||
msg70580 - (view) | Author: Jesse Noller (jnoller) * | Date: 2008-08-01 17:58 | |
I added the semicolons, Amaury, and have the patch teed up in my local repo for submission. Can you review this diff just to confirm? |
|||
msg70589 - (view) | Author: Jesse Noller (jnoller) * | Date: 2008-08-01 19:50 | |
I've committed this as-is based off my last patch. I will watch the buildbots for failures. Mark/Amaury - if I see you guys at PyCon, I owe you a drink. |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:56:36 | admin | set | github: 47649 |
2008-08-01 19:50:50 | jnoller | set | status: open -> closed resolution: fixed messages: + msg70589 |
2008-08-01 17:58:52 | jnoller | set | files: + add_semicolons.diff messages: + msg70580 |
2008-08-01 17:54:11 | amaury.forgeotdarc | set | messages: + msg70577 |
2008-08-01 16:33:08 | mark.dickinson | set | files: + issue3399_2.patch messages: + msg70569 |
2008-08-01 15:57:02 | amaury.forgeotdarc | set | messages: + msg70564 |
2008-08-01 15:37:24 | mark.dickinson | set | files: + issue3399.patch keywords: + patch messages: + msg70560 |
2008-08-01 14:31:25 | mark.dickinson | set | messages: + msg70556 |
2008-08-01 14:20:53 | amaury.forgeotdarc | set | nosy: + amaury.forgeotdarc messages: + msg70554 |
2008-08-01 13:51:00 | mark.dickinson | set | messages: + msg70551 |
2008-07-19 13:25:00 | jnoller | set | messages: + msg70023 |
2008-07-18 15:08:51 | mark.dickinson | set | messages: + msg69956 |
2008-07-18 06:40:18 | mark.dickinson | set | messages: + msg69942 |
2008-07-18 06:37:47 | mark.dickinson | set | messages: + msg69941 |
2008-07-18 00:13:26 | jnoller | set | messages: + msg69930 |
2008-07-17 23:17:39 | mark.dickinson | set | messages: + msg69926 |
2008-07-17 23:13:54 | jnoller | set | messages: + msg69925 |
2008-07-17 23:13:37 | mark.dickinson | set | messages: + msg69924 |
2008-07-17 22:46:55 | mark.dickinson | set | messages: + msg69921 |
2008-07-17 22:22:25 | mark.dickinson | create |