Created on 2009-08-05 20:23 by jnoller, last changed 2013-03-31 00:12 by sbt.
|test.zip||jnoller, 2009-08-05 20:23|
|msg91332 - (view)||Author: Jesse Noller (jnoller) *||Date: 2009-08-05 20:23|
I have example code to show this. It creates a system-wide memory leak on Linux/Unix (present until the next reboot), unless the last statement in the target of mp.Process ensures a manual clean up of the globals.

The problem is line 353 in multiprocessing/forking.py. The function exit() is defined as os._exit on Linux and ExitProcess on Windows, neither of which allows normal clean up.

    >>> help(os._exit)
    Help on built-in function _exit in module nt:

    _exit(...)
        _exit(status)

        Exit to the system with specified status, without normal exit processing.

The problem is fixed if line 353 in forking.py is changed from

    exit(exitcode)

to

    sys.exit(exitcode)

Test run without bugfix:

    G:\DEVELO~1\SHARED~2>python test.py
    open handle to 569f439b24e24fc8a547b81932616066
    [[ 0. 0. 0. 0.]
     [ 0. 0. 0. 0.]]
    open handle to 0582d4b161c546f582c1c96e7bd0c39d
    open handle to 569f439b24e24fc8a547b81932616066
    modified array
    closed handle to 569f439b24e24fc8a547b81932616066
    [[ 1. 1. 1. 0.]
     [ 1. 1. 1. 0.]]
    closed handle to 569f439b24e24fc8a547b81932616066

You can see here that the opening and closing of handles are unmatched. This is on Windows, where the kernel ensures the clean-up, so it may not matter. But on Unix this would have created a permanent (system-wide) memory leak! What is happening here is that globals are not cleaned up, because os._exit is used instead of sys.exit.

Test run with bugfix:

    G:\DEVELO~1\SHARED~2>python test.py
    open handle to 930778d27b414253bc329f2b70adaa05
    [[ 0. 0. 0. 0.]
     [ 0. 0. 0. 0.]]
    open handle to 3f6cebf8c5de413685bb770d02ae9666
    open handle to 930778d27b414253bc329f2b70adaa05
    modified array
    closed handle to 930778d27b414253bc329f2b70adaa05
    closed handle to 3f6cebf8c5de413685bb770d02ae9666
    [[ 1. 1. 1. 0.]
     [ 1. 1. 1. 0.]]
    closed handle to 930778d27b414253bc329f2b70adaa05

Now all allocations and deallocations are matched.

Regards,
Sturla Molden
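The difference between the two exit paths described above can be reproduced without the attached test.zip. The following is a minimal sketch, not the reporter's test case: it launches a throwaway interpreter with subprocess, registers an atexit handler in it, and then leaves through either os._exit or sys.exit. The names CHILD_TEMPLATE and run_child are hypothetical and exist only for this illustration.

```python
import subprocess
import sys

# Script run in a separate interpreter. The atexit handler stands in for
# the clean-up code (e.g. an extension type's deallocator) discussed above.
CHILD_TEMPLATE = """
import atexit, os, sys
atexit.register(lambda: print("cleanup ran"))
{exit_call}
"""


def run_child(exit_call):
    """Run the child script with the given exit statement; return its stdout."""
    code = CHILD_TEMPLATE.format(exit_call=exit_call)
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # os._exit skips normal exit processing, so the handler never runs.
    print(repr(run_child("os._exit(0)")))
    # sys.exit goes through interpreter shutdown, so the handler runs.
    print(repr(run_child("sys.exit(0)")))
```

An atexit handler is only a stand-in here; whether a particular global object's deallocator runs under sys.exit is a separate question.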
|msg91333 - (view)||Author: Jesse Noller (jnoller) *||Date: 2009-08-05 20:25|
Additional comments from Sturla:

Hello Jesse,

Yes there is a bug in multiprocessing.

Diagnosis:

- Processes created by multiprocessing (mp.Process or mp.Pool) exit in a way that prevents the Python interpreter from running deallocation code for all extension objects (only the locals are cleaned up). Resources allocated by extension objects referenced in globals may leak permanently.

Sub-processes seem to commit an ungraceful suicide on exit. If the kernel cleans up after a non-graceful exit this is ok. But if the kernel does not, as in the case of System V IPC objects, we have a permanent resource leak. This is very similar to the reason why manually killing threads is prohibited in Python.

I have example code to show this. It creates a system-wide memory leak on Linux/Unix (present until the next reboot), unless the last statement in the target of mp.Process ensures a manual clean up of the globals.
|msg91334 - (view)||Author: Jesse Noller (jnoller) *||Date: 2009-08-05 20:39|
> Calling os.exit in a child process may be dangerous. It can cause
> unflushed buffers to be flushed twice: once in the parent and once in
> the child.

I assume you mean sys.exit. If this is the case, multiprocessing needs a mechanism to choose between os._exit and sys.exit for child processes. Calling os._exit might also be dangerous because it could prevent necessary clean-up code from executing (e.g. in C extensions).

I had a case where shared memory on Linux (System V IPC) leaked due to os._exit. The deallocator for my extension type never got to execute in child processes. The deallocator was needed to release the shared segment when its reference count dropped to 0. Changing to sys.exit solved the problem. On Windows there was no leak, because the kernel did the reference counting.
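The double flush mentioned in the quoted comment can be shown with a small Unix-only sketch. It assumes os.fork is available and that the child's stdout is a pipe, so the text stays in the buffer until exit. FORKING_TEMPLATE and run_forking_child are hypothetical names used only for this illustration.

```python
import os
import subprocess
import sys

# Script run in a separate interpreter. It writes to a block-buffered
# stdout (a pipe), forks, and the forked child exits as requested.
FORKING_TEMPLATE = """
import os, sys
sys.stdout.write("buffered ")   # stays in the buffer; no newline, no flush
pid = os.fork()
if pid == 0:
    {exit_call}
os.waitpid(pid, 0)
"""


def run_forking_child(exit_call):
    """Run the forking script; return everything that reached its stdout."""
    code = FORKING_TEMPLATE.format(exit_call=exit_call)
    # Drop PYTHONUNBUFFERED so stdout really is buffered in the script.
    env = {k: v for k, v in os.environ.items() if k != "PYTHONUNBUFFERED"}
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        check=True,
        env=env,
    )
    return result.stdout


if __name__ == "__main__":
    # The forked child skips exit processing: the text appears once.
    print(repr(run_forking_child("os._exit(0)")))
    # The forked child flushes its inherited copy of the buffer as well.
    print(repr(run_forking_child("sys.exit(0)")))
```

The forked child inherits a copy of the parent's unflushed buffer, which is why a normal interpreter shutdown in the child writes the same bytes a second time.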
|msg91335 - (view)||Author: Jesse Noller (jnoller) *||Date: 2009-08-05 20:48|
> In the future please use the bug tracker to file and track bugs with,
> so things are not as lossy.

Ok, sorry :)

Also see Piet's comment here. He has a valid case against sys.exit in some cases. Thus it appears that both ways of shutting down child processes might be dangerous: If we don't want buffers to flush we have to use os._exit. If we want clean-up code to execute we have to use sys.exit. If we want both we are screwed. :(
|msg185597 - (view)||Author: Antoine Pitrou (pitrou) *||Date: 2013-03-30 22:49|
Richard, do you think this is an actual concern?
|msg185602 - (view)||Author: Richard Oudkerk (sbt) *||Date: 2013-03-31 00:12|
I don't think this is a bug -- processes started with fork() should nearly always be exited with _exit(). And anyway, using sys.exit() does *not* guarantee that all deallocators will be called. To be sure of cleanup at exit you could use (the undocumented) multiprocessing.util.Finalize().

Note that Python 3.4 on Unix will probably offer the choice of using os.fork()/os._exit() or _posixsubprocess.fork_exec()/sys.exit() for starting/exiting processes on Unix.

Sturla's scheme for doing reference counting of shared memory is also flawed because reference counts can fall to zero while a shared memory object is in a pipe/queue, causing the memory to be prematurely deallocated.

I think a more reliable scheme would be to use fds created using shm_open(), immediately unlinking the name with shm_unlink(). Then one could use the existing infrastructure for fd passing and let the operating system handle the reference counting. This would prevent leaked shared memory (unless the process is killed in between shm_open() and shm_unlink()). I would like to add something like this to multiprocessing.
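A rough sketch of the Finalize() suggestion follows. Since multiprocessing.util.Finalize is undocumented, the call shown (Finalize(obj, callback, args=..., exitpriority=...)) reflects how it behaves in current CPython and should be treated as an assumption rather than a stable API. The sketch also forces the "fork" start method, so it is Unix-only; _worker and run_worker_with_finalizer are hypothetical names.

```python
import multiprocessing as mp
import os
from multiprocessing.util import Finalize


def _worker(write_fd):
    # Register clean-up that multiprocessing runs when this child shuts
    # down, before it leaves via os._exit. With obj=None an exitpriority
    # is required; the os.write call stands in for releasing a resource.
    Finalize(None, os.write, args=(write_fd, b"released"), exitpriority=0)
    # ... the real work of the child would go here ...


def run_worker_with_finalizer():
    """Start a forked child and report (exitcode, bytes its finalizer wrote)."""
    ctx = mp.get_context("fork")  # assumption: a Unix platform with fork
    read_fd, write_fd = os.pipe()
    proc = ctx.Process(target=_worker, args=(write_fd,))
    proc.start()
    os.close(write_fd)  # parent keeps only the read end
    proc.join()
    data = os.read(read_fd, 64)  # b"" if the finalizer never ran
    os.close(read_fd)
    return proc.exitcode, data


if __name__ == "__main__":
    print(run_worker_with_finalizer())
```

The finalizer runs as part of multiprocessing's own shutdown of the child, so it does not depend on interpreter finalization or on any deallocator being called.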
|2013-03-31 00:12:37||sbt||set||messages: + msg185602|
+ pitrou, sbt|
messages: + msg185597
versions: + Python 3.3, Python 3.4, - Python 3.1, Python 3.2
jnoller, asksol, schlesin|
versions: + Python 3.1, Python 2.7, Python 3.2
type: resource usage
components: + Library (Lib)
stage: needs patch
|2009-08-05 20:48:57||jnoller||set||messages: + msg91335|
|2009-08-05 20:39:52||jnoller||set||messages: + msg91334|
|2009-08-05 20:27:19||jnoller||set||assignee: jnoller|
|2009-08-05 20:25:01||jnoller||set||messages: + msg91333|