Message215258
It is currently impossible to create multiprocessing shared arrays larger than 50% of memory size under linux (and I assume other unices). A simple test case would be the following:
from multiprocessing.sharedctypes import RawArray
import ctypes
foo = RawArray(ctypes.c_double, 10*1024**3//8) # Allocate 10GB array
If the array is larger than 50% of the total memory size, the process gets SIGKILL'ed by the OS. Deactivating swap makes the effect easier to reproduce.
Naturally this requires that the tmpfs maximum size is large enough, which is the case here: 15GB max with 16GB of RAM.
I have tracked the problem down to multiprocessing/heap.py. The guilty line is: f.write(b'\0'*size). For very large sizes this creates a large intermediate bytes object (10GB in my test case), and at least as much memory is allocated again for the new shared array as the pages are written, pushing total memory consumption over the limit.
To solve the problem, I have split the zeroing of the shared array into blocks of 1MB. I can now allocate arrays as large as the tmpfs maximum size, and it also runs a bit faster. On a test case of a 6GB RawArray, Python 3.4.0 takes a total time of 3.930s, whereas it goes down to 3.061s with the attached patch.
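To illustrate the idea, here is a minimal standalone sketch of block-wise zeroing. The zero_fill helper and the 1MB block size are illustrative (this is not the actual attached patch): instead of building one size-byte intermediate object with b'\0'*size, it reuses a single small block, so peak memory stays around the block size regardless of the target size.

```python
import tempfile

BLOCKSIZE = 1024 * 1024  # illustrative 1MB block, mirroring the patch idea

def zero_fill(f, size, blocksize=BLOCKSIZE):
    """Write `size` zero bytes to file object `f` in `blocksize` chunks,
    avoiding a single `size`-byte intermediate bytes object."""
    block = b'\0' * blocksize  # reused for every full-sized chunk
    remaining = size
    while remaining >= blocksize:
        f.write(block)
        remaining -= blocksize
    if remaining:
        f.write(b'\0' * remaining)  # final partial chunk, if any

# Small demo: zero-fill 10MB of a temporary file.
with tempfile.TemporaryFile() as f:
    zero_fill(f, 10 * 1024 * 1024)
    assert f.tell() == 10 * 1024 * 1024
```

With this pattern, the only large allocation left is the shared memory mapping itself, which is why arrays close to the tmpfs limit become possible.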
History:

Date                | User     | Action | Args
2014-03-31 19:37:48 | mboquien | set    | recipients: + mboquien
2014-03-31 19:37:47 | mboquien | set    | messageid: <1396294667.97.0.287941619482.issue21116@psf.upfronthosting.co.za>
2014-03-31 19:37:47 | mboquien | link   | issue21116 messages
2014-03-31 19:37:47 | mboquien | create |