
Author: mboquien
Recipients: mboquien
Date: 2014-03-31.19:37:46
SpamBayes Score: -1.0
Marked as misclassified: Yes
Message-id: <1396294667.97.0.287941619482.issue21116@psf.upfronthosting.co.za>
In-reply-to:
Content:
It is currently impossible to create multiprocessing shared arrays larger than 50% of the total memory size under Linux (and, I assume, other Unices). A simple test case is the following:

from multiprocessing.sharedctypes import RawArray
import ctypes

foo = RawArray(ctypes.c_double, 10*1024**3//8)  # Allocate a 10 GB array of doubles (8 bytes each)

If the array is larger than 50% of the total memory size, the process gets SIGKILLed by the OS. Deactivate swap to reproduce the problem more reliably.

Naturally, this requires that the tmpfs maximum size is large enough, which is the case here: a 15 GB maximum with 16 GB of RAM.

I have tracked the problem down to multiprocessing/heap.py. The guilty line is f.write(b'\0'*size). For very large sizes, this creates a large intermediate bytes object (10 GB in my test case) while just as much memory is being allocated to the new shared array itself, pushing peak memory consumption over the limit.
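For context, the surrounding allocation pattern looks roughly like the following (a simplified paraphrase of the Unix code path in multiprocessing/heap.py, not the exact code; the allocate_arena name and the /dev/shm directory are illustrative):

import mmap
import os
import tempfile

def allocate_arena(size):
    # Simplified paraphrase of the Unix code path in multiprocessing/heap.py;
    # cleanup and error handling are omitted, and /dev/shm stands in for the
    # tmpfs-backed temporary directory actually used.
    fd, name = tempfile.mkstemp(dir='/dev/shm')
    os.unlink(name)
    with open(fd, 'wb', closefd=False) as f:
        # b'\0'*size builds a size-byte bytes object in the process heap while
        # the written data fills another size bytes of tmpfs pages, so peak
        # memory consumption is roughly 2*size.
        f.write(b'\0'*size)
    return mmap.mmap(fd, size)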

To solve the problem, I have split the zeroing of the shared array into 1 MB blocks. I can now allocate arrays as large as the tmpfs maximum size. It also runs a bit faster: on a test case with a 6 GB RawArray, 3.4.0 takes a total of 3.930 s, whereas the attached patch brings it down to 3.061 s.
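For reference, a minimal sketch of the block-wise zeroing described above (not the attached patch itself; the zero_fill name and the 1 MB block size are illustrative):

def zero_fill(f, size, blocksize=1024*1024):
    # Write size zero bytes to the file object f in fixed-size blocks, so that
    # at most blocksize bytes are ever held in an intermediate buffer instead
    # of materializing all size bytes at once.
    block = b'\0'*blocksize
    remaining = size
    while remaining >= blocksize:
        f.write(block)
        remaining -= blocksize
    if remaining:
        f.write(b'\0'*remaining)

The single f.write(b'\0'*size) call would then be replaced by zero_fill(f, size).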
History
Date                 User      Action  Args
2014-03-31 19:37:48  mboquien  set     recipients: + mboquien
2014-03-31 19:37:47  mboquien  set     messageid: <1396294667.97.0.287941619482.issue21116@psf.upfronthosting.co.za>
2014-03-31 19:37:47  mboquien  link    issue21116 messages
2014-03-31 19:37:47  mboquien  create