Author Olivier.Grisel
Recipients Olivier.Grisel, mrjbq7, neologix, pitrou, sbt
Date 2013-08-19.17:12:33
Message-id <1376932354.42.0.166286509501.issue17560@psf.upfronthosting.co.za>
Content
I have implemented a custom subclass of the multiprocessing Pool in joblib that makes it possible to plug in a custom pickling strategy for this specific use case:

https://github.com/joblib/joblib/blob/master/joblib/pool.py#L327

In particular it can:

- detect mmap-backed numpy arrays
- transform large in-memory numpy arrays into numpy.memmap instances prior to pickling, using the /dev/shm partition when available or TMPDIR otherwise.
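
To make the two points above concrete without depending on numpy, here is a minimal standard-library sketch of the same idea. It is not the joblib code: plain bytes stand in for arrays, and `FileBacked`, `SwappingPickler`, `SwappingUnpickler`, `pick_folder`, `swap_dumps`, `swap_loads` and the `MAX_INLINE_BYTES` threshold are all hypothetical names chosen for illustration. It relies on pickle's persistent-ID hook so that only a file path travels in the pickle stream.

```python
import io
import mmap
import os
import pickle
import tempfile
import uuid

# Hypothetical threshold: buffers at or below this size are pickled by value.
MAX_INLINE_BYTES = 1024


def pick_folder():
    """Prefer the /dev/shm partition when usable, else the default temp dir."""
    shm = "/dev/shm"
    if os.path.isdir(shm) and os.access(shm, os.W_OK):
        return shm
    return tempfile.gettempdir()  # honours TMPDIR when it is set


class FileBacked:
    """Hypothetical wrapper: a read-only memory map plus the file behind it."""

    def __init__(self, filename):
        self.filename = filename
        with open(filename, "rb") as f:
            self.data = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)


class SwappingPickler(pickle.Pickler):
    def persistent_id(self, obj):
        if isinstance(obj, FileBacked):
            # Already file-backed: send only the path.
            return ("file", obj.filename)
        if isinstance(obj, (bytes, bytearray)) and len(obj) > MAX_INLINE_BYTES:
            # Large in-memory buffer: dump it to a file, then send the path.
            name = "payload-%s.bin" % uuid.uuid4().hex
            path = os.path.join(pick_folder(), name)
            with open(path, "wb") as f:
                f.write(obj)
            return ("file", path)
        return None  # everything else is pickled as usual


class SwappingUnpickler(pickle.Unpickler):
    def persistent_load(self, pid):
        kind, path = pid
        if kind != "file":
            raise pickle.UnpicklingError("unknown persistent id: %r" % (pid,))
        return FileBacked(path)


def swap_dumps(obj):
    buf = io.BytesIO()
    SwappingPickler(buf, protocol=pickle.HIGHEST_PROTOCOL).dump(obj)
    return buf.getvalue()


def swap_loads(blob):
    return SwappingUnpickler(io.BytesIO(blob)).load()
```

In this sketch the receiving side gets a memory map of the dumped file instead of a copy of the data, and nothing cleans the dumped files up; a real implementation would need to manage their lifetime.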

Here is some doc: https://github.com/joblib/joblib/blob/master/doc/parallel_numpy.rst

I could submit the part that makes it possible to customize the picklers of a multiprocessing.pool.Pool instance to the standard library if people are interested.

The numpy-specific stuff would stay in third-party projects such as joblib, but at least that would make it easier for people to plug in their own optimizations without having to override half of the multiprocessing class hierarchy.
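
As a rough illustration of what "pluggable picklers" could look like, the sketch below uses only the documented per-instance `dispatch_table` attribute of `pickle.Pickler`. It is not the proposed Pool API: `PluggablePickler`, `dumps_with` and `reduce_memoryview` are hypothetical names, and the memoryview reducer is just a small stand-in for a real optimization.

```python
import copyreg
import io
import pickle


def reduce_memoryview(view):
    """Hypothetical reducer: rebuild the view as plain bytes on the other side."""
    return (bytes, (view.tobytes(),))


class PluggablePickler(pickle.Pickler):
    """Pickler whose per-type reducers are supplied by the caller."""

    def __init__(self, file, reducers=None, protocol=pickle.HIGHEST_PROTOCOL):
        super().__init__(file, protocol)
        # Start from the global table so existing registrations still apply,
        # then layer the caller's reducers on top for this instance only.
        self.dispatch_table = copyreg.dispatch_table.copy()
        self.dispatch_table.update(reducers or {})


def dumps_with(obj, reducers):
    buf = io.BytesIO()
    PluggablePickler(buf, reducers).dump(obj)
    return buf.getvalue()


# memoryview objects cannot be pickled by default; with the reducer plugged in
# they travel as bytes, and a stock unpickler is enough to read them back.
blob = dumps_with({"v": memoryview(b"hello world")},
                  {memoryview: reduce_memoryview})
restored = pickle.loads(blob)
```

The reducers live on the pickler instance rather than in a global registry, so two pools could use different strategies side by side.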