Author lukasz.langa
Recipients lukasz.langa, nascheme, vstinner, yselivanov
Date 2017-09-23.00:21:23
Message-id <1506126089.65.0.135296831106.issue31558@psf.upfronthosting.co.za>
In-reply-to
Content
When you're forking many worker processes off of a parent process, the resulting children are initially very cheap in memory.  They share memory pages with the base process until a write happens [1]_.

Sadly, the garbage collector in Python touches every object's PyGC_Head during a collection, even if that object stays alive, undoing all the copy-on-write wins.  Instagram disabled the GC completely for this reason [2]_.  This fixed the COW issue but made the processes more vulnerable to memory growth, because developers changing application code can silently introduce new reference cycles.  While we could fix the most glaring cases, it was hard to keep memory usage in check.  We came up with a different solution that fixes both issues.  It requires a new API to be added to CPython's garbage collector.


gc.freeze()
-----------

As soon as possible in the lifecycle of the parent process we disable the garbage collector.  Then we call a new API called ``gc.freeze()`` to move all currently tracked objects to a permanent generation.  They won't be considered in further collections.  This is okay since we are assuming that (almost?) all of the objects created until that point are module-level and thus useful for the entire lifecycle of the child process.
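
To make the "permanent generation" behavior concrete, here is a minimal sketch.  It assumes a Python version where the proposed API is available (it shipped as ``gc.freeze()``, ``gc.unfreeze()`` and ``gc.get_freeze_count()`` in Python 3.7).  The ``Node`` class and the ``demo_permanent_generation`` helper are hypothetical names used only for illustration.

```python
import gc


class Node:
    """Hypothetical container type; its instances are tracked by the GC."""


def make_cycle():
    # Build a two-object reference cycle and drop every outside reference,
    # so only the cyclic collector could ever reclaim it.
    a, b = Node(), Node()
    a.other, b.other = b, a


def demo_permanent_generation():
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        gc.collect()                        # start from a clean state
        make_cycle()                        # unreachable cycle is now tracked
        gc.freeze()                         # move all tracked objects aside
        frozen = gc.get_freeze_count()      # size of the permanent generation
        found_while_frozen = gc.collect()   # frozen objects are not examined
        gc.unfreeze()                       # move them back to the oldest generation
        found_after_unfreeze = gc.collect() # now the cycle is found
        return frozen, found_while_frozen, found_after_unfreeze
    finally:
        if was_enabled:
            gc.enable()


if __name__ == "__main__":
    print(demo_permanent_generation())
```

While the objects are frozen, a full collection does not visit them at all, so it never writes to their GC headers.  The exact counts depend on the interpreter version and on what else is loaded, so none are shown here.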

After calling ``gc.freeze()`` we call ``fork()``.  Then, the child process is free to re-enable the garbage collector.
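
The whole sequence can be sketched as follows.  This is only an illustration under stated assumptions: it needs a POSIX platform with ``os.fork()`` and a Python version that provides ``gc.freeze()`` (3.7 or later).  The helper names ``freeze_parent`` and ``spawn_worker`` are hypothetical, as is the placeholder for application start-up code.

```python
import gc
import os


def freeze_parent():
    """Parent side: disable the collector early, freeze right before forking."""
    gc.disable()   # ideally the very first thing the parent process does
    # ... import modules and build module-level state here (placeholder) ...
    gc.freeze()    # move everything tracked so far to the permanent generation
    return gc.get_freeze_count()


def spawn_worker(work):
    """Fork one child; the child re-enables the GC and runs `work`."""
    pid = os.fork()
    if pid == 0:
        status = 1
        try:
            gc.enable()   # collections in the child skip the frozen objects
            work()
            status = 0
        finally:
            os._exit(status)   # never fall back into the parent's code path
    return pid


if __name__ == "__main__":
    freeze_parent()
    child = spawn_worker(lambda: None)
    os.waitpid(child, 0)
```

The parent keeps its collector disabled in this sketch; only the children turn it back on.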

Why do we need to disable the collector in the parent process as soon as possible?  When the GC cleans up memory in the meantime, it leaves free space in pages for new objects.  Those pages become shared after fork, and as soon as the child process starts creating its own objects, they will likely be written to the shared pages, initiating a lot of copy-on-write activity.

In other words, we're wasting a bit of memory in the shared pages to save a lot of memory later (that would otherwise be wasted on copying entire pages after forking).


Other attempts
--------------

We also tried moving the GC head to another place in memory.  This creates some indirection but cache locality on that segment is great so performance isn't really hurt.  However, this change introduces two new pointers (16 bytes) per object.  This doesn't sound like a lot but given millions of objects and tens of processes per box, this alone can cost hundreds of megabytes per host, which is exactly the memory that we wanted to save in the first place.  So that idea was scrapped.
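
A back-of-the-envelope version of that estimate is sketched below.  The object and process counts are illustrative assumptions, not measurements from the original deployment, and ``extra_bytes`` is a hypothetical helper.  With one million tracked objects per process and ten processes, two extra 8-byte pointers per object already add up to about 160 MB.

```python
POINTER_SIZE = 8  # bytes per pointer on a 64-bit build


def extra_bytes(objects_per_process, processes, pointers_per_object=2):
    """Total overhead of the extra per-object pointers across one host."""
    return objects_per_process * processes * pointers_per_object * POINTER_SIZE


if __name__ == "__main__":
    # Illustrative figures only: 1,000,000 objects in each of 10 processes.
    cost = extra_bytes(objects_per_process=1_000_000, processes=10)
    print(f"{cost / 1e6:.0f} MB")
```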


Attribution
-----------

The original patch is by Zekun Li, with help from Jiahao Li, Matt Page, David Callahan, Carl S. Shapiro, and Chenyang Wu.


.. [1] https://en.wikipedia.org/wiki/Copy-on-write
.. [2] https://engineering.instagram.com/dismissing-python-garbage-collection-at-instagram-4dca40b29172