Author nascheme
Recipients carljm, corona10, dino.viehland, eelizondo, gregory.p.smith, nascheme, pablogsal, pitrou, shihai1991, steve.dower, tim.peters, vstinner
Date 2020-04-15.18:24:23
Message-id <1586975064.62.0.506363193586.issue40255@roundup.psfhosted.org>
Content
Eddie mentions in the PR the idea of using memory arenas to contain immortal objects.  I think it could be worth investigating further.  With the current PR, immortal status depends on the value of the object's refcnt field.  Using immortal arenas might be faster because you can tell whether an object is immortal from its address alone (no data dependency on the refcnt value).

The fastest would be to create an immortal block as part of the BSS (uninitialized data).  Then, within incref/decref, you use the location and size of the immortal arena (compile-time constants) to test whether the object is immortal.  You could maybe just check whether the high bits of the object address match a constant (if the immortal region is a power of 2 in size and aligned to that size).  Slightly slower but more flexible would be to make the immortal arena size and location global variables.  That way, you can set the size of the region on startup.  It would also be more flexible in terms of ABI compatibility.  It would introduce one or two global loads within incref/decref, but those values would almost certainly be in L1 cache.
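
A rough sketch of the two address tests described above, to make the difference concrete.  Everything here is hypothetical: the names (immortal_region, is_immortal_fixed, is_immortal_dynamic) and the 4 KiB size are made up for illustration and are not taken from the PR or from obmalloc.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical size; it must be a power of two for the mask test to work. */
#define IMMORTAL_REGION_SIZE ((size_t)4096)

static_assert((IMMORTAL_REGION_SIZE & (IMMORTAL_REGION_SIZE - 1)) == 0,
              "immortal region size must be a power of two");

/* Zero-initialized static storage normally ends up in the BSS.  Aligning the
   block to its own size gives every address inside it the same high bits. */
static alignas(IMMORTAL_REGION_SIZE)
    unsigned char immortal_region[IMMORTAL_REGION_SIZE];

/* Variant 1: location and size are constants, so the test is a mask and a
   compare against an address that is fixed at link time. */
static inline int is_immortal_fixed(const void *p)
{
    const uintptr_t mask = ~(uintptr_t)(IMMORTAL_REGION_SIZE - 1);
    return ((uintptr_t)p & mask) == (uintptr_t)immortal_region;
}

/* Variant 2: location and size are globals set at startup.  This costs two
   loads but lets the region be sized at run time. */
static uintptr_t immortal_base;
static size_t immortal_size;

static inline int is_immortal_dynamic(const void *p)
{
    /* Unsigned wraparound folds "p >= base && p < base + size" into a
       single comparison. */
    return (uintptr_t)p - immortal_base < immortal_size;
}
```

Neither test reads anything from the object itself, which is the point of the arena idea.  Variant 2 reports nothing as immortal until immortal_base and immortal_size are set.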

Even more flexible would be to use a memory map to mark which arenas are immortal.  See my radix tree implementation for obmalloc:

https://bugs.python.org/issue37448

I would guess the radix tree lookups are too expensive to put in incref/decref.  Should probably test that though.
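
To give a feel for what such a lookup involves, here is a simplified two-level map keyed by arena number.  It is only an illustration and is not the radix tree from the linked issue: the names, the 48-bit address assumption, the 256 KiB arena size and the 15/15 bit split are all assumptions made for this sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define ARENA_BITS 18   /* assumed 256 KiB arenas */
#define ADDR_BITS  48   /* assumed usable address bits; higher bits ignored */
#define LEAF_BITS  15
#define TOP_BITS   (ADDR_BITS - ARENA_BITS - LEAF_BITS)
#define TOP_LEN    ((size_t)1 << TOP_BITS)
#define LEAF_LEN   ((size_t)1 << LEAF_BITS)

static_assert(ADDR_BITS > ARENA_BITS + LEAF_BITS, "bit split must fit");

/* Each leaf holds one bit per arena. */
typedef struct {
    unsigned char bits[LEAF_LEN / 8];
} map_leaf;

/* Top level: pointers to leaves, allocated on demand. */
static map_leaf *map_top[TOP_LEN];

/* Mark the arena containing arena_base as immortal.
   Returns 0 on success, -1 if a leaf could not be allocated. */
static int map_mark_immortal(const void *arena_base)
{
    uintptr_t idx = (uintptr_t)arena_base >> ARENA_BITS;
    size_t top = (size_t)(idx >> LEAF_BITS) & (TOP_LEN - 1);
    size_t leaf = (size_t)idx & (LEAF_LEN - 1);

    if (map_top[top] == NULL) {
        map_top[top] = calloc(1, sizeof(map_leaf));
        if (map_top[top] == NULL) {
            return -1;
        }
    }
    map_top[top]->bits[leaf >> 3] |= (unsigned char)(1u << (leaf & 7));
    return 0;
}

/* The check that incref/decref would need: two dependent loads
   (top-level pointer, then leaf byte) plus some shifting. */
static int map_is_immortal(const void *p)
{
    uintptr_t idx = (uintptr_t)p >> ARENA_BITS;
    size_t top = (size_t)(idx >> LEAF_BITS) & (TOP_LEN - 1);
    size_t leaf = (size_t)idx & (LEAF_LEN - 1);
    const map_leaf *l = map_top[top];

    return l != NULL && ((l->bits[leaf >> 3] >> (leaf & 7)) & 1);
}
```

Compared with a single range check, every lookup here chases a pointer before it can read the bit, which is the extra cost that would need measuring.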

I had started doing an experiment with the arena approach before I noticed Eddie's comment about it.  I would like to see his version.  Here is a sketch of mine (not working yet):

- change new_arena() to initially allocate from an "immortal memory" region.  There are multiple ways to allocate that (BSS/uninitialized data, aligned_alloc(), etc).

- add a _PyMem_enable_immortal() call to switch obmalloc from using immortal arenas to regular ones.  Need to mess with some obmalloc data structures to switch arenas (usedpools, usable_arenas, unused_arena_objects).

- change incref/decref to check if immortal status has been enabled and if object address falls within immortal region.  If so, incref/decref don't do anything.
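
The three steps above could look roughly like this in miniature.  This is a toy model, not obmalloc: toy_object, toy_new(), toy_enable_immortal() and the fixed-size static region are all invented for the sketch, with toy_enable_immortal() standing in for the proposed _PyMem_enable_immortal().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Toy stand-in for PyObject: only the reference count matters here. */
typedef struct {
    intptr_t refcnt;
} toy_object;

#define TOY_IMMORTAL_SLOTS 64

/* Static storage playing the role of the immortal arenas. */
static toy_object toy_immortal_region[TOY_IMMORTAL_SLOTS];
static size_t toy_immortal_used;
static int toy_immortal_enabled;

static int toy_in_immortal_region(const toy_object *op)
{
    return (uintptr_t)op - (uintptr_t)toy_immortal_region
           < sizeof(toy_immortal_region);
}

/* Step 1: until the switch is flipped, allocate from the immortal region. */
static toy_object *toy_new(void)
{
    toy_object *op;

    if (!toy_immortal_enabled && toy_immortal_used < TOY_IMMORTAL_SLOTS) {
        op = &toy_immortal_region[toy_immortal_used++];
    }
    else {
        op = malloc(sizeof(*op));
        if (op == NULL) {
            return NULL;
        }
    }
    op->refcnt = 1;
    return op;
}

/* Step 2: one-way switch from immortal allocation to regular allocation. */
static void toy_enable_immortal(void)
{
    assert(!toy_immortal_enabled);
    toy_immortal_enabled = 1;
}

/* Step 3: once enabled, refcount operations skip objects in the region. */
static void toy_incref(toy_object *op)
{
    if (toy_immortal_enabled && toy_in_immortal_region(op)) {
        return;
    }
    op->refcnt++;
}

static void toy_decref(toy_object *op)
{
    if (toy_immortal_enabled && toy_in_immortal_region(op)) {
        return;
    }
    if (--op->refcnt == 0 && !toy_in_immortal_region(op)) {
        free(op);  /* region slots are never reused in this toy */
    }
}
```

Objects created before toy_enable_immortal() are counted normally until the switch, then their refcnt field is never written again.  Objects created afterwards come from malloc() and behave as usual.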

By default, the immortal arenas could be enabled as part of Python startup: call _PyMem_enable_immortal() after startup but before running user code.  There could be a command-line option to disable the automatic call to _PyMem_enable_immortal() so that users like Instagram can do their pre-fork initialization before calling it themselves.

Next step down the rabbit-hole could be to use something like Jeethu Rao's frozen module serializer and dump out all of the startup objects and put them into the BSS:

https://bugs.python.org/issue34690