classification
Title: obmalloc: radix tree for tracking arena address ranges
Type: enhancement Stage: patch review
Components: Interpreter Core Versions:
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: inada.naoki, nascheme, tim.peters
Priority: normal Keywords: patch

Created on 2019-06-29 23:08 by nascheme, last changed 2019-06-30 02:58 by xtreak.

Files
File name Uploaded Description Edit
pyperf_radix_compare.txt nascheme, 2019-06-29 23:12
obmalloc_overhead.py nascheme, 2019-06-29 23:12
Pull Requests
URL Status Linked Edit
PR 14474 open nascheme, 2019-06-29 23:13
Messages (1)
msg346903 - (view) Author: Neil Schemenauer (nascheme) * (Python committer) Date: 2019-06-29 23:08
This patch implements an alternative version of obmalloc's address_in_range().  It uses a radix tree to map the areas of memory covered by obmalloc arenas.  pymalloc_free() uses address_in_range() to determine if a block of memory is controlled by obmalloc or has been allocated by the system allocator.  The address_in_range() function must be fast.

The current version of address_in_range() uses a slightly memory unsanitary scheme.  I.e. it reads memory that has possibly not been initialized.  In theory that is not allowed and could cause a crash.  In practice, it has worked reliability for many years.  There is some ugly logic in obmalloc.c to disable sanity checking (e.g. ASN, TSAN, MSAN).  One advantage of this radix tree approach is that it doesn't have this unsanitary behavior.

Another small advantage of the radix tree approach is that it is independent from the OS page size.  With the current address_in_range() scheme, the size of obmalloc pools are limited to the size of the OS page size.  If larger, a segmentation fault could occur.  Bug #37211 (obmalloc: eliminate limit on pool size) allows for larger pools but at the cost of some additional code complexity and some additional memory overhead.

This patch has been discussed quite a bit on the python-dev list.  Thread subjects are:
    obmalloc (was Have a big machine and spare time? Here's a possible Python bug.)
    radix tree arena map for obmalloc

That discussion focuses quite a bit on the value of increasing the obmalloc arena and pool sizes.  This proposed patch keeps the sizes the same.  We can evaluate changing the sizes as a different issue.

I have run the pyperformance benchmark suite to compare performance of the radix tree with the status quo.  I think there is no significant difference for the programs that are part of the suite.  The pyperformance comparision report is attached.

If we do decide to go with larger pool and arena sizes, the radix tree approach actually uses less memory than the "big pools PR" (Bug #37211).  Attached is Tim's program to compare memory overheads of the different approaches.

I think it is worth considering using the radix tree by default on 64-bit platforms.  It seems to be not any slower than the status quo, does not have the memory unsanitary behavior and gives us the ability to easily increase pool sizes if we decide we want to.
History
Date User Action Args
2019-06-30 02:58:30xtreaksetnosy: + inada.naoki
2019-06-30 02:27:26tim.peterssetnosy: + tim.peters
2019-06-29 23:13:56naschemesetkeywords: + patch
pull_requests: + pull_request14291
2019-06-29 23:12:46naschemesetfiles: + obmalloc_overhead.py
2019-06-29 23:12:21naschemesetfiles: + pyperf_radix_compare.txt
2019-06-29 23:08:22naschemecreate