
Title: Allow split key dictionaries with values owned by other objects.
Components: Interpreter Core
Status: open
Assigned To: Mark.Shannon
Nosy List: Mark.Shannon, methane
Priority: normal

Created on 2021-09-17 12:47 by Mark.Shannon, last changed 2022-04-11 14:59 by admin.

Messages (3)
msg402047 - Author: Mark Shannon (Mark.Shannon) (Python committer) Date: 2021-09-17 12:47
Currently, if a dictionary is split, the dictionary owns the memory for its values, unless the values array is the unique empty-values array.

In order to support lazily created dictionaries for objects (see https://github.com/faster-cpython/ideas/issues/72#issuecomment-886796360), we need to allow shared keys dicts that do not own the memory of the values.

I propose the following changes to the internals of dict objects.
Add 4 flag bits (these can go in the low bits of the version tag):
2 bits for ownership of the values, and the other 2 bits for the stride of the values (1 or 3). All dictionaries would then have a non-NULL values pointer.

The value at index `ix` would always be located at `dict->ma_values[ix*stride]`.
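
As a rough illustration of this layout, the following Python model packs the four flag bits into the low bits of a version tag and reads a value at `ix * stride`. The bit order, the constant names, and the helpers `pack_tag`, `stride_of`, and `get_value` are assumptions made for this sketch; the proposal does not fix them, and a real implementation would live in the C dict code.

```python
# Hypothetical layout: bits 0-1 hold ownership, bits 2-3 hold the stride.
OWNS_REFS = 0b01
OWNS_MEM = 0b10
STRIDE_SHIFT = 2
STRIDE_MASK = 0b11 << STRIDE_SHIFT
FLAG_BITS = 4


def pack_tag(version, stride, owns_refs, owns_mem):
    """Combine a version counter with the four flag bits."""
    if stride not in (1, 3):
        raise ValueError("stride must be 1 or 3")
    flags = stride << STRIDE_SHIFT
    if owns_refs:
        flags |= OWNS_REFS
    if owns_mem:
        flags |= OWNS_MEM
    return (version << FLAG_BITS) | flags


def stride_of(tag):
    """Extract the stride (1 or 3) from a packed tag."""
    return (tag & STRIDE_MASK) >> STRIDE_SHIFT


def get_value(values, tag, ix):
    """Return the value for entry index ix, i.e. values[ix * stride]."""
    return values[ix * stride_of(tag)]


# Combined dict: entries are (hash, key, value) triples, and the values
# "pointer" starts at the first value slot, so stride 3 skips hash and key.
entries = [hash("a"), "a", 10, hash("b"), "b", 20]
combined_values = entries[2:]
combined_tag = pack_tag(version=7, stride=3, owns_refs=True, owns_mem=False)

# Split dict: values sit in their own dense array, so the stride is 1.
split_values = [10, 20]
split_tag = pack_tag(version=7, stride=1, owns_refs=True, owns_mem=True)
```

With this model, the same `get_value` call serves both layouts; only the stride stored in the tag differs.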

The two ownership bits indicate whether the dictionary owns the references and whether it owns the memory.
When a dictionary is freed, the items in the values array would be decref'd if the references are owned.
The values array would be freed if the memory is owned.

I don't think it is possible to own the memory, but not the references.
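
The deallocation rule described above could be modelled roughly as follows. This is only a sketch in Python: `release_values`, its parameters, and the `decref`/`free` callbacks are hypothetical stand-ins for the C-level operations, and treating "owns the memory but not the references" as an error is an assumption based on the remark above rather than a settled part of the design.

```python
def release_values(values, count, stride, owns_refs, owns_mem, decref, free):
    """Model what freeing a dict would do to its values array."""
    if owns_mem and not owns_refs:
        # Assumed invariant: this combination is not expected to occur.
        raise ValueError("owning the memory without the references is not expected")
    if owns_refs:
        for ix in range(count):
            value = values[ix * stride]
            if value is not None:  # None models a NULL (unset) slot
                decref(value)
    if owns_mem:
        free(values)


decrefs, freed = [], []

# Split dict that owns both references and memory.
release_values(["x", None, "y"], 3, 1, True, True, decrefs.append, freed.append)

# Values embedded in another object: nothing is decref'd or freed.
release_values(["z"], 1, 1, False, False, decrefs.append, freed.append)
```

After these two calls, only the first dictionary's values have been released; the second leaves both the references and the memory to the owning object.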

Examples:

A combined dict. Stride = 3, owns_refs = 1, owns_mem = 0.
A split keys dict. Stride = 1, owns_refs = 1, owns_mem = 1.
Empty dict (split). Stride = 1, owns_refs = 0, owns_mem = 0.

Dictionary with values embedded in object (https://github.com/faster-cpython/ideas/issues/72#issuecomment-886796360, second diagram). Stride = 1, owns_refs = 0, owns_mem = 0.
msg402048 - Author: Mark Shannon (Mark.Shannon) (Python committer) Date: 2021-09-17 12:52
An alternative placement for the flag bits:

Stride bits in the dictkeys.
Ownership bits in the low bits of ma_used.

This would still allow us to remove the version tag at some point.
msg402316 - Author: Mark Shannon (Mark.Shannon) (Python committer) Date: 2021-09-21 12:59
Experiments show that using `stride` just makes the code more complex;
`dk_kind` is sufficient.

We will still need ownership flags for split dicts, though.
A single flag may suffice.
History
Date                 User          Action  Args
2022-04-11 14:59:50  admin         set     github: 89396
2021-09-21 12:59:28  Mark.Shannon  set     messages: + msg402316
2021-09-17 12:52:23  Mark.Shannon  set     messages: + msg402048
2021-09-17 12:47:38  Mark.Shannon  create