Allow more than 16 items in split-keys dicts and "virtual" object dicts. #90833

Closed
markshannon opened this issue Feb 7, 2022 · 2 comments
Labels
3.11 only security fixes

Comments

@markshannon
Member

BPO 46675
Nosy @markshannon
PRs
  • bpo-46675: Allow object value arrays and split key dictionaries larger than 16 #31191
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = 'https://github.com/markshannon'
    closed_at = <Date 2022-03-03.12:16:00.251>
    created_at = <Date 2022-02-07.11:52:04.597>
    labels = ['3.11']
    title = 'Allow more than 16 items in split-keys dicts and "virtual" object dicts.'
    updated_at = <Date 2022-03-03.12:16:00.251>
    user = 'https://github.com/markshannon'

    bugs.python.org fields:

    activity = <Date 2022-03-03.12:16:00.251>
    actor = 'Mark.Shannon'
    assignee = 'Mark.Shannon'
    closed = True
    closed_date = <Date 2022-03-03.12:16:00.251>
    closer = 'Mark.Shannon'
    components = []
    creation = <Date 2022-02-07.11:52:04.597>
    creator = 'Mark.Shannon'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 46675
    keywords = ['patch']
    message_count = 2.0
    messages = ['412735', '412830']
    nosy_count = 1.0
    nosy_names = ['Mark.Shannon']
    pr_nums = ['31191']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue46675'
    versions = ['Python 3.11']

    @markshannon
    Member Author

    https://bugs.python.org/issue45340 and #28802 allowed "virtual" object dicts (see faster-cpython/ideas#72 for full details).

    In order for this to work, we need to keep the insertion order on the values. The initial version (#28802) used a 64 bit value as a vector of 16 4-bit values, which allows only 16 items per values array.
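The packed representation described above can be modelled in a few lines. This is only an illustrative sketch in Python, not the CPython implementation: the names `append_index` and `decode` are hypothetical, and it assumes the item count is tracked separately from the 64-bit vector.

```python
MAX_ITEMS = 16           # 64 bits / 4 bits per entry
MASK64 = (1 << 64) - 1   # keep the model within a 64-bit word


def append_index(order, index):
    """Return a new order vector with slot `index` recorded as most recent."""
    if not 0 <= index < MAX_ITEMS:
        # A 4-bit entry cannot name slot 16 or above: this is the limit
        # that forces larger dicts to be materialized.
        raise ValueError("index does not fit in 4 bits")
    return ((order << 4) | index) & MASK64


def decode(order, count):
    """Return the `count` most recently recorded slots, oldest first."""
    return [(order >> (4 * (count - 1 - i))) & 0xF for i in range(count)]


order = 0
for slot in (2, 0, 1):   # values stored into slots 2, 0, 1 in that order
    order = append_index(order, slot)
print(decode(order, 3))
```

Each entry needs only 4 bits because there are at most 16 slots, which is exactly why the scheme cannot grow past 16 items.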

    Stats gathered from the standard benchmark suite, together with informal evidence from elsewhere, suggest that this causes a significant fraction (5% and upwards) of these dicts to be materialized because they exceed the 16 item limit.

    An alternative design that would allow up to ~254 items in the values array is to make the insertion order vector an array of bytes. The capacity is 254 as we need a byte for size, and another for capacity.
    This will increase the size of the values a bit for sizes from 7 to 15, but save a lot of memory for sizes 17+, as keys could still be shared.
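A rough model of the byte-array alternative follows. The layout here (one byte for capacity, one for size, then one byte per inserted slot) is an assumption made for illustration; the issue only states that a byte each is needed for size and capacity, and that the resulting limit is 254. The helper names are hypothetical.

```python
MAX_CAPACITY = 254  # limit quoted in the issue


def new_order(capacity):
    """Create an empty order vector: [capacity, size, entry0, entry1, ...]."""
    if not 0 < capacity <= MAX_CAPACITY:
        raise ValueError("capacity must be in 1..254")
    buf = bytearray(capacity + 2)  # assumed layout: 2 header bytes + entries
    buf[0] = capacity
    return buf


def append_index(buf, index):
    """Record that slot `index` received a value (one extra byte write)."""
    capacity, size = buf[0], buf[1]
    if size >= capacity:
        raise OverflowError("order vector is full")
    if not 0 <= index < capacity:
        raise ValueError("index out of range")
    buf[2 + size] = index
    buf[1] = size + 1


def decode(buf):
    """Return the recorded slots in insertion order."""
    return list(buf[2:2 + buf[1]])


order = new_order(30)
for slot in (17, 3, 29):  # slots above 15 are now representable
    append_index(order, slot)
print(decode(order))
```

The per-store cost is one additional byte write plus a size update, which corresponds to the "extra memory write" listed under Cons.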

    Pros:
    No need to materialize dicts with more than 16 items, saving ~3/4 of the memory per dict and helping specialization.

    Cons:
    Extra memory write to store a value*
    1 extra word for values of size 7 to 14, 2 extra for size 15.
    Some extra complexity.

    *In a hypothetical optimized JIT, the insertion order vector could be updated with a single write covering several value stores, so this would make no difference.
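The extra-word figures listed under Cons can be reproduced with a small calculation, assuming 8-byte words, a byte vector of `n + 2` bytes (entries plus the size and capacity bytes) rounded up to whole words, and a baseline of one word for the original 64-bit vector. These assumptions are inferred from the numbers in the issue rather than stated in it.

```python
WORD_BYTES = 8  # assumption: 64-bit platform


def words_for_byte_order(n_items):
    """Words needed for n_items order bytes plus 2 header bytes."""
    return -(-(n_items + 2) // WORD_BYTES)  # ceiling division


def extra_words(n_items):
    """Words used beyond the single word of the 64-bit nibble vector."""
    return words_for_byte_order(n_items) - 1


for n in (6, 7, 14, 15):
    print(n, extra_words(n))
```

Under these assumptions the overhead is zero up to 6 items, one word for 7 to 14 items, and two words for 15 items, matching the list above.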

    @markshannon markshannon added the 3.11 only security fixes label Feb 7, 2022
    @markshannon markshannon self-assigned this Feb 7, 2022
    @markshannon
    Member Author

    New changeset 25db2b3 by Mark Shannon in branch 'main':
    bpo-46675: Allow object value arrays and split key dictionaries larger than 16 (GH-31191)
    25db2b3

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022