Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize dict_merge for copy #85603

Closed
methane opened this issue Jul 29, 2020 · 5 comments
Closed

Optimize dict_merge for copy #85603

methane opened this issue Jul 29, 2020 · 5 comments
Labels
3.10 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage

Comments

@methane
Copy link
Member

methane commented Jul 29, 2020

BPO 41431
Nosy @methane, @serhiy-storchaka, @1st1
PRs
  • bpo-41431: Optimize dict_merge for copy #21669
  • bpo-41431: Optimize dict_merge for copy #21674
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2020-08-04.02:08:38.234>
    created_at = <Date 2020-07-29.00:29:38.664>
    labels = ['interpreter-core', '3.10', 'performance']
    title = 'Optimize dict_merge for copy'
    updated_at = <Date 2020-08-04.02:08:38.233>
    user = 'https://github.com/methane'

    bugs.python.org fields:

    activity = <Date 2020-08-04.02:08:38.233>
    actor = 'methane'
    assignee = 'none'
    closed = True
    closed_date = <Date 2020-08-04.02:08:38.234>
    closer = 'methane'
    components = ['Interpreter Core']
    creation = <Date 2020-07-29.00:29:38.664>
    creator = 'methane'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 41431
    keywords = ['patch']
    message_count = 5.0
    messages = ['374550', '374551', '374555', '374556', '374784']
    nosy_count = 3.0
    nosy_names = ['methane', 'serhiy.storchaka', 'yselivanov']
    pr_nums = ['21669', '21674']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'performance'
    url = 'https://bugs.python.org/issue41431'
    versions = ['Python 3.10']

    @methane
    Copy link
    Member Author

    methane commented Jul 29, 2020

    Although there are dict.copy() and PyDict_Copy(), dict_merge can be used for copying dict.

    • d={}; d.update(orig)
    • d=dict(orig)
    • d=orig.copy() # orig has many dummy keys.

    @methane methane added 3.10 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) labels Jul 29, 2020
    @methane
    Copy link
    Member Author

    methane commented Jul 29, 2020

    Microbenchmark for commit cf4f61c.

    # timeit -s 'd=dict.fromkeys(range(8))' -- 'dict(d)'
    Mean +- std dev: [master] 311 ns +- 2 ns -> [patched] 144 ns +- 1 ns: 2.16x faster (-54%)

    # timeit -s 'd=dict.fromkeys(range(1000))' -- 'dict(d)'
    Mean +- std dev: [master] 21.6 us +- 0.2 us -> [patched] 7.67 us +- 0.09 us: 2.81x faster (-64%)

    # timeit -s 'd=dict.fromkeys(range(8))' -- '{}.update(d)'
    Mean +- std dev: [master] 301 ns +- 5 ns -> [patched] 149 ns +- 1 ns: 2.01x faster (-50%)

    # timeit -s 'd=dict.fromkeys(range(1000))' -- '{}.update(d)'
    Mean +- std dev: [master] 21.4 us +- 0.2 us -> [patched] 7.64 us +- 0.07 us: 2.80x faster (-64%)

    @methane
    Copy link
    Member Author

    methane commented Jul 29, 2020

    To reduce code size, I am considering to remove clone_combined_dict. I will check how PyDict_Copy() is performance critical.

    This is microbenchmark result of d.copy() and dict(d).

    $ ./python -m pyperf timeit --compare-to ./python-master -s 'd=dict.fromkeys(range(1000))' -- 'd.copy()'
    python-master: ..................... 4.36 us +- 0.07 us
    python: ..................... 5.96 us +- 0.10 us

    Mean +- std dev: [python-master] 4.36 us +- 0.07 us -> [python] 5.96 us +- 0.10 us: 1.37x slower (+37%)

    $ ./python -m pyperf timeit --compare-to ./python-master -s 'd=dict.fromkeys(range(1000))' -- 'dict(d)'
    python-master: ..................... 21.6 us +- 0.2 us
    python: ..................... 6.01 us +- 0.09 us

    Mean +- std dev: [python-master] 21.6 us +- 0.2 us -> [python] 6.01 us +- 0.09 us: 3.59x faster (-72%)

    @methane
    Copy link
    Member Author

    methane commented Jul 29, 2020

    PyDict_Copy() is not used in eval loop or calling functions. So removing clone_combined_dict() is a considerable option.

    Another option is to use clone_combined_dict() in dict_merge, instead of adding dict_copy2().

    Pros: No performance regression. PyDict_Copy() is as fast as before.
    Cons: Can not "fast copy" split dict and dirty dict.

    I suppose most dict used by dict(d) or dict.update(d) is clean and combined. So I will implement the second option.

    @methane
    Copy link
    Member Author

    methane commented Aug 4, 2020

    New changeset db6d9a5 by Inada Naoki in branch 'master':
    bpo-41431: Optimize dict_merge for copy (GH-21674)
    db6d9a5

    @methane methane closed this as completed Aug 4, 2020
    @methane methane added the performance Performance or resource usage label Aug 4, 2020
    @methane methane closed this as completed Aug 4, 2020
    @methane methane added the performance Performance or resource usage label Aug 4, 2020
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.10 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage
    Projects
    None yet
    Development

    No branches or pull requests

    1 participant