Type-specialized Py_DECREF #90667

Closed
sweeneyde opened this issue Jan 25, 2022 · 2 comments
Labels: 3.11, interpreter-core, performance

Comments

@sweeneyde (Member)

BPO 46509
Nosy @sweeneyde
PRs
  • gh-90667: Add specializations of Py_DECREF when types are known #30872

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2022-01-25.05:27:08.416>
    labels = ['interpreter-core', '3.11', 'performance']
    title = 'Type-specialized Py_DECREF'
    updated_at = <Date 2022-01-25.05:33:34.410>
    user = 'https://github.com/sweeneyde'

    bugs.python.org fields:

    activity = <Date 2022-01-25.05:33:34.410>
    actor = 'Dennis Sweeney'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Interpreter Core']
    creation = <Date 2022-01-25.05:27:08.416>
    creator = 'Dennis Sweeney'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 46509
    keywords = ['patch']
    message_count = 2.0
    messages = ['411553', '411554']
    nosy_count = 1.0
    nosy_names = ['Dennis Sweeney']
    pr_nums = ['30872']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'performance'
    url = 'https://bugs.python.org/issue46509'
    versions = ['Python 3.11']

    @sweeneyde (Member, Author)

    Benchmark results, built with GCC using --enable-optimizations --with-lto on WSL:

    Slower (7):

    • json_dumps: 12.8 ms +- 0.2 ms -> 13.1 ms +- 0.3 ms: 1.02x slower
    • meteor_contest: 106 ms +- 2 ms -> 109 ms +- 3 ms: 1.02x slower
    • telco: 6.50 ms +- 0.17 ms -> 6.58 ms +- 0.18 ms: 1.01x slower
    • fannkuch: 383 ms +- 5 ms -> 388 ms +- 8 ms: 1.01x slower
    • regex_compile: 143 ms +- 2 ms -> 145 ms +- 4 ms: 1.01x slower
    • mako: 11.1 ms +- 0.1 ms -> 11.2 ms +- 0.2 ms: 1.01x slower
    • chameleon: 7.08 ms +- 0.07 ms -> 7.12 ms +- 0.10 ms: 1.01x slower

    Faster (27):

    • unpack_sequence: 45.9 ns +- 1.1 ns -> 41.6 ns +- 1.0 ns: 1.10x faster
    • logging_silent: 108 ns +- 11 ns -> 101 ns +- 3 ns: 1.06x faster
    • nbody: 95.6 ms +- 3.2 ms -> 90.2 ms +- 1.9 ms: 1.06x faster
    • spectral_norm: 98.3 ms +- 2.3 ms -> 92.8 ms +- 1.6 ms: 1.06x faster
    • regex_dna: 202 ms +- 3 ms -> 194 ms +- 3 ms: 1.04x faster
    • scimark_fft: 342 ms +- 12 ms -> 331 ms +- 7 ms: 1.03x faster
    • crypto_pyaes: 89.6 ms +- 1.7 ms -> 86.8 ms +- 1.1 ms: 1.03x faster
    • json_loads: 27.4 us +- 0.9 us -> 26.5 us +- 1.3 us: 1.03x faster
    • scimark_monte_carlo: 69.3 ms +- 1.5 ms -> 67.3 ms +- 1.2 ms: 1.03x faster
    • pickle_list: 4.62 us +- 0.21 us -> 4.51 us +- 0.15 us: 1.02x faster
    • scimark_sparse_mat_mult: 5.14 ms +- 0.21 ms -> 5.02 ms +- 0.18 ms: 1.02x faster
    • xml_etree_parse: 161 ms +- 5 ms -> 157 ms +- 6 ms: 1.02x faster
    • regex_effbot: 3.07 ms +- 0.05 ms -> 3.00 ms +- 0.07 ms: 1.02x faster
    • deltablue: 4.36 ms +- 0.14 ms -> 4.27 ms +- 0.14 ms: 1.02x faster
    • pickle_pure_python: 343 us +- 6 us -> 335 us +- 8 us: 1.02x faster
    • sqlite_synth: 2.60 us +- 0.06 us -> 2.55 us +- 0.04 us: 1.02x faster
    • xml_etree_iterparse: 110 ms +- 2 ms -> 108 ms +- 2 ms: 1.02x faster
    • go: 146 ms +- 2 ms -> 143 ms +- 3 ms: 1.02x faster
    • pathlib: 20.2 ms +- 0.5 ms -> 19.8 ms +- 0.3 ms: 1.02x faster
    • scimark_sor: 117 ms +- 3 ms -> 115 ms +- 2 ms: 1.02x faster
    • dulwich_log: 80.9 ms +- 2.0 ms -> 79.6 ms +- 1.7 ms: 1.02x faster
    • nqueens: 84.4 ms +- 1.7 ms -> 83.1 ms +- 2.0 ms: 1.02x faster
    • python_startup: 8.84 ms +- 0.07 ms -> 8.76 ms +- 0.07 ms: 1.01x faster
    • 2to3: 269 ms +- 4 ms -> 266 ms +- 3 ms: 1.01x faster
    • float: 77.0 ms +- 1.2 ms -> 76.5 ms +- 1.5 ms: 1.01x faster
    • sympy_integrate: 22.7 ms +- 0.3 ms -> 22.5 ms +- 0.2 ms: 1.01x faster
    • xml_etree_process: 55.7 ms +- 0.7 ms -> 55.4 ms +- 0.6 ms: 1.01x faster

    Benchmark hidden because not significant (24): chaos, django_template, hexiom, logging_format, logging_simple, pickle, pickle_dict, pidigits, pyflate, python_startup_no_site, raytrace, regex_v8, richards, scimark_lu, sqlalchemy_declarative, sqlalchemy_imperative, sympy_expand, sympy_sum, sympy_str, tornado_http, unpickle, unpickle_list, unpickle_pure_python, xml_etree_generate

    Geometric mean: 1.01x faster

    @sweeneyde added the 3.11, interpreter-core, and performance labels on Jan 25, 2022

    @sweeneyde (Member, Author)

    This attempts to avoid the dispatch dance of

    Py_DECREF(op) :: Py_TYPE(op)->tp_dealloc(op) :: Py_TYPE(op)->tp_free((PyObject *)op);

    I suspect this earns the most speedup from floats, where freelist manipulation can be inlined. This might make a single-digit-int freelist more impactful.
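
To make the dispatch chain above concrete, here is a minimal sketch in a toy object model. It is not CPython's code and not the code from the linked PR: every `toy_*` name, the `TOY_MAXFREELIST` limit, and the free-list layout are assumptions made for illustration. The generic `toy_decref` goes through two indirect calls (`tp_dealloc`, then `tp_free`), while `toy_decref_float` can only be used where the caller already knows the object is a float, so it calls the free-list routine directly and a compiler is free to inline it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy object model. Names loosely mimic CPython, but this is an
 * illustrative sketch under stated assumptions, not CPython's code. */
typedef struct toy_object toy_object;

typedef struct toy_type {
    void (*tp_dealloc)(toy_object *);  /* first indirect call  */
    void (*tp_free)(void *);           /* second indirect call */
} toy_type;

struct toy_object {
    long ob_refcnt;
    toy_type *ob_type;
};

typedef struct toy_float {
    toy_object ob_base;
    double ob_fval;
} toy_float;

#define TOY_MAXFREELIST 100            /* hypothetical limit */
static toy_float *free_list = NULL;    /* chained through the ob_type slot */
static int numfree = 0;

/* The actual work: push the dead float onto the free list, or free it. */
static void toy_float_free_impl(toy_float *op) {
    if (numfree >= TOY_MAXFREELIST) {
        free(op);
        return;
    }
    op->ob_base.ob_type = (toy_type *)free_list;
    free_list = op;
    numfree++;
}

/* Generic slots: each one is reached through a function pointer. */
static void toy_float_tp_free(void *p) {
    toy_float_free_impl((toy_float *)p);
}

static void toy_float_dealloc(toy_object *op) {
    op->ob_type->tp_free(op);
}

static toy_type toy_float_type = { toy_float_dealloc, toy_float_tp_free };

/* Allocate a float, reusing a free-list entry when one is available. */
static toy_object *toy_float_new(double value) {
    toy_float *op;
    if (free_list != NULL) {
        op = free_list;
        free_list = (toy_float *)op->ob_base.ob_type;
        numfree--;
    } else {
        op = malloc(sizeof(toy_float));
        if (op == NULL) {
            return NULL;
        }
    }
    op->ob_base.ob_refcnt = 1;
    op->ob_base.ob_type = &toy_float_type;
    op->ob_fval = value;
    return &op->ob_base;
}

/* Generic decref: the type is unknown, so dispatch through the type. */
static void toy_decref(toy_object *op) {
    assert(op->ob_refcnt > 0);
    if (--op->ob_refcnt == 0) {
        op->ob_type->tp_dealloc(op);
    }
}

/* Specialized decref: the caller guarantees the object is a float,
 * so the free-list push is a direct call with no pointer chasing. */
static void toy_decref_float(toy_object *op) {
    assert(op->ob_type == &toy_float_type);
    assert(op->ob_refcnt > 0);
    if (--op->ob_refcnt == 0) {
        toy_float_free_impl((toy_float *)op);
    }
}
```

Both paths leave the free list in the same state. The specialized one simply skips the two lookups through `ob_type`, which is the saving the comment above is describing.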
