This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python Developer Guide.

classification
Title: int/float specializations should mutate the LHS in-place when possible
Type: performance
Stage: patch review
Components: Interpreter Core
Versions: Python 3.11
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To: brandtbucher
Nosy List: Mark.Shannon, brandtbucher, gvanrossum, mark.dickinson
Priority: normal
Keywords: patch

Created on 2022-01-14 06:42 by brandtbucher, last changed 2022-04-11 14:59 by admin.

Pull Requests
URL Status Linked
PR 30594 open brandtbucher, 2022-01-14 06:44
Messages (2)
msg410544 - Author: Brandt Bucher (brandtbucher) (Python committer) Date: 2022-01-14 06:42
The performance of our existing int and float specializations can be improved by mutating the LHS operand in-place when possible. This leads to significant speedups for several number-crunching benchmarks, and a solid 1% improvement overall:

Slower (16):
- regex_effbot: 3.14 ms +- 0.01 ms -> 3.26 ms +- 0.03 ms: 1.04x slower
- pidigits: 197 ms +- 0 ms -> 203 ms +- 0 ms: 1.03x slower
- pickle_list: 4.40 us +- 0.05 us -> 4.51 us +- 0.05 us: 1.02x slower
- logging_silent: 106 ns +- 2 ns -> 108 ns +- 1 ns: 1.02x slower
- unpickle_pure_python: 248 us +- 2 us -> 253 us +- 4 us: 1.02x slower
- xml_etree_generate: 80.3 ms +- 0.5 ms -> 81.5 ms +- 0.7 ms: 1.02x slower
- telco: 6.50 ms +- 0.10 ms -> 6.60 ms +- 0.11 ms: 1.02x slower
- go: 149 ms +- 1 ms -> 151 ms +- 2 ms: 1.01x slower
- pickle: 9.82 us +- 0.07 us -> 9.94 us +- 0.13 us: 1.01x slower
- xml_etree_process: 58.0 ms +- 0.6 ms -> 58.6 ms +- 0.5 ms: 1.01x slower
- pickle_pure_python: 329 us +- 5 us -> 332 us +- 2 us: 1.01x slower
- regex_dna: 217 ms +- 3 ms -> 219 ms +- 0 ms: 1.01x slower
- json_loads: 25.3 us +- 0.2 us -> 25.6 us +- 0.3 us: 1.01x slower
- scimark_fft: 328 ms +- 9 ms -> 331 ms +- 5 ms: 1.01x slower
- 2to3: 263 ms +- 1 ms -> 264 ms +- 1 ms: 1.01x slower
- deltablue: 4.20 ms +- 0.04 ms -> 4.22 ms +- 0.03 ms: 1.00x slower

Faster (24):
- scimark_sparse_mat_mult: 4.82 ms +- 0.20 ms -> 4.31 ms +- 0.37 ms: 1.12x faster
- spectral_norm: 97.0 ms +- 0.8 ms -> 89.3 ms +- 0.6 ms: 1.09x faster
- fannkuch: 418 ms +- 7 ms -> 385 ms +- 4 ms: 1.08x faster
- unpack_sequence: 48.6 ns +- 2.6 ns -> 46.1 ns +- 3.5 ns: 1.05x faster
- scimark_lu: 115 ms +- 4 ms -> 110 ms +- 2 ms: 1.05x faster
- scimark_monte_carlo: 72.2 ms +- 1.1 ms -> 69.9 ms +- 0.8 ms: 1.03x faster
- nbody: 99.4 ms +- 2.1 ms -> 96.9 ms +- 1.7 ms: 1.03x faster
- chaos: 72.5 ms +- 0.7 ms -> 70.9 ms +- 0.5 ms: 1.02x faster
- nqueens: 84.6 ms +- 0.7 ms -> 82.8 ms +- 0.5 ms: 1.02x faster
- pickle_dict: 27.1 us +- 0.1 us -> 26.7 us +- 0.1 us: 1.02x faster
- regex_v8: 24.3 ms +- 0.4 ms -> 24.0 ms +- 0.4 ms: 1.01x faster
- sqlalchemy_imperative: 19.1 ms +- 0.7 ms -> 18.8 ms +- 0.2 ms: 1.01x faster
- float: 77.4 ms +- 0.9 ms -> 76.7 ms +- 0.9 ms: 1.01x faster
- sqlalchemy_declarative: 147 ms +- 3 ms -> 146 ms +- 3 ms: 1.01x faster
- hexiom: 6.68 ms +- 0.06 ms -> 6.63 ms +- 0.03 ms: 1.01x faster
- sympy_sum: 169 ms +- 2 ms -> 168 ms +- 2 ms: 1.01x faster
- json_dumps: 12.8 ms +- 0.2 ms -> 12.7 ms +- 0.2 ms: 1.01x faster
- logging_format: 6.42 us +- 0.08 us -> 6.37 us +- 0.09 us: 1.01x faster
- python_startup_no_site: 5.81 ms +- 0.00 ms -> 5.77 ms +- 0.00 ms: 1.01x faster
- sympy_integrate: 21.5 ms +- 0.1 ms -> 21.4 ms +- 0.1 ms: 1.01x faster
- dulwich_log: 65.4 ms +- 0.5 ms -> 65.1 ms +- 0.5 ms: 1.00x faster
- crypto_pyaes: 83.5 ms +- 0.5 ms -> 83.1 ms +- 0.4 ms: 1.00x faster
- raytrace: 309 ms +- 3 ms -> 307 ms +- 2 ms: 1.00x faster
- python_startup: 8.18 ms +- 0.01 ms -> 8.15 ms +- 0.01 ms: 1.00x faster

Benchmark hidden because not significant (18): chameleon, django_template, logging_simple, mako, meteor_contest, pathlib, pyflate, regex_compile, richards, scimark_sor, sqlite_synth, sympy_expand, sympy_str, tornado_http, unpickle, unpickle_list, xml_etree_parse, xml_etree_iterparse

Geometric mean: 1.01x faster
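As a rough illustration of how a geometric-mean summary like the one above is formed, the sketch below computes per-benchmark speedup ratios and their geometric mean for three rows copied from the lists above. It covers only a hand-picked subset, so it is not expected to reproduce the 1.01x figure, which summarizes the whole suite. The names `timings_ms`, `speedups`, and `overall` are illustrative only.

```python
from statistics import geometric_mean

# (before, after) mean timings in ms, copied from three rows of the report.
timings_ms = {
    "spectral_norm": (97.0, 89.3),
    "fannkuch": (418.0, 385.0),
    "pidigits": (197.0, 203.0),
}

# A ratio above 1.0 means the patched build was faster on that benchmark.
speedups = {name: before / after for name, (before, after) in timings_ms.items()}

# The geometric mean is the n-th root of the product of the n ratios, so a
# 2x speedup and a 2x slowdown cancel out exactly.
overall = geometric_mean(list(speedups.values()))

for name, ratio in speedups.items():
    print(f"{name}: {ratio:.2f}x")
print(f"geometric mean of this subset: {overall:.3f}x")
```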
msg412215 - Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2022-01-31 18:57
Since we decided to wait on the int operations while longobject.c is being refactored (https://github.com/faster-cpython/ideas/issues/245), can you clarify whether the speedup reported is from code where it is implemented only for floats, or is it from a prototype where it's implemented for floats and ints?
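For readers unfamiliar with the optimization being discussed, here is a toy model of the general idea, written in Python rather than the C of the actual patch. It assumes that "when possible" means "when nothing else holds a reference to the left-hand operand", which the thread does not spell out. `FloatBox`, its explicit `refcount` field, and `add` are hypothetical stand-ins, not CPython APIs.

```python
from dataclasses import dataclass


@dataclass
class FloatBox:
    """Toy stand-in for a heap-allocated float object (hypothetical)."""
    value: float
    refcount: int = 1


def add(lhs: FloatBox, rhs: FloatBox) -> FloatBox:
    """Add two boxes, reusing lhs's storage when nothing else refers to it."""
    total = lhs.value + rhs.value
    if lhs.refcount == 1:
        # Assumption: a single reference means only the evaluation stack
        # holds lhs, so overwriting it cannot be observed by other code.
        lhs.value = total
        return lhs
    # lhs is shared (e.g. bound to a variable): leave it alone and allocate.
    return FloatBox(total)


temp = FloatBox(1.5)                 # like an intermediate result of (a * b)
shared = FloatBox(2.0, refcount=2)   # like a value also bound to a name
one = FloatBox(1.0)

r1 = add(temp, one)     # reuses temp: no new object is created
r2 = add(shared, one)   # allocates: shared must keep its value
```

In this model the saving comes from skipping one allocation and one deallocation per arithmetic step on temporaries, which is consistent with the gains showing up mainly in number-crunching benchmarks.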
History
Date User Action Args
2022-04-11 14:59:54 admin set github: 90530
2022-01-31 18:57:11 gvanrossum set messages: + msg412215
2022-01-16 13:35:42 mark.dickinson set nosy: + mark.dickinson
2022-01-16 11:14:18 vstinner set nosy: - vstinner
2022-01-15 21:17:32 rhettinger set nosy: + vstinner
2022-01-14 06:44:09 brandtbucher set keywords: + patch; stage: patch review; pull_requests: + pull_request28792
2022-01-14 06:42:28 brandtbucher create