This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author brandtbucher
Recipients brandtbucher
Date 2019-10-10.19:16:03
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1570734964.11.0.75032337331.issue38436@roundup.psfhosted.org>
In-reply-to
Content
The attached PR adds a fast path for BINARY_ADD instructions involving two lists, where the left list has a refcount of exactly 1.

In this case, we instead do a PySequence_InPlaceConcat operation. This has the affect of avoiding quadratic complexity for list summations, by keeping only one intermediate result and extending it in-place.

For (potentially large) lists:

    a + b + c + d ...

Currently executed as:

    tmp = a + b
    tmp = tmp + c
    tmp = tmp + d
    ...

With this change:

    tmp = a + b
    tmp += c
    tmp += d
    ...

Here are the pyperformance results (optimizations enabled):

Slower (6):
- pickle_list: 3.63 us +- 0.08 us -> 3.78 us +- 0.10 us: 1.04x slower (+4%)
- xml_etree_generate: 102 ms +- 2 ms -> 104 ms +- 3 ms: 1.02x slower (+2%)
- xml_etree_parse: 166 ms +- 4 ms -> 169 ms +- 9 ms: 1.02x slower (+2%)
- scimark_sparse_mat_mult: 5.42 ms +- 0.11 ms -> 5.47 ms +- 0.16 ms: 1.01x slower (+1%)
- pickle_dict: 23.6 us +- 0.8 us -> 23.8 us +- 0.4 us: 1.01x slower (+1%)
- scimark_fft: 413 ms +- 6 ms -> 416 ms +- 6 ms: 1.01x slower (+1%)

Faster (25):
- 2to3: 394 ms +- 18 ms -> 381 ms +- 8 ms: 1.03x faster (-3%)
- python_startup_no_site: 12.0 ms +- 0.2 ms -> 11.7 ms +- 0.2 ms: 1.03x faster (-3%)
- nqueens: 121 ms +- 2 ms -> 118 ms +- 3 ms: 1.03x faster (-2%)
- pathlib: 36.0 ms +- 1.3 ms -> 35.1 ms +- 0.9 ms: 1.02x faster (-2%)
- pickle_pure_python: 523 us +- 15 us -> 511 us +- 13 us: 1.02x faster (-2%)
- deltablue: 8.54 ms +- 0.25 ms -> 8.35 ms +- 0.27 ms: 1.02x faster (-2%)
- logging_silent: 220 ns +- 6 ns -> 215 ns +- 8 ns: 1.02x faster (-2%)
- raytrace: 566 ms +- 8 ms -> 554 ms +- 11 ms: 1.02x faster (-2%)
- hexiom: 11.2 ms +- 0.3 ms -> 11.0 ms +- 0.2 ms: 1.02x faster (-2%)
- go: 294 ms +- 6 ms -> 288 ms +- 5 ms: 1.02x faster (-2%)
- python_startup: 15.7 ms +- 0.4 ms -> 15.4 ms +- 0.5 ms: 1.02x faster (-2%)
- float: 136 ms +- 4 ms -> 134 ms +- 6 ms: 1.02x faster (-2%)
- scimark_sor: 232 ms +- 8 ms -> 228 ms +- 5 ms: 1.02x faster (-2%)
- fannkuch: 554 ms +- 8 ms -> 545 ms +- 9 ms: 1.02x faster (-2%)
- meteor_contest: 115 ms +- 2 ms -> 113 ms +- 3 ms: 1.02x faster (-2%)
- telco: 6.58 ms +- 0.22 ms -> 6.48 ms +- 0.15 ms: 1.02x faster (-2%)
- regex_effbot: 3.76 ms +- 0.08 ms -> 3.70 ms +- 0.09 ms: 1.02x faster (-2%)
- chaos: 129 ms +- 3 ms -> 127 ms +- 2 ms: 1.01x faster (-1%)
- regex_v8: 26.3 ms +- 0.4 ms -> 25.9 ms +- 0.6 ms: 1.01x faster (-1%)
- regex_compile: 200 ms +- 5 ms -> 197 ms +- 5 ms: 1.01x faster (-1%)
- logging_simple: 9.78 us +- 0.19 us -> 9.65 us +- 0.26 us: 1.01x faster (-1%)
- sqlite_synth: 3.24 us +- 0.10 us -> 3.20 us +- 0.06 us: 1.01x faster (-1%)
- unpickle: 16.4 us +- 0.5 us -> 16.2 us +- 0.5 us: 1.01x faster (-1%)
- logging_format: 10.8 us +- 0.2 us -> 10.7 us +- 0.3 us: 1.01x faster (-1%)
- json_loads: 30.2 us +- 0.8 us -> 29.9 us +- 0.6 us: 1.01x faster (-1%)

Benchmark hidden because not significant (14): json_dumps, nbody, pickle, pidigits, regex_dna, richards, scimark_lu, scimark_monte_carlo, spectral_norm, unpack_sequence, unpickle_list, unpickle_pure_python, xml_etree_iterparse, xml_etree_process

This is related to my earlier work in bpo-36229, however this is a much less invasive change that's limited to ceval.c, where our knowledge of context is much better.
History
Date User Action Args
2019-10-10 19:16:04brandtbuchersetrecipients: + brandtbucher
2019-10-10 19:16:04brandtbuchersetmessageid: <1570734964.11.0.75032337331.issue38436@roundup.psfhosted.org>
2019-10-10 19:16:04brandtbucherlinkissue38436 messages
2019-10-10 19:16:03brandtbuchercreate