Author inada.naoki
Recipients inada.naoki, jdemeyer
Date 2019-06-19.12:54:21
Message-id <1560948861.51.0.564500331945.issue37340@roundup.psfhosted.org>
In-reply-to
Content
LOAD_METHOD avoids creating a temporary bound method object.
PyObject_CallMethodObjArgs now uses the same optimization.
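
A minimal sketch of the object in question (not taken from the CPython sources): plain attribute access builds a fresh bound method object on every lookup, and that allocation is what a direct `obj.method()` call can skip. `Greeter` and `call_directly` are made-up names for illustration, and the opcode names printed depend on the interpreter version.

```python
import dis


class Greeter:
    def hello(self):
        return "hi"


g = Greeter()

# Each attribute lookup creates a new bound method object
# wrapping the same underlying function.
first = g.hello
second = g.hello
distinct_objects = first is not second
same_function = first.__func__ is second.__func__


def call_directly(obj):
    # A direct call like this is what the method-call opcodes optimize:
    # the interpreter can call the function with obj as the first argument
    # without materializing a bound method object.
    return obj.hello()


# Opcode names vary by version, so this only lists them.
opnames = [ins.opname for ins in dis.get_instructions(call_directly)]
print(distinct_objects, same_function, opnames)
```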

So I now think the free_list no longer gives enough performance benefit.
When the free_list is not used often enough, it may get in the way of obmalloc reusing memory pools.
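
As a rough illustration of the trade-off, here is a toy free list in Python. It is only a model under assumed names (`FreeList`, `maxsize`, `parked`), not the C implementation: released objects are parked for reuse, which saves an allocation on a hit, but parked objects stay allocated even when nothing asks for them again.

```python
class FreeList:
    """Toy model: keep up to `maxsize` released objects for later reuse."""

    def __init__(self, factory, maxsize=4):
        self._factory = factory
        self._maxsize = maxsize
        self._items = []
        self.hits = 0
        self.misses = 0

    def acquire(self):
        if self._items:
            self.hits += 1          # reuse a parked object, no allocation
            return self._items.pop()
        self.misses += 1            # free list empty, allocate a new object
        return self._factory()

    def release(self, obj):
        if len(self._items) < self._maxsize:
            self._items.append(obj)  # parked: its memory is not returned
        # otherwise the object is simply dropped

    def parked(self):
        return len(self._items)


pool = FreeList(list, maxsize=4)
burst = [pool.acquire() for _ in range(6)]  # six allocations
for obj in burst:
    pool.release(obj)                       # four parked, two dropped
reused = pool.acquire()                     # one hit
print(pool.hits, pool.misses, pool.parked())
```

If hits are rare, the parked objects mostly just hold memory that the allocator could otherwise hand out again.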

This is the performance difference from removing the free_list (built with LTO, without PGO; patched = free_list removed):

```
$ ./python -m pyperf compare_to master.json patched.json -G --min-speed=1
Slower (19):
- sqlite_synth: 4.03 us +- 0.10 us -> 4.20 us +- 0.08 us: 1.04x slower (+4%)
- genshi_text: 41.2 ms +- 0.4 ms -> 42.6 ms +- 0.4 ms: 1.03x slower (+3%)
- scimark_sparse_mat_mult: 6.29 ms +- 0.03 ms -> 6.50 ms +- 0.50 ms: 1.03x slower (+3%)
- mako: 26.5 ms +- 0.1 ms -> 27.4 ms +- 0.3 ms: 1.03x slower (+3%)
- html5lib: 130 ms +- 4 ms -> 134 ms +- 5 ms: 1.03x slower (+3%)
- genshi_xml: 83.4 ms +- 1.1 ms -> 85.6 ms +- 1.2 ms: 1.03x slower (+3%)
- pickle: 15.1 us +- 0.5 us -> 15.5 us +- 0.5 us: 1.03x slower (+3%)
- float: 161 ms +- 1 ms -> 165 ms +- 1 ms: 1.02x slower (+2%)
- logging_simple: 13.9 us +- 0.2 us -> 14.2 us +- 0.2 us: 1.02x slower (+2%)
- xml_etree_process: 108 ms +- 1 ms -> 110 ms +- 1 ms: 1.02x slower (+2%)
- pathlib: 28.0 ms +- 0.2 ms -> 28.5 ms +- 0.3 ms: 1.02x slower (+2%)
- pickle_pure_python: 703 us +- 8 us -> 715 us +- 7 us: 1.02x slower (+2%)
- sympy_expand: 553 ms +- 5 ms -> 563 ms +- 12 ms: 1.02x slower (+2%)
- xml_etree_generate: 136 ms +- 2 ms -> 138 ms +- 1 ms: 1.02x slower (+2%)
- logging_format: 15.3 us +- 0.2 us -> 15.5 us +- 0.2 us: 1.01x slower (+1%)
- json_dumps: 17.4 ms +- 0.1 ms -> 17.7 ms +- 0.2 ms: 1.01x slower (+1%)
- logging_silent: 266 ns +- 5 ns -> 269 ns +- 9 ns: 1.01x slower (+1%)
- django_template: 163 ms +- 1 ms -> 165 ms +- 2 ms: 1.01x slower (+1%)
- sympy_sum: 219 ms +- 2 ms -> 222 ms +- 2 ms: 1.01x slower (+1%)

Faster (6):
- regex_effbot: 4.51 ms +- 0.04 ms -> 4.44 ms +- 0.03 ms: 1.02x faster (-2%)
- pickle_list: 5.21 us +- 0.04 us -> 5.13 us +- 0.04 us: 1.01x faster (-1%)
- crypto_pyaes: 164 ms +- 1 ms -> 162 ms +- 1 ms: 1.01x faster (-1%)
- xml_etree_parse: 202 ms +- 7 ms -> 200 ms +- 3 ms: 1.01x faster (-1%)
- scimark_sor: 287 ms +- 6 ms -> 284 ms +- 6 ms: 1.01x faster (-1%)
- raytrace: 758 ms +- 26 ms -> 750 ms +- 11 ms: 1.01x faster (-1%)

Benchmark hidden because not significant (35)
```

I think a free_list is worthwhile only when several benchmarks in pyperformance show more than a 5% speedup.
The benefit here is smaller than that threshold.  I will run pyperformance again after bpo-37337 is merged.
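
The threshold check can be sketched as a small script over the pyperf output. The sample lines are copied from the results above; the regular expression and the helper names (`parse`, `slowdowns_over`) are assumptions made for this example, not part of pyperf. Since patched means the free_list was removed, a slowdown in the patched build corresponds to the speedup the free_list provided.

```python
import re

# A few lines copied from the pyperf compare_to output above.
SAMPLE = """\
- sqlite_synth: 4.03 us +- 0.10 us -> 4.20 us +- 0.08 us: 1.04x slower (+4%)
- genshi_text: 41.2 ms +- 0.4 ms -> 42.6 ms +- 0.4 ms: 1.03x slower (+3%)
- float: 161 ms +- 1 ms -> 165 ms +- 1 ms: 1.02x slower (+2%)
- regex_effbot: 4.51 ms +- 0.04 ms -> 4.44 ms +- 0.03 ms: 1.02x faster (-2%)
"""

LINE_RE = re.compile(
    r"^- (?P<name>\S+): .*: (?P<ratio>\d+\.\d+)x (?P<kind>slower|faster)"
)


def parse(text):
    """Return (benchmark name, ratio, 'slower' or 'faster') tuples."""
    results = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            results.append((m["name"], float(m["ratio"]), m["kind"]))
    return results


def slowdowns_over(results, threshold=1.05):
    """Benchmarks that got slower by more than the threshold ratio."""
    return [name for name, ratio, kind in results
            if kind == "slower" and ratio > threshold]


results = parse(SAMPLE)
over = slowdowns_over(results)
print(len(results), over)
```

On these sample lines no benchmark crosses the 5% mark; the largest slowdown is sqlite_synth at 1.04x.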

FWIW, in the case of sqlite_synth, I think the performance difference comes from here:
https://github.com/python/cpython/blob/015000165373f8db263ef5bc682f02d74e5782ac/Modules/_sqlite/connection.c#L662
If the performance of the user-defined aggregate feature is really important, we can optimize it further.