Message339699
I have just optimized some of the code and pushed another change as 69dce1c552.
I ran both master and 69dce1c552 through pyperformance with PGO:
➜ ~ python3.8 -m perf compare_to master.json 69dce1c552.json --table
+-------------------------+---------+-----------------------------+
| Benchmark               | master  | 69dce1c552                  |
+=========================+=========+=============================+
| 2to3                    | 432 ms  | 426 ms: 1.02x faster (-2%)  |
+-------------------------+---------+-----------------------------+
| chaos                   | 157 ms  | 155 ms: 1.01x faster (-1%)  |
+-------------------------+---------+-----------------------------+
| crypto_pyaes            | 154 ms  | 153 ms: 1.00x faster (-0%)  |
+-------------------------+---------+-----------------------------+
| dulwich_log             | 123 ms  | 124 ms: 1.00x slower (+0%)  |
+-------------------------+---------+-----------------------------+
| fannkuch                | 603 ms  | 600 ms: 1.01x faster (-1%)  |
+-------------------------+---------+-----------------------------+
| float                   | 153 ms  | 154 ms: 1.01x slower (+1%)  |
+-------------------------+---------+-----------------------------+
| go                      | 323 ms  | 326 ms: 1.01x slower (+1%)  |
+-------------------------+---------+-----------------------------+
| hexiom                  | 13.6 ms | 13.5 ms: 1.01x faster (-1%) |
+-------------------------+---------+-----------------------------+
| json_dumps              | 18.1 ms | 17.9 ms: 1.01x faster (-1%) |
+-------------------------+---------+-----------------------------+
| logging_format          | 13.2 us | 13.8 us: 1.05x slower (+5%) |
+-------------------------+---------+-----------------------------+
| logging_silent          | 266 ns  | 280 ns: 1.05x slower (+5%)  |
+-------------------------+---------+-----------------------------+
| logging_simple          | 12.4 us | 13.1 us: 1.06x slower (+6%) |
+-------------------------+---------+-----------------------------+
| meteor_contest          | 145 ms  | 132 ms: 1.10x faster (-9%)  |
+-------------------------+---------+-----------------------------+
| nbody                   | 179 ms  | 172 ms: 1.04x faster (-4%)  |
+-------------------------+---------+-----------------------------+
| nqueens                 | 138 ms  | 134 ms: 1.03x faster (-3%)  |
+-------------------------+---------+-----------------------------+
| pathlib                 | 56.4 ms | 55.6 ms: 1.01x faster (-1%) |
+-------------------------+---------+-----------------------------+
| pickle                  | 15.0 us | 15.4 us: 1.03x slower (+3%) |
+-------------------------+---------+-----------------------------+
| pickle_pure_python      | 620 us  | 617 us: 1.01x faster (-1%)  |
+-------------------------+---------+-----------------------------+
| raytrace                | 696 ms  | 691 ms: 1.01x faster (-1%)  |
+-------------------------+---------+-----------------------------+
| regex_compile           | 242 ms  | 243 ms: 1.00x slower (+0%)  |
+-------------------------+---------+-----------------------------+
| scimark_monte_carlo     | 140 ms  | 143 ms: 1.02x slower (+2%)  |
+-------------------------+---------+-----------------------------+
| scimark_sparse_mat_mult | 5.90 ms | 5.94 ms: 1.01x slower (+1%) |
+-------------------------+---------+-----------------------------+
| spectral_norm           | 194 ms  | 196 ms: 1.01x slower (+1%)  |
+-------------------------+---------+-----------------------------+
| sympy_str               | 246 ms  | 245 ms: 1.00x faster (-0%)  |
+-------------------------+---------+-----------------------------+
| telco                   | 8.42 ms | 8.31 ms: 1.01x faster (-1%) |
+-------------------------+---------+-----------------------------+
| unpack_sequence         | 59.2 ns | 59.7 ns: 1.01x slower (+1%) |
+-------------------------+---------+-----------------------------+
| unpickle                | 21.2 us | 21.4 us: 1.01x slower (+1%) |
+-------------------------+---------+-----------------------------+
| unpickle_list           | 5.73 us | 5.81 us: 1.01x slower (+1%) |
+-------------------------+---------+-----------------------------+
| unpickle_pure_python    | 471 us  | 467 us: 1.01x faster (-1%)  |
+-------------------------+---------+-----------------------------+
| xml_etree_iterparse     | 142 ms  | 143 ms: 1.01x slower (+1%)  |
+-------------------------+---------+-----------------------------+
| xml_etree_generate      | 139 ms  | 137 ms: 1.02x faster (-2%)  |
+-------------------------+---------+-----------------------------+
| xml_etree_process       | 109 ms  | 108 ms: 1.01x faster (-1%)  |
+-------------------------+---------+-----------------------------+
Not significant (21): deltablue; django_template; html5lib; json_loads; mako; pickle_dict; pickle_list; pidigits; python_startup; python_startup_no_site; regex_dna; regex_effbot; regex_v8; richards; scimark_fft; scimark_lu; scimark_sor; sympy_expand; sympy_integrate; sympy_sum; xml_etree_parse
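For anyone reading the table, here is a rough sketch of how the "N.NNx faster/slower (±P%)" figures relate to the two timings. The helper name describe_change is hypothetical and this is not necessarily how the perf module computes it internally. It also works from the rounded means shown in the table, so rows where the two values are nearly equal may differ in the last digit.

```python
def describe_change(ref, changed):
    """Describe a timing change; both arguments must use the same unit.

    Hypothetical helper for illustration only, not part of perf/pyperf.
    """
    if changed < ref:
        # Lower time on the changed build: report how many times faster.
        ratio, word = ref / changed, "faster"
    else:
        # Higher (or equal) time on the changed build: report the slowdown.
        ratio, word = changed / ref, "slower"
    # Relative change in time, as a signed percentage of the reference.
    percent = (changed - ref) / ref * 100
    return f"{ratio:.2f}x {word} ({percent:+.0f}%)"


# Values taken from the meteor_contest and logging_simple rows.
meteor = describe_change(145, 132)
logging_simple = describe_change(12.4, 13.1)
print("meteor_contest:", meteor)
print("logging_simple:", logging_simple)
```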
I'd like to look at how the length hint works for range objects; the code path for those looks less than ideal and could use some optimization. Also, BUILD_LIST_PREALLOC operates on the iterator, not the underlying object, so it can't use the much faster _HasLen and PyObject_Length().
I'm going to look at how __length_hint__ could be optimized for iterators, which would make the smaller range cases more efficient.
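To illustrate the distinction from the Python side (a small sketch, not the C code path the patch touches): a range object is sized, but its iterator is not, so only the length hint is available once you hold the iterator. The variable names below are just for illustration.

```python
import operator

r = range(10)
it = iter(r)

# The range object itself is sized, so len() works directly.
assert len(r) == 10

# The range iterator is not sized: len() raises TypeError.
try:
    len(it)
except TypeError:
    has_len = False
else:
    has_len = True

# The iterator does implement __length_hint__, which
# operator.length_hint() consults; the hint shrinks as items are consumed.
hint_before = operator.length_hint(it)
next(it)
hint_after = operator.length_hint(it)

# A generator offers neither __len__ nor __length_hint__,
# so length_hint() falls back to the supplied default.
gen = (x for x in r)
hint_gen = operator.length_hint(gen, 0)

print(has_len, hint_before, hint_after, hint_gen)
```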
meteor_contest uses a lot of list comprehensions, so it should show the impact of the patch.
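For a quicker check than a full pyperformance run, something like the following timeit sketch could be used to compare list comprehensions over small and larger ranges on two builds. The function name time_listcomp and the chosen sizes are arbitrary assumptions; the numbers will be noisier than pyperformance results.

```python
import timeit


def time_listcomp(n, repeat=5, number=2000):
    """Return the best per-call time in seconds for [i for i in range(n)].

    Hypothetical helper; run it under each interpreter build and compare.
    """
    stmt = f"[i for i in range({n})]"
    # Take the minimum over several repeats to reduce scheduling noise.
    best = min(timeit.repeat(stmt, repeat=repeat, number=number))
    return best / number


for n in (4, 16, 256):
    per_call_ns = time_listcomp(n) * 1e9
    print(f"range({n}): {per_call_ns:.0f} ns per comprehension")
```

Run the same script with the master build and the patched build, then compare the per-size timings.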
Date | User | Action | Args
2019-04-09 05:35:20 | anthony shaw | set | recipients: + anthony shaw, ronaldoussoren, ncoghlan, methane, serhiy.storchaka, Aaron Hall, pablogsal
2019-04-09 05:35:20 | anthony shaw | set | messageid: <1554788120.85.0.549423344031.issue36551@roundup.psfhosted.org>
2019-04-09 05:35:20 | anthony shaw | link | issue36551 messages
2019-04-09 05:35:20 | anthony shaw | create |