Message 339699 - Python tracker


Author	anthony shaw
Recipients	Aaron Hall, anthony shaw, methane, ncoghlan, pablogsal, ronaldoussoren, serhiy.storchaka
Date	2019-04-09.05:35:20
Message-id	<1554788120.85.0.549423344031.issue36551@roundup.psfhosted.org>
In-reply-to

Content

I have just optimized some of the code and pushed another change as 69dce1c552.

I ran both master and 69dce1c552 using pyperformance with PGO:

➜  ~ python3.8 -m perf compare_to master.json 69dce1c552.json --table 
+-------------------------+---------+-----------------------------+
| Benchmark               | master  | 69dce1c552                  |
+=========================+=========+=============================+
| 2to3                    | 432 ms  | 426 ms: 1.02x faster (-2%)  |
+-------------------------+---------+-----------------------------+
| chaos                   | 157 ms  | 155 ms: 1.01x faster (-1%)  |
+-------------------------+---------+-----------------------------+
| crypto_pyaes            | 154 ms  | 153 ms: 1.00x faster (-0%)  |
+-------------------------+---------+-----------------------------+
| dulwich_log             | 123 ms  | 124 ms: 1.00x slower (+0%)  |
+-------------------------+---------+-----------------------------+
| fannkuch                | 603 ms  | 600 ms: 1.01x faster (-1%)  |
+-------------------------+---------+-----------------------------+
| float                   | 153 ms  | 154 ms: 1.01x slower (+1%)  |
+-------------------------+---------+-----------------------------+
| go                      | 323 ms  | 326 ms: 1.01x slower (+1%)  |
+-------------------------+---------+-----------------------------+
| hexiom                  | 13.6 ms | 13.5 ms: 1.01x faster (-1%) |
+-------------------------+---------+-----------------------------+
| json_dumps              | 18.1 ms | 17.9 ms: 1.01x faster (-1%) |
+-------------------------+---------+-----------------------------+
| logging_format          | 13.2 us | 13.8 us: 1.05x slower (+5%) |
+-------------------------+---------+-----------------------------+
| logging_silent          | 266 ns  | 280 ns: 1.05x slower (+5%)  |
+-------------------------+---------+-----------------------------+
| logging_simple          | 12.4 us | 13.1 us: 1.06x slower (+6%) |
+-------------------------+---------+-----------------------------+
| meteor_contest          | 145 ms  | 132 ms: 1.10x faster (-9%)  |
+-------------------------+---------+-----------------------------+
| nbody                   | 179 ms  | 172 ms: 1.04x faster (-4%)  |
+-------------------------+---------+-----------------------------+
| nqueens                 | 138 ms  | 134 ms: 1.03x faster (-3%)  |
+-------------------------+---------+-----------------------------+
| pathlib                 | 56.4 ms | 55.6 ms: 1.01x faster (-1%) |
+-------------------------+---------+-----------------------------+
| pickle                  | 15.0 us | 15.4 us: 1.03x slower (+3%) |
+-------------------------+---------+-----------------------------+
| pickle_pure_python      | 620 us  | 617 us: 1.01x faster (-1%)  |
+-------------------------+---------+-----------------------------+
| raytrace                | 696 ms  | 691 ms: 1.01x faster (-1%)  |
+-------------------------+---------+-----------------------------+
| regex_compile           | 242 ms  | 243 ms: 1.00x slower (+0%)  |
+-------------------------+---------+-----------------------------+
| scimark_monte_carlo     | 140 ms  | 143 ms: 1.02x slower (+2%)  |
+-------------------------+---------+-----------------------------+
| scimark_sparse_mat_mult | 5.90 ms | 5.94 ms: 1.01x slower (+1%) |
+-------------------------+---------+-----------------------------+
| spectral_norm           | 194 ms  | 196 ms: 1.01x slower (+1%)  |
+-------------------------+---------+-----------------------------+
| sympy_str               | 246 ms  | 245 ms: 1.00x faster (-0%)  |
+-------------------------+---------+-----------------------------+
| telco                   | 8.42 ms | 8.31 ms: 1.01x faster (-1%) |
+-------------------------+---------+-----------------------------+
| unpack_sequence         | 59.2 ns | 59.7 ns: 1.01x slower (+1%) |
+-------------------------+---------+-----------------------------+
| unpickle                | 21.2 us | 21.4 us: 1.01x slower (+1%) |
+-------------------------+---------+-----------------------------+
| unpickle_list           | 5.73 us | 5.81 us: 1.01x slower (+1%) |
+-------------------------+---------+-----------------------------+
| unpickle_pure_python    | 471 us  | 467 us: 1.01x faster (-1%)  |
+-------------------------+---------+-----------------------------+
| xml_etree_iterparse     | 142 ms  | 143 ms: 1.01x slower (+1%)  |
+-------------------------+---------+-----------------------------+
| xml_etree_generate      | 139 ms  | 137 ms: 1.02x faster (-2%)  |
+-------------------------+---------+-----------------------------+
| xml_etree_process       | 109 ms  | 108 ms: 1.01x faster (-1%)  |
+-------------------------+---------+-----------------------------+

Not significant (21): deltablue; django_template; html5lib; json_loads; mako; pickle_dict; pickle_list; pidigits; python_startup; python_startup_no_site; regex_dna; regex_effbot; regex_v8; richards; scimark_fft; scimark_lu; scimark_sor; sympy_expand; sympy_integrate; sympy_sum; xml_etree_parse

I'd like to look at the way LengthHint works for range objects; the path for those does not look ideal and could use some optimization. Also, BUILD_LIST_PREALLOC receives the iterator, not the underlying object, so it can't use the much faster _HasLen and PyObject_Length().
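
To illustrate the iterator-versus-object distinction from pure Python (a rough sketch only; `describe` is a helper name made up for this example, and it says nothing about how the patch itself is written): a range object supports len(), while its iterator only exposes a length hint.

```python
import operator


def describe(obj):
    """Return (has_len, hint): whether len() works, and the length hint."""
    try:
        len(obj)
        has_len = True
    except TypeError:
        has_len = False
    # operator.length_hint() tries len() first, then __length_hint__,
    # and falls back to the default (0 here) if neither is available.
    return has_len, operator.length_hint(obj, 0)


r = range(10)
it = iter(r)

range_info = describe(r)    # the range object itself has a real length
iter_info = describe(it)    # the iterator has no len(), only a hint
next(it)                    # consume one item
after_next = describe(it)   # the hint tracks the remaining items

print(range_info, iter_info, after_next)
```

The iterator still reports a usable size, but only through the slower hint lookup rather than a direct length call.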

I'm going to look at how __length_hint__ could be optimized for iterators, which would make the smaller range cases more efficient.
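
For readers unfamiliar with the protocol, here is a minimal sketch of an iterator that implements __length_hint__. The `Countdown` class is hypothetical and exists only for illustration; it is not part of the patch or of CPython.

```python
import operator


class Countdown:
    """Hypothetical iterator that reports how many items remain."""

    def __init__(self, n):
        self.remaining = n

    def __iter__(self):
        return self

    def __next__(self):
        if self.remaining <= 0:
            raise StopIteration
        self.remaining -= 1
        return self.remaining

    def __length_hint__(self):
        # An estimate of the items left; consumers may use it to
        # presize a container before iterating.
        return self.remaining


c = Countdown(3)
hint_before = operator.length_hint(c)
items = list(c)
hint_after = operator.length_hint(c)

print(hint_before, items, hint_after)
```

The hint is advisory: a consumer can use it to allocate space up front, but must still cope with the iterator yielding more or fewer items than reported.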

meteor_contest uses a lot of list comprehensions, so it should show the impact of the patch.

History
Date	User	Action	Args
2019-04-09 05:35:20	anthony shaw	set	recipients: + anthony shaw, ronaldoussoren, ncoghlan, methane, serhiy.storchaka, Aaron Hall, pablogsal
2019-04-09 05:35:20	anthony shaw	set	messageid: <1554788120.85.0.549423344031.issue36551@roundup.psfhosted.org>
2019-04-09 05:35:20	anthony shaw	link	issue36551 messages
2019-04-09 05:35:20	anthony shaw	create