Message 347649 - Python tracker


Author	nascheme
Recipients	gregory.p.smith, methane, nascheme, pablogsal, tianon
Date	2019-07-11.03:36:06
Message-id	<1562816167.52.0.240635961399.issue36044@roundup.psfhosted.org>

Content

> Decreasing the total wall time for a default --enable-optimizations build would 
> be a good thing for everyone, provided the resulting interpreter remains 
> "effectively similar" in speed.  If you somehow manage to find something that
> actually speeds up the resulting interpreter, amazing!

I spent quite a lot of time making different PGO builds and comparing them with pyperformance.  The current PGO task is *really* slow.  Just running the PROFILE_TASK takes 24 minutes on my decently fast PC.

Using this set of tests seems to work pretty well:

PROFILE_TASK=-m test.regrtest --pgo \
        test_collections \
        test_dataclasses \
        test_difflib \
        test_embed \
        test_float \
        test_functools \
        test_generators \
        test_int \
        test_itertools \
        test_json \
        test_logging \
        test_long \
        test_ordered_dict \
        test_pickle \
        test_pprint \
        test_re \
        test_set \
        test_statistics \
        test_struct \
        test_tabnanny \
        test_xml_etree
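
For anyone who wants to try the list without editing the Makefile, here is a minimal sketch that builds the same PROFILE_TASK value from a Python list and prints a make invocation. The helper names (`profile_task`, `make_command`) are made up for illustration, and passing PROFILE_TASK on the make command line is an assumption about how you run the build; only the regrtest arguments and the test names come from the list above.

```python
import shlex

# Test modules copied from the PROFILE_TASK list above.
PGO_TESTS = [
    "test_collections", "test_dataclasses", "test_difflib", "test_embed",
    "test_float", "test_functools", "test_generators", "test_int",
    "test_itertools", "test_json", "test_logging", "test_long",
    "test_ordered_dict", "test_pickle", "test_pprint", "test_re",
    "test_set", "test_statistics", "test_struct", "test_tabnanny",
    "test_xml_etree",
]


def profile_task(tests):
    """Return a PROFILE_TASK value: regrtest in --pgo mode, limited to `tests`."""
    return " ".join(["-m", "test.regrtest", "--pgo", *tests])


def make_command(tests):
    """Return an argv list that overrides PROFILE_TASK for one make run.

    Hypothetical helper: it only assembles the command, it does not run it.
    """
    return ["make", f"PROFILE_TASK={profile_task(tests)}"]


# shlex.join (Python 3.8+) quotes the value so it can be pasted into a shell.
print(shlex.join(make_command(PGO_TESTS)))
```

Keeping the tests in a list makes it easy to add or drop one module and rebuild to see what changes.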

Instead of 24 minutes, the above task takes one and a half minutes.  pyperformance results seem largely unchanged; the comparison is below.  Tuning the test list to get the best pyperformance result is a bit dangerous, and perhaps running the whole test suite is safer (i.e. then we are not optimizing for specific benchmarks).  I didn't tweak the list too much.  I added test_int, test_long, test_struct and test_itertools as a result of my pyperformance runs.  It's not too surprising that those are important modules.

I think the set of tests above should do a pretty good job of covering the hot code paths in most Python programs.  So, maybe it is good enough given the massive speedup in build time.
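
To put a number on the build-time difference, and to repeat the measurement on another machine, something like the following could be used. It is a rough sketch, not the script behind the numbers above: `time_command` and `speedup` are hypothetical helpers, and the command to time is whatever you pass on the command line (for example a freshly built ./python followed by the PROFILE_TASK arguments).

```python
import subprocess
import sys
import time


def time_command(cmd):
    """Run `cmd` (an argv list) and return (elapsed_seconds, returncode)."""
    start = time.perf_counter()
    proc = subprocess.run(cmd, check=False)
    return time.perf_counter() - start, proc.returncode


def speedup(old_minutes, new_minutes):
    """How many times shorter the new profiling run is."""
    return old_minutes / new_minutes


if __name__ == "__main__":
    # Figures reported in this message: 24 minutes for the full task,
    # one and a half minutes for the short list.
    print(f"profiling step: {speedup(24, 1.5):.0f}x shorter")

    # Optionally time a command given as arguments to this script.
    if sys.argv[1:]:
        elapsed, returncode = time_command(sys.argv[1:])
        print(f"{elapsed:.1f} s, exit status {returncode}")
```

Note that this only covers the profiling step; the two compile passes of a PGO build take the same time either way.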



+-------------------------+----------+------------------------------+
| Benchmark               | task-all | task-short                   |
+=========================+==========+==============================+
| 2to3                    | 311 ms   | 315 ms: 1.01x slower (+1%)   |
+-------------------------+----------+------------------------------+
| chaos                   | 111 ms   | 108 ms: 1.02x faster (-2%)   |
+-------------------------+----------+------------------------------+
| crypto_pyaes            | 114 ms   | 112 ms: 1.01x faster (-1%)   |
+-------------------------+----------+------------------------------+
| dulwich_log             | 78.0 ms  | 78.7 ms: 1.01x slower (+1%)  |
+-------------------------+----------+------------------------------+
| fannkuch                | 470 ms   | 452 ms: 1.04x faster (-4%)   |
+-------------------------+----------+------------------------------+
| float                   | 118 ms   | 117 ms: 1.01x faster (-1%)   |
+-------------------------+----------+------------------------------+
| go                      | 253 ms   | 255 ms: 1.01x slower (+1%)   |
+-------------------------+----------+------------------------------+
| json_dumps              | 12.5 ms  | 11.8 ms: 1.06x faster (-6%)  |
+-------------------------+----------+------------------------------+
| json_loads              | 26.3 us  | 25.4 us: 1.04x faster (-3%)  |
+-------------------------+----------+------------------------------+
| logging_format          | 9.53 us  | 9.66 us: 1.01x slower (+1%)  |
+-------------------------+----------+------------------------------+
| logging_silent          | 198 ns   | 196 ns: 1.01x faster (-1%)   |
+-------------------------+----------+------------------------------+
| mako                    | 15.2 ms  | 15.8 ms: 1.04x slower (+4%)  |
+-------------------------+----------+------------------------------+
| meteor_contest          | 98.2 ms  | 96.8 ms: 1.01x faster (-1%)  |
+-------------------------+----------+------------------------------+
| nbody                   | 135 ms   | 133 ms: 1.01x faster (-1%)   |
+-------------------------+----------+------------------------------+
| nqueens                 | 97.2 ms  | 96.6 ms: 1.01x faster (-1%)  |
+-------------------------+----------+------------------------------+
| pathlib                 | 19.4 ms  | 19.7 ms: 1.02x slower (+2%)  |
+-------------------------+----------+------------------------------+
| pickle                  | 8.10 us  | 9.07 us: 1.12x slower (+12%) |
+-------------------------+----------+------------------------------+
| pickle_dict             | 23.1 us  | 18.6 us: 1.25x faster (-20%) |
+-------------------------+----------+------------------------------+
| pickle_list             | 3.64 us  | 2.81 us: 1.30x faster (-23%) |
+-------------------------+----------+------------------------------+
| pickle_pure_python      | 470 us   | 460 us: 1.02x faster (-2%)   |
+-------------------------+----------+------------------------------+
| pidigits                | 169 ms   | 173 ms: 1.02x slower (+2%)   |
+-------------------------+----------+------------------------------+
| python_startup          | 7.94 ms  | 8.02 ms: 1.01x slower (+1%)  |
+-------------------------+----------+------------------------------+
| python_startup_no_site  | 5.44 ms  | 5.49 ms: 1.01x slower (+1%)  |
+-------------------------+----------+------------------------------+
| raytrace                | 495 ms   | 490 ms: 1.01x faster (-1%)   |
+-------------------------+----------+------------------------------+
| regex_dna               | 172 ms   | 179 ms: 1.04x slower (+4%)   |
+-------------------------+----------+------------------------------+
| regex_effbot            | 2.95 ms  | 2.85 ms: 1.04x faster (-3%)  |
+-------------------------+----------+------------------------------+
| regex_v8                | 20.7 ms  | 21.5 ms: 1.04x slower (+4%)  |
+-------------------------+----------+------------------------------+
| richards                | 68.9 ms  | 69.8 ms: 1.01x slower (+1%)  |
+-------------------------+----------+------------------------------+
| scimark_sparse_mat_mult | 4.57 ms  | 4.29 ms: 1.07x faster (-6%)  |
+-------------------------+----------+------------------------------+
| spectral_norm           | 134 ms   | 133 ms: 1.01x faster (-1%)   |
+-------------------------+----------+------------------------------+
| sqlalchemy_declarative  | 161 ms   | 163 ms: 1.01x slower (+1%)   |
+-------------------------+----------+------------------------------+
| sqlalchemy_imperative   | 30.6 ms  | 31.0 ms: 1.01x slower (+1%)  |
+-------------------------+----------+------------------------------+
| sqlite_synth            | 2.90 us  | 2.95 us: 1.02x slower (+2%)  |
+-------------------------+----------+------------------------------+
| sympy_expand            | 422 ms   | 418 ms: 1.01x faster (-1%)   |
+-------------------------+----------+------------------------------+
| sympy_integrate         | 19.0 ms  | 19.2 ms: 1.01x slower (+1%)  |
+-------------------------+----------+------------------------------+
| sympy_sum               | 89.6 ms  | 91.7 ms: 1.02x slower (+2%)  |
+-------------------------+----------+------------------------------+
| telco                   | 6.06 ms  | 6.28 ms: 1.04x slower (+4%)  |
+-------------------------+----------+------------------------------+
| tornado_http            | 178 ms   | 181 ms: 1.02x slower (+2%)   |
+-------------------------+----------+------------------------------+
| unpickle_list           | 3.97 us  | 3.78 us: 1.05x faster (-5%)  |
+-------------------------+----------+------------------------------+
| unpickle_pure_python    | 326 us   | 324 us: 1.00x faster (-0%)   |
+-------------------------+----------+------------------------------+
| xml_etree_generate      | 90.6 ms  | 91.0 ms: 1.00x slower (+0%)  |
+-------------------------+----------+------------------------------+
| xml_etree_process       | 72.0 ms  | 71.4 ms: 1.01x faster (-1%)  |
+-------------------------+----------+------------------------------+

Not significant (15): deltablue; django_template; hexiom; html5lib; logging_simple; regex_compile; scimark_fft; scimark_lu; scimark_monte_carlo; scimark_sor; sympy_str; unpack_sequence; unpickle; xml_etree_parse; xml_etree_iterparse
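
One way to check the "largely unchanged" impression is to take the geometric mean of the task-short / task-all ratios. The sketch below does that for the 42 rows in the table; it leaves out the 15 benchmarks reported as not significant, so it somewhat overstates the overall difference. This is just arithmetic on the numbers above, not something pyperformance printed for this comparison.

```python
import math

# (benchmark, task-all, task-short), copied from the table above.
# Each row keeps the table's own unit (ms, us or ns); only the ratio is used.
ROWS = [
    ("2to3", 311, 315),
    ("chaos", 111, 108),
    ("crypto_pyaes", 114, 112),
    ("dulwich_log", 78.0, 78.7),
    ("fannkuch", 470, 452),
    ("float", 118, 117),
    ("go", 253, 255),
    ("json_dumps", 12.5, 11.8),
    ("json_loads", 26.3, 25.4),
    ("logging_format", 9.53, 9.66),
    ("logging_silent", 198, 196),
    ("mako", 15.2, 15.8),
    ("meteor_contest", 98.2, 96.8),
    ("nbody", 135, 133),
    ("nqueens", 97.2, 96.6),
    ("pathlib", 19.4, 19.7),
    ("pickle", 8.10, 9.07),
    ("pickle_dict", 23.1, 18.6),
    ("pickle_list", 3.64, 2.81),
    ("pickle_pure_python", 470, 460),
    ("pidigits", 169, 173),
    ("python_startup", 7.94, 8.02),
    ("python_startup_no_site", 5.44, 5.49),
    ("raytrace", 495, 490),
    ("regex_dna", 172, 179),
    ("regex_effbot", 2.95, 2.85),
    ("regex_v8", 20.7, 21.5),
    ("richards", 68.9, 69.8),
    ("scimark_sparse_mat_mult", 4.57, 4.29),
    ("spectral_norm", 134, 133),
    ("sqlalchemy_declarative", 161, 163),
    ("sqlalchemy_imperative", 30.6, 31.0),
    ("sqlite_synth", 2.90, 2.95),
    ("sympy_expand", 422, 418),
    ("sympy_integrate", 19.0, 19.2),
    ("sympy_sum", 89.6, 91.7),
    ("telco", 6.06, 6.28),
    ("tornado_http", 178, 181),
    ("unpickle_list", 3.97, 3.78),
    ("unpickle_pure_python", 326, 324),
    ("xml_etree_generate", 90.6, 91.0),
    ("xml_etree_process", 72.0, 71.4),
]


def ratio(row):
    """task-short time divided by task-all time; above 1.0 means slower."""
    _, task_all, task_short = row
    return task_short / task_all


def geometric_mean(values):
    """Geometric mean computed through logs to avoid a huge running product."""
    values = list(values)
    return math.exp(math.fsum(math.log(v) for v in values) / len(values))


gmean = geometric_mean(ratio(row) for row in ROWS)
slowest = max(ROWS, key=ratio)
fastest = min(ROWS, key=ratio)

print(f"rows: {len(ROWS)}")
print(f"geometric mean of task-short / task-all: {gmean:.3f}")
print(f"largest slowdown: {slowest[0]} ({ratio(slowest):.2f}x)")
print(f"largest speedup:  {fastest[0]} ({ratio(fastest):.2f}x)")
```

The geometric mean is used rather than the arithmetic mean because the values are ratios, so a 2x slowdown and a 2x speedup cancel out.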
