This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: test_peg_generator takes 8 minutes on Windows
Type: Stage: commit review
Components: Tests Versions: Python 3.11
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: Mark.Shannon, eric.snow, gregory.p.smith, kumaraditya, lys.nikolaou, miss-islington, pablogsal, terry.reedy, tim.peters, vstinner, xtreak
Priority: normal Keywords: patch

Created on 2022-01-25 19:06 by vstinner, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 31017 closed kumaraditya, 2022-01-30 08:03
PR 31015 merged gregory.p.smith, 2022-01-30 08:43
PR 31089 merged gregory.p.smith, 2022-02-02 20:22
PR 31093 merged miss-islington, 2022-02-03 04:03
Messages (19)
msg411662 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-01-25 19:06
test_peg_generator takes between 5 and 16 minutes on the Windows CI run on Python pull requests.

Example of timings on my PR https://github.com/python/cpython/pull/30890

* GitHub Action win32: 8 min 16 sec
* GitHub Action win64: 16 min 38 sec
* Azure Pipelines win32: 5 min 30 sec
* Azure Pipelines win64: 8 min 3 sec

Would it be possible to make these tests faster?

Or at least, would it be possible to skip these slow tests on Windows where spawing a process or running a C compiler is slower?

On my Fedora 35 (Linux) laptop, test_peg_generator takes 49.5 seconds. test_peg_generator only takes 736 ms if I disable TestCParser.
msg411672 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2022-01-25 20:33
The test needs to build a lot of C extensions with different parsers, and that compilation is what takes most of the time.

I don't think we should skip these tests by default on Windows, as it gives us valuable information (that the parser features work compiled on windows).

If you have an idea on how to speed the tests, I'm all ears :)
msg411675 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2022-01-25 21:20
As a general thing, I expect people on Windows always run the tests with multiple processes. In which case it would be generally helpful to start the longest-running tests first. As is, because of its late-in-the-alphabet name, "test_peg_generator" gets started late in the process and so typically keeps running for minutes and minutes after all other tests have completed. So just renaming it to  "test_0peg_generator" would cut minutes off the Windows wall clock time ;-)
msg411685 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-01-25 23:14
> In which case it would be generally helpful to start the longest-running tests first.

Currently, most CI run "make buildbottest" which uses -r option of libregrtest: randomize tests order.
msg411743 - (view) Author: Kumar Aditya (kumaraditya) * (Python triager) Date: 2022-01-26 12:29
Would it be possible to skip these tests when no changes to the parser  related code are made to speed up tests?
msg411756 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-01-26 14:32
> Would it be possible to skip these tests when no changes to the parser  related code are made to speed up tests?

Maybe, some CIs could export an environment variable which contains the list of modified files, and then each file would be able to decide if it should be skipped or not. The risk is skipping a test because the change looks "unrelated", whereas it causes the test to fail.

Currently, we only skip tests if no code is changed: if only the documentation is modified.
msg411988 - (view) Author: Kumar Aditya (kumaraditya) * (Python triager) Date: 2022-01-28 12:17
Another solution would be to shard the tests on windows i.e. run tests on two different jobs concurrently, edgedb does this. 
See https://github.com/edgedb/edgedb/actions/runs/1746736086
msg411989 - (view) Author: Mark Shannon (Mark.Shannon) * (Python committer) Date: 2022-01-28 12:34
It's plenty slow on linux as well.

I like the idea of starting the slower tests first.
The long tail of slow tests is annoying when running `make -j12 test`.
msg412016 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2022-01-28 17:00
On Tue, Jan 25, 2022 at 4:14 PM STINNER Victor <report@bugs.python.org> wrote:
> Currently, most CI run "make buildbottest" which uses -r option of libregrtest: randomize tests order.

How hard would it be to first randomize the list and then move the
slow tests up to a random position in the first half (for example) of
the list?
msg412054 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2022-01-29 05:57
Do all of the tests use all of the slowly built extensions, or could the test file be split into multiple files run separately, each faster?
msg412127 - (view) Author: Karthikeyan Singaravelan (xtreak) * (Python committer) Date: 2022-01-30 04:53
See also https://bugs.python.org/issue46576 and https://github.com/python/cpython/pull/31015
msg412131 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2022-01-30 07:49
re: slow tests in the first half of the list.  the same total amount of time is going to be spent regardless.  In our test suite on a modern fast 16 thread system, all but 10 tests are completed in parallel within the first 30 seconds.  The remaining ~10 take 10x+ that wall time more minutes.

So the most latency you will shave off on a modern system is probably <30 seconds.  On a slower system the magnitude of that will remain the same in proportion.  CI systems are not workstations.  On -j1 or -j2 system I doubt it will make a meaningful difference at all.

Picture test execution as a utilization graph:

```
|ttttttttttttttttttttttt
|                       tttt
|                           ttt
|                              tttttttttt
+----------------------------------------
```

The total area under that curve is going to remain the same no matter what so long as we execute everything.  Reordering the tests can pull the final long tail in a bit by pushing out the top layer.  You move more towards an optimal rectangle, but you're still limited by the area.  **The less -jN parallelism you have as CPU cores the less difference any reordering change makes.**

What actual parallelism do our Github CI systems offer?

The fundamental problem is that we do a LOT in our test suite and have no concept of what depends on what and thus _needs_ to be run.  So we run it all.  For specialized tests like test_peg_generator and test_tools it should be easy to determine from a list of modified files if those tests are relevant.

That gets a lot more complicated to accurately express for things like test_multiprocessing and test_concurrent_futures.

test_peg_generator and test_tools are also *packages of tests* that themselves should be parallelized individually instead of considered a single serialized unit.

At work we even shard test methods within TestCase classes so that big ones can be split across test executor tasks: See the _setup_sharding() function in absltest here: https://github.com/abseil/abseil-py/blob/main/absl/testing/absltest.py#L2368

In absence of implementing an approach like that within test.regrtest to shard at a more granular level thus enabling us to approach the golden rectangle of optimal parallel test latency, we're left with manually splitting long running test module/packages up into smaller units to achieve a similar effect.
msg412133 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2022-01-30 08:04
If a decent parallelism CI systems are not available from github (they seem stuck at 2-3 threads per https://docs.github.com/en/actions/using-github-hosted-runners/about-github-hosted-runners), an alternative approach could be to shard across multiple parallel CI tasks. Shard such that each one gets only one of the slow tests.

Unfortunately if each of these were a line item in Github's poor CI UI sharding a single config's tests across 5-10 tasks would be a nightmare to navigate on a PR.  I expect everyone would hate that.

Providing our own runners with decent parallelism could help: https://docs.github.com/en/actions/hosting-your-own-runners/about-self-hosted-runners

But we're always going to be bound by our longest tail test if we don't fix our test parallelism to be more granular.
msg412395 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2022-02-02 20:15
New changeset 164a017e13ee96bd1ea1ae79f5ac9e25fe83994e by Gregory P. Smith in branch 'main':
bpo-46576: bpo-46524: Disable compiler optimization within test_peg_generator. (#31015)
https://github.com/python/cpython/commit/164a017e13ee96bd1ea1ae79f5ac9e25fe83994e
msg412415 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2022-02-03 04:03
New changeset f5ebec4d3e1199ec38b88920cfde8e460e5120dd by Gregory P. Smith in branch '3.10':
[3.10] bpo-46576: bpo-46524: Disable compiler optimization within test_peg_generator. (GH-31015) (GH-31089)
https://github.com/python/cpython/commit/f5ebec4d3e1199ec38b88920cfde8e460e5120dd
msg412417 - (view) Author: miss-islington (miss-islington) Date: 2022-02-03 04:33
New changeset e8258608c28c65680253d0ca6167430e34c2fd87 by Miss Islington (bot) in branch '3.9':
[3.9] [3.10] bpo-46576: bpo-46524: Disable compiler optimization within test_peg_generator. (GH-31015) (GH-31089) (GH-31093)
https://github.com/python/cpython/commit/e8258608c28c65680253d0ca6167430e34c2fd87
msg412427 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2022-02-03 08:35
test_peg_generator is significantly less of the long tail on optimized builds now.  CI extremely noisy performance wise (times vary by 2-3x without any differences anways) so I can't easily judge if this made a notable difference in windows CI latency.
msg412434 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-02-03 10:19
Duration of the "Tests" step of https://github.com/python/cpython/pull/30890 

* GHA win32: 14 min 11 sec (test_peg_generator: 8 min 16 sec)
* GHA win64: 21 min 2 sec (test_peg_generator: 16 min 38 sec)
* Azure win32: 9 min 34 sec (test_peg_generator: 5 min 30 sec)
* Azure win64: 13 min 56 sec (test_peg_generator: 8 min 3 sec)

Duration of the "Tests" step of https://github.com/python/cpython/pull/31096 (which includes Gregory's change):

* GHA win32: 8 min 30 sec (test_peg_generator: 5 min 1 sec)
* GHA win64: 8 min 29 sec (test_peg_generator: 4 min 52 sec)
* Azure win32: 9 min 54 sec (test_peg_generator: 4 min 19 sec)
* Azure win64: 7 min 57 sec (test_peg_generator: 3 min 32 sec)

test_peg_generator is way faster, and the total tests duration is shorter especially the maximum time: 21 min (before) => 10 min (after).
msg412454 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2022-02-03 19:28
Thanks to Kumar for contributing Windows compiler flags side of this (the point of this issue).
History
Date User Action Args
2022-04-11 14:59:55adminsetgithub: 90682
2022-02-03 19:28:48gregory.p.smithsetstatus: open -> closed
resolution: fixed
messages: + msg412454

stage: patch review -> commit review
2022-02-03 10:19:11vstinnersetmessages: + msg412434
2022-02-03 08:35:11gregory.p.smithsetmessages: + msg412427
2022-02-03 04:33:03miss-islingtonsetmessages: + msg412417
2022-02-03 04:03:11miss-islingtonsetnosy: + miss-islington
pull_requests: + pull_request29279
2022-02-03 04:03:08gregory.p.smithsetmessages: + msg412415
2022-02-02 20:22:30gregory.p.smithsetpull_requests: + pull_request29274
2022-02-02 20:15:25gregory.p.smithsetmessages: + msg412395
2022-01-30 08:43:33gregory.p.smithsetpull_requests: + pull_request29197
2022-01-30 08:04:08gregory.p.smithsetmessages: + msg412133
2022-01-30 08:03:31kumaradityasetkeywords: + patch
stage: patch review
pull_requests: + pull_request29196
2022-01-30 07:49:31gregory.p.smithsetnosy: + gregory.p.smith
messages: + msg412131
2022-01-30 04:53:03xtreaksetnosy: + xtreak
messages: + msg412127
2022-01-29 05:57:49terry.reedysetnosy: + terry.reedy
messages: + msg412054
2022-01-28 17:00:20eric.snowsetnosy: + eric.snow
messages: + msg412016
2022-01-28 12:34:44Mark.Shannonsetnosy: + Mark.Shannon
messages: + msg411989
2022-01-28 12:17:28kumaradityasetmessages: + msg411988
2022-01-26 14:32:43vstinnersetmessages: + msg411756
2022-01-26 12:29:11kumaradityasetnosy: + kumaraditya
messages: + msg411743
2022-01-25 23:14:41vstinnersetmessages: + msg411685
2022-01-25 21:20:45tim.peterssetmessages: + msg411675
2022-01-25 20:33:54pablogsalsetmessages: + msg411672
2022-01-25 19:40:54tim.peterssetnosy: + tim.peters
2022-01-25 19:06:03vstinnercreate