Author gregory.p.smith
Recipients Mark.Shannon, eric.snow, gregory.p.smith, kumaraditya, lys.nikolaou, pablogsal, terry.reedy, tim.peters, vstinner, xtreak
Date 2022-01-30.07:49:31
re: slow tests in the first half of the list.  The same total amount of time is going to be spent regardless.  In our test suite on a modern fast 16-thread system, all but 10 tests complete in parallel within the first 30 seconds.  The remaining ~10 take 10x+ that in wall time, dragging on for several more minutes.

So the most latency you will shave off on a modern system is probably <30 seconds.  On a slower system the numbers are bigger, but the proportion saved stays about the same.  CI systems are not workstations.  On a -j1 or -j2 system I doubt it will make a meaningful difference at all.

Picture test execution as a utilization graph:

|                       tttt
|                           ttt
|                              tttttttttt

The total area under that curve is going to remain the same no matter what, so long as we execute everything.  Reordering the tests can pull in the final long tail a bit by pushing out the top layer.  You move closer to an optimal rectangle, but you're still limited by the area.  **The fewer CPU cores of -jN parallelism you have, the less difference any reordering change makes.**
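
To make the area argument concrete, here's a throwaway simulation (not regrtest code; all durations are invented) of greedy scheduling onto -jN workers.  The total CPU time is fixed; only the ordering changes, and the benefit of longest-first shrinks as -jN shrinks:

```python
# Minimal sketch, assuming a pool of -jN workers that each pull the next test
# from a shared queue as soon as they go idle.  Durations are made up.
import heapq
import random

def makespan(durations, workers):
    """Wall time for greedy scheduling: idle worker takes the next test."""
    finish_times = [0.0] * workers        # when each worker becomes free
    heapq.heapify(finish_times)
    for d in durations:
        start = heapq.heappop(finish_times)    # earliest-free worker
        heapq.heappush(finish_times, start + d)
    return max(finish_times)

random.seed(0)
tests = [random.uniform(1, 10) for _ in range(390)] + [300.0] * 10  # 10 slow tests
total_cpu = sum(tests)                    # the "area under the curve": fixed

for j in (2, 16):
    asis    = makespan(tests, j)                        # slow tests run last
    longest = makespan(sorted(tests, reverse=True), j)  # slow tests run first
    print(f"-j{j}: as-is {asis:7.1f}s  longest-first {longest:7.1f}s  "
          f"area/j lower bound {total_cpu / j:7.1f}s")
```

At -j16 the longest-first ordering lands near the area/j lower bound; at -j2 both orderings are already pinned to it, so reordering buys essentially nothing.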

What actual parallelism do our GitHub CI systems offer?

The fundamental problem is that we do a LOT in our test suite and have no concept of what depends on what and thus _needs_ to be run.  So we run it all.  For specialized tests like test_peg_generator and test_tools it should be easy to determine from a list of modified files if those tests are relevant.

That gets a lot more complicated to accurately express for things like test_multiprocessing and test_concurrent_futures.
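
For the easy cases, something as crude as a path-pattern map would do.  This is a purely hypothetical sketch (the RELEVANCE table and tests_for_changes() helper do not exist anywhere, and nothing like this lives in test.regrtest today); it deliberately punts on everything that isn't a specialized tool test:

```python
import fnmatch

# Hypothetical: map changed-file patterns to the specialized tests that
# obviously cover them.
RELEVANCE = {
    "Tools/peg_generator/*": {"test_peg_generator"},
    "Grammar/*":             {"test_peg_generator"},
    "Tools/*":               {"test_tools"},
}

def tests_for_changes(changed_files):
    """Return the specialized tests worth running for this change, or None
    meaning "can't tell, run everything" (e.g. anything under Lib/)."""
    selected = set()
    for path in changed_files:
        matched = False
        for pattern, tests in RELEVANCE.items():
            if fnmatch.fnmatch(path, pattern):
                selected |= tests
                matched = True
        if not matched:
            return None   # an unmapped file changed: no safe way to skip tests

    return selected

print(tests_for_changes(["Tools/scripts/summarize_stats.py"]))  # {'test_tools'}
print(tests_for_changes(["Lib/multiprocessing/pool.py"]))       # None
```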

test_peg_generator and test_tools are also *packages of tests* that should themselves be parallelized individually instead of being treated as a single serialized unit.

At work we even shard test methods within TestCase classes so that big ones can be split across test executor tasks: See the _setup_sharding() function in absltest here:

In the absence of implementing an approach like that within test.regrtest (sharding at a more granular level, which would let us approach the golden rectangle of optimal parallel test latency), we're left with manually splitting long-running test modules/packages into smaller units to achieve a similar effect.
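
For reference, here's a rough sketch of what method-level sharding can look like in plain unittest, loosely in the spirit of absltest's _setup_sharding() (which, as I recall, honors the Bazel-style TEST_TOTAL_SHARDS / TEST_SHARD_INDEX environment variables).  Illustrative only, not something regrtest does:

```python
import os
import unittest

def shard_suite(suite, total_shards, shard_index):
    """Keep every total_shards-th test so each shard runs a disjoint slice."""
    flat = []
    def flatten(s):
        for item in s:
            if isinstance(item, unittest.TestSuite):
                flatten(item)
            else:
                flat.append(item)
    flatten(suite)
    picked = [t for i, t in enumerate(flat) if i % total_shards == shard_index]
    return unittest.TestSuite(picked)

# A test module opts in via the standard load_tests protocol; the runner then
# launches the module total_shards times with a different TEST_SHARD_INDEX each.
def load_tests(loader, standard_tests, pattern):
    total = int(os.environ.get("TEST_TOTAL_SHARDS", "1"))
    index = int(os.environ.get("TEST_SHARD_INDEX", "0"))
    return shard_suite(standard_tests, total, index)
```

A runner would start e.g. `TEST_TOTAL_SHARDS=4 TEST_SHARD_INDEX=2 python -m unittest test_something`, and each of the four invocations executes a quarter of the test methods as an independent executor task.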