Issue 31227: regrtest: reseed random with the same seed before running a test file

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/75410

classification

Title:	regrtest: reseed random with the same seed before running a test file
Type:		Stage:	resolved
Components:	Tests	Versions:	Python 3.7, Python 3.6, Python 2.7

process

Status:	closed	Resolution:	rejected
Dependencies:		Superseder:
Assigned To:		Nosy List:	ezio.melotti, mark.dickinson, michael.foord, pitrou, rhettinger, serhiy.storchaka, vstinner
Priority:	normal	Keywords:

Created on 2017-08-17 15:32 by vstinner, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Pull Requests
URL	Status	Linked	Edit
PR 3059	closed	vstinner, 2017-08-17 15:32

Messages (7)
msg300438 - (view)	Author: STINNER Victor (vstinner) *	Date: 2017-08-17 15:32
The attached PR changes regrtest to reseed the random module's RNG before each test file. It also uses more entropy for the seed: 2**32 (32 bits) rather than 10_000_000 (about 24 bits). The change should avoid random failures of test_tools when hunting reference leaks: see bpo-31174. Maybe it will also reduce false positives when hunting memory leaks, like bpo-31217.
msg300440 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2017-08-17 15:35
If refleaks depend on the random seed, perhaps it's a bug worth fixing?
msg300441 - (view)	Author: STINNER Victor (vstinner) *	Date: 2017-08-17 15:46
Antoine Pitrou: "If refleaks depend on the random seed, perhaps it's a bug worth fixing?"

I propose to change regrtest behaviour even when -R is not used, to make regrtest more deterministic. Currently, when you run "./python -m test -r test_xxx test_yyy", it's hard to guess the state of the RNG in test_yyy: it depends on how many bytes were consumed by test_xxx. For example, if test_xxx is run on a buildbot but skipped when I run it locally, we get a different behaviour.

I would prefer that test_yyy behaves the same when run with "./python -m test -r --randseed=5 test_xxx test_yyy" (with test_xxx) and with "./python -m test -r --randseed=5 test_yyy" (without test_xxx).

With my change, "./python -m test -r --randseed=5 test_yyy test_yyy" (sequential) and "./python -m test -r --randseed=5 -j2 test_yyy test_yyy" (parallel) both run test_yyy twice with the RNG in the same state.

The proposed change is part of a more global project to reduce the side effects of tests, to make tests more reproducible and more "isolated".
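To illustrate the idea, here is a minimal sketch that does not use regrtest itself. The helper run_with_reseed and the two toy test functions are hypothetical names made up for this example; the only real API used is random.seed(). It shows that when the RNG is reseeded with the same value before each test, the values seen by test_yyy no longer depend on whether test_xxx ran first.

```python
import random


def run_with_reseed(tests, seed):
    """Run each (name, callable) pair, reseeding the global RNG before each one.

    Hypothetical helper for illustration only; this is not regrtest code.
    """
    results = {}
    for name, test in tests:
        random.seed(seed)  # same seed before every test
        results[name] = test()
    return results


def test_xxx():
    # Consumes some random numbers, as an earlier test file might.
    return [random.random() for _ in range(3)]


def test_yyy():
    # The result depends only on the RNG state at entry.
    return random.randint(0, 10**9)


with_xxx = run_with_reseed([("test_xxx", test_xxx), ("test_yyy", test_yyy)], seed=5)
without_xxx = run_with_reseed([("test_yyy", test_yyy)], seed=5)

# test_yyy sees the same RNG state whether or not test_xxx ran before it.
assert with_xxx["test_yyy"] == without_xxx["test_yyy"]
```

Without the random.seed() call inside the loop, the value returned by test_yyy would depend on how many random numbers test_xxx consumed.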
msg300444 - (view)	Author: STINNER Victor (vstinner) *	Date: 2017-08-17 16:16
I'm not sure if we should use the same RNG seed for all tests, or create one seed per test when the option -r is used. For example, I expect that "./python -m test -r -F test_tools" will catch a random bug which only occurs for a specific random seed.
msg300446 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2017-08-17 17:35
The PRNG is not the only source of randomness in the tests. The exact behavior depends on the order of files in directories, on string hash randomization, on address randomization, and on many other things out of our control. Couldn't reseeding the PRNG just add a false promise? Success in making tests deterministic can also narrow the coverage of the testing: some branches that lead to failures may never be executed. Our goal is not just to make tests always succeed, but to catch and fix even pretty rare errors.
msg300449 - (view)	Author: STINNER Victor (vstinner) *	Date: 2017-08-17 17:51
> The exact behavior depends on the order of files in directories, on string hashes randomization, on address randomization, and on many other things out of our control.

For hash randomization, maybe we need to generate a PYTHONHASHSEED, as the tox test runner does. For the filesystem: right, it's not possible to get 100% reproducible tests, but IMHO it's worth it to make them more reliable.

> Couldn't reseeding the PRNG just add a false promise?

I'm trying to fix random failures on the Refleaks buildbots, not to promise anything :-) To be honest, at this point, I don't know if it would be enough since I'm unable to reproduce the bugs...

> The success in making tests deterministic can also narrow down the coverage of the testing. Some branches that lead to failures can be never executed. Our target not just making tests always success, but catch and fix even pretty rare errors.

I know that it's a tough question, and that's why I opened this issue, to discuss it :-) But I see more and more projects working toward more reproducible software and tests:

* https://reproducible-builds.org/
* systemd, a big project to get more "stateless" computers, or said differently: to better isolate services
* containers, which also aim to isolate services ("stateless" containers)
* etc.

Other test runners, like tox, also make efforts to get reproducible tests, such as setting PYTHONHASHSEED.
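As a rough sketch of the PYTHONHASHSEED point, the snippet below starts child interpreters with a fixed hash seed and checks that a string hashes to the same value in each. The helper hash_in_child is a hypothetical name for this example, and this is not how regrtest or tox implement it; it only relies on the documented PYTHONHASHSEED environment variable and on subprocess.run() (Python 3.7+ for capture_output).

```python
import os
import subprocess
import sys


def hash_in_child(hash_seed, text="regrtest"):
    """Return hash(text) as computed by a child interpreter.

    The child is started with PYTHONHASHSEED set to hash_seed.
    Hypothetical helper, for illustration only.
    """
    env = dict(os.environ, PYTHONHASHSEED=str(hash_seed))
    proc = subprocess.run(
        [sys.executable, "-c", f"print(hash({text!r}))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(proc.stdout)


# With a fixed PYTHONHASHSEED, separate processes agree on string hashes.
assert hash_in_child(123) == hash_in_child(123)
```

This only pins down string hash randomization; directory ordering and address randomization remain outside the test runner's control.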
msg304888 - (view)	Author: STINNER Victor (vstinner) *	Date: 2017-10-24 09:27
I didn't get a strong +1 on the issue and I'm not convinced myself by my approach. Moreover, Refleaks buildbots now seem to be reliable thanks to other fixes. For all these reasons, I close the issue.

History
Date	User	Action	Args
2022-04-11 14:58:50	admin	set	github: 75410
2017-10-24 09:27:23	vstinner	set	status: open -> closed resolution: rejected stage: resolved
2017-10-24 09:27:09	vstinner	set	messages: + msg304888
2017-08-17 17:51:42	vstinner	set	messages: + msg300449
2017-08-17 17:35:26	serhiy.storchaka	set	nosy: + rhettinger, mark.dickinson, ezio.melotti, michael.foord messages: + msg300446
2017-08-17 16:16:26	vstinner	set	messages: + msg300444
2017-08-17 15:46:27	vstinner	set	messages: + msg300441
2017-08-17 15:35:40	pitrou	set	messages: + msg300440
2017-08-17 15:33:59	vstinner	set	nosy: + pitrou, serhiy.storchaka
2017-08-17 15:32:13	vstinner	set	pull_requests: + pull_request3159
2017-08-17 15:32:04	vstinner	create