regrtest: reseed random with the same seed before running a test file #75410

Closed
vstinner opened this issue Aug 17, 2017 · 7 comments
Labels
3.7 (EOL) end of life tests Tests in the Lib/test dir

Comments

@vstinner
Member

BPO 31227
Nosy @rhettinger, @mdickinson, @pitrou, @vstinner, @ezio-melotti, @voidspace, @serhiy-storchaka
PRs
  • bpo-31227: regrtest reseeds RNG before each test #3059
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = None
    closed_at = <Date 2017-10-24.09:27:23.124>
    created_at = <Date 2017-08-17.15:32:04.721>
    labels = ['3.7', 'tests']
    title = 'regrtest: reseed random with the same seed before running a test file'
    updated_at = <Date 2017-10-24.09:27:23.123>
    user = 'https://github.com/vstinner'

    bugs.python.org fields:

    activity = <Date 2017-10-24.09:27:23.123>
    actor = 'vstinner'
    assignee = 'none'
    closed = True
    closed_date = <Date 2017-10-24.09:27:23.124>
    closer = 'vstinner'
    components = ['Tests']
    creation = <Date 2017-08-17.15:32:04.721>
    creator = 'vstinner'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 31227
    keywords = []
    message_count = 7.0
    messages = ['300438', '300440', '300441', '300444', '300446', '300449', '304888']
    nosy_count = 7.0
    nosy_names = ['rhettinger', 'mark.dickinson', 'pitrou', 'vstinner', 'ezio.melotti', 'michael.foord', 'serhiy.storchaka']
    pr_nums = ['3059']
    priority = 'normal'
    resolution = 'rejected'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue31227'
    versions = ['Python 2.7', 'Python 3.6', 'Python 3.7']

    @vstinner
    Member Author

    The attached PR changes regrtest to reseed the RNG of the random module before each test file. It also uses more entropy for the seed: 2**32 (32 bits) rather than 10_000_000 (under 24 bits).
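As a quick check of the two seed ranges mentioned above, here is a minimal sketch. It is not the code from the PR; `seed_bits` and `pick_seed` are hypothetical helpers introduced only for illustration.

```python
import math
import random

OLD_SEED_RANGE = 10_000_000   # range quoted in the comment above
NEW_SEED_RANGE = 2 ** 32      # range proposed in the comment above


def seed_bits(seed_range):
    """Entropy, in bits, of a seed drawn uniformly from range(seed_range)."""
    return math.log2(seed_range)


def pick_seed(seed_range=NEW_SEED_RANGE):
    """Hypothetical helper: draw a seed uniformly from range(seed_range)."""
    return random.randrange(seed_range)


if __name__ == "__main__":
    print(f"old range: {seed_bits(OLD_SEED_RANGE):.2f} bits")
    print(f"new range: {seed_bits(NEW_SEED_RANGE):.2f} bits")
    print("example seed:", pick_seed())
```

The old range needs a little over 23 bits to represent, so it fits in 24 bits but does not use all of them.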

    The change should avoid random failures of test_tools when hunting reference leaks: see bpo-31174.

    Maybe it will also reduce false positives when hunting memory leaks, like bpo-31217.

    @vstinner vstinner added 3.7 (EOL) end of life tests Tests in the Lib/test dir labels Aug 17, 2017
    @pitrou
    Member

    pitrou commented Aug 17, 2017

    If refleaks depend on the random seed, perhaps it's a bug worth fixing?

    @vstinner
    Member Author

    Antoine Pitrou: "If refleaks depend on the random seed, perhaps it's a bug worth fixing?"

    I propose to change regrtest behaviour even when -R is not used, to make regrtest more deterministic.

    Currently, when you run "./python -m test -r test_xxx test_yyyy", it's hard to guess the state of the RNG in test_yyyy: it depends on how many bytes were consumed by test_xxx. For example, if test_xxx is run on a buildbot but skipped when I run it locally, we get different behaviour.

    I would prefer that test_yyyy behaves the same when run with "./python -m test -r --randseed=5 test_xxx test_yyyy" (with test_xxx) and with "./python -m test -r --randseed=5 test_yyyy" (without test_xxx).

    With my change, "./python -m test -r --randseed=5 test_yyyy test_yyyy" (sequential) and "./python -m test -r --randseed=5 -j2 test_yyyy test_yyyy" (parallel) both run test_yyyy twice with the RNG in the same state.
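The difference between the two behaviours can be sketched with a toy runner. This is only an illustration under simplifying assumptions, not regrtest itself: `run`, `test_xxx` and `test_yyyy` are stand-ins, and "seed once up front" is an assumed model of the current behaviour.

```python
import random


def test_xxx():
    """Toy test that consumes ten values from the shared RNG."""
    return [random.random() for _ in range(10)]


def test_yyyy():
    """Toy test whose result depends on the RNG state it starts from."""
    return random.random()


def run(tests, seed, reseed_each):
    """Run tests in order, optionally reseeding before each one."""
    random.seed(seed)  # assumed model of the current behaviour: seed once
    results = {}
    for test in tests:
        if reseed_each:
            random.seed(seed)  # proposed: same RNG state for every test
        results[test.__name__] = test()
    return results


if __name__ == "__main__":
    for reseed_each in (False, True):
        with_xxx = run([test_xxx, test_yyyy], 5, reseed_each)["test_yyyy"]
        alone = run([test_yyyy], 5, reseed_each)["test_yyyy"]
        print(f"reseed_each={reseed_each}: same result = {with_xxx == alone}")
```

Without reseeding, what test_yyyy sees depends on how much test_xxx consumed; with reseeding, it is the same whether or not test_xxx ran first.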

    The proposed change is part of a broader project to reduce side effects of tests, to make tests more reproducible and more "isolated".

    @vstinner
    Member Author

    I'm not sure if we should use the same RNG seed for all tests, or create one seed per test when the option -r is used.

    For example, I expect that "./python -m test -r -F test_tools" will catch a random bug which only occurs for a specific random seed.
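One way to picture the "one seed per test" alternative is to derive each test's seed from the master seed, the test name and a repetition counter. This is a hypothetical sketch, not something proposed in the PR: `derive_seed` and `reseed_for` are invented names, and hashing with SHA-256 is just one possible derivation.

```python
import hashlib
import random


def derive_seed(master_seed, test_name, iteration=0):
    """Derive a stable 32-bit seed from the master seed, test and iteration."""
    data = f"{master_seed}:{test_name}:{iteration}".encode()
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest[:4], "big")


def reseed_for(test_name, master_seed, per_test, iteration=0):
    """Reseed the global RNG and return the seed that was used."""
    if per_test:
        seed = derive_seed(master_seed, test_name, iteration)
    else:
        seed = master_seed  # same seed for every test and every repetition
    random.seed(seed)
    return seed


if __name__ == "__main__":
    for iteration in range(3):
        print(iteration, reseed_for("test_tools", 5, per_test=True,
                                    iteration=iteration))
```

With a single shared seed, repeated runs of the same test would start from the same RNG state each time; a derived seed keeps runs reproducible from the master seed while still varying between tests and repetitions.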

    @serhiy-storchaka
    Member

    The PRNG is not the only source of randomness in the tests. The exact behavior depends on the order of files in directories, on string hash randomization, on address randomization, and on many other things out of our control. Couldn't reseeding the PRNG just add a false promise? Success in making tests deterministic can also narrow the coverage of the testing. Some branches that lead to failures may never be executed. Our goal is not just to make tests always succeed, but to catch and fix even pretty rare errors.

    @vstinner
    Member Author

    Serhiy Storchaka: "The exact behavior depends on the order of files in directories, on string hashes randomization, on address randomization, and on many other things out of our control."

    For hash randomization, maybe we need to generate a PYTHONHASHSEED, as the tox test runner does.
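A minimal sketch of that idea, assuming the runner launches tests in child processes: generate a value once, export it as PYTHONHASHSEED, and report it so a failing run can be replayed. `run_with_hash_seed` is a hypothetical helper, not part of regrtest or tox.

```python
import os
import random
import subprocess
import sys


def run_with_hash_seed(code, hash_seed):
    """Run `code` in a child interpreter with a fixed PYTHONHASHSEED."""
    env = dict(os.environ, PYTHONHASHSEED=str(hash_seed))
    proc = subprocess.run(
        [sys.executable, "-c", code],
        env=env, capture_output=True, text=True, check=True,
    )
    return proc.stdout.strip()


if __name__ == "__main__":
    # PYTHONHASHSEED accepts integers in [0, 4294967295]; 0 disables
    # hash randomization, so this sketch draws from 1 upward.
    hash_seed = random.randrange(1, 2 ** 32)
    print("PYTHONHASHSEED =", hash_seed)
    print(run_with_hash_seed("print(hash('spam'))", hash_seed))
    print(run_with_hash_seed("print(hash('spam'))", hash_seed))
```

Two child processes started with the same value compute the same string hashes, which removes one source of run-to-run variation.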

    For the filesystem: right, it's not possible to get 100% reproducible tests, but IMHO it's worth it to make them more reliable.

    Serhiy Storchaka: "Couldn't reseeding the PRNG just add a false promise?"

    I'm trying to fix random failures on the Refleaks buildbots, not to promise anything :-) To be honest, at this point, I don't know if it would be enough since I'm unable to reproduce bugs...

    Serhiy Storchaka: "Success in making tests deterministic can also narrow the coverage of the testing. Some branches that lead to failures may never be executed. Our goal is not just to make tests always succeed, but to catch and fix even pretty rare errors."

    I know that it's a tough question, and that's why I opened this issue, to discuss it :-)

    But I see more and more projects working toward more reproducible software and tests:

    • https://reproducible-builds.org/
    • systemd's big project to get more "stateless" computers, or said differently: to better isolate services
    • containers which also want to isolate services, "stateless" containers
    • etc.

    Other test runners, like tox, also make efforts to get reproducible tests, for example by setting PYTHONHASHSEED.

    @vstinner
    Member Author

    I didn't get a strong +1 on the issue and I'm not convinced by my own approach. Moreover, Refleaks buildbots now seem to be reliable thanks to other fixes. For all these reasons, I'm closing the issue.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022