Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

regrtest: change -R/--huntrleaks rule to decide if a test leaks references #74959

Closed
vstinner opened this issue Jun 26, 2017 · 16 comments
Closed
Assignees
Labels
3.7 (EOL) end of life tests Tests in the Lib/test dir

Comments

@vstinner
Copy link
Member

BPO 30776
Nosy @pitrou, @vstinner, @serhiy-storchaka
PRs
  • bpo-30776: reduce regrtest -R false positives #2422
  • [3.6] bpo-30523, bpo-30764, bpo-30776: Sync regrtest from master #2441
  • [3.5] bpo-30523, bpo-30764, bpo-30776: Sync regrtest from master #2442
  • [2.7] bpo-30523, bpo-30764, bpo-30776: Sync regrtest from master #2444
  • bpo-30776: regrtest: reduce memleak false positive #2484
  • [3.6] bpo-29512, bpo-30776: Backport regrtest enhancements from master to 3.6 #2513
  • [3.5] bpo-29512, bpo-30764, bpo-30776: Backport regrtest enhancements from 3.6 to 3.5 #2540
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/vstinner'
    closed_at = <Date 2018-09-19.23:10:15.484>
    created_at = <Date 2017-06-26.21:49:23.239>
    labels = ['3.7', 'tests']
    title = 'regrtest: change -R/--huntrleaks rule to decide if a test leaks references'
    updated_at = <Date 2018-09-19.23:10:15.483>
    user = 'https://github.com/vstinner'

    bugs.python.org fields:

    activity = <Date 2018-09-19.23:10:15.483>
    actor = 'vstinner'
    assignee = 'vstinner'
    closed = True
    closed_date = <Date 2018-09-19.23:10:15.484>
    closer = 'vstinner'
    components = ['Tests']
    creation = <Date 2017-06-26.21:49:23.239>
    creator = 'vstinner'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 30776
    keywords = []
    message_count = 16.0
    messages = ['296950', '296952', '296977', '296978', '297034', '297041', '297044', '297085', '297262', '297420', '297565', '301066', '301071', '301078', '301080', '301102']
    nosy_count = 3.0
    nosy_names = ['pitrou', 'vstinner', 'serhiy.storchaka']
    pr_nums = ['2422', '2441', '2442', '2444', '2484', '2513', '2540']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue30776'
    versions = ['Python 2.7', 'Python 3.5', 'Python 3.6', 'Python 3.7']

    @vstinner
    Copy link
    Member Author

    Currently, regrtest considers that a test file leaks if at least one run increased or decreased the global reference counter. The problem is that in practice, Python is full of singletons, caches, and other jokes of the garbage collector, etc.

    To reduce false alarms and focus more on real bugs, I suggest to change the reference difference checker from:

         def check_rc_deltas(deltas):
            return any(deltas)

    to:

         def check_rc_deltas(deltas):
            return all(delta>=1 for delta in deltas)

    It would allow to ignore false positives like:

    • [3, 0, 0]
    • [0, 1, 0]
    • [8, -8, 1]

    @vstinner vstinner added 3.7 (EOL) end of life tests Tests in the Lib/test dir labels Jun 26, 2017
    @vstinner
    Copy link
    Member Author

    I had to write this change to be able to bisect bpo-30775 reference leak. Without this change, my bisection tool picked the wrong tests and failed to find the tests which leaked. For example, it took a subtest of tests which was considered as leaking because the reference differences was [3, 0, 0]. This pattern more looks like a late initialization than a real leak. In the case of the bpo-30775, a leak was closer to [222, 222, 225]: much more references than just 3 :-) Many leaks for all runs :-)

    @vstinner
    Copy link
    Member Author

    Real-world examples of false positives from the x86 Gentoo Refleaks 2.7 buildbot:

    http://buildbot.python.org/all/builders/x86%20Gentoo%20Refleaks%202.7/builds/32/steps/test/logs/stdio

    test_nntplib leaked [0, 85, 0] references, sum=85
    test_multiprocessing leaked [0, 0, -35] references, sum=-35

    @vstinner
    Copy link
    Member Author

    New changeset 48b5c42 by Victor Stinner in branch 'master':
    bpo-30776: reduce regrtest -R false positives (bpo-2422)
    48b5c42

    @vstinner
    Copy link
    Member Author

    New changeset 35d2ca2 by Victor Stinner in branch '3.6':
    [3.6] bpo-30523, bpo-30764, bpo-30776: Sync regrtest from master (bpo-2441)
    35d2ca2

    @vstinner
    Copy link
    Member Author

    New changeset de1850b by Victor Stinner in branch '3.5':
    [3.5] bpo-30523, bpo-30764, bpo-30776: Sync regrtest from master (bpo-2442)
    de1850b

    @vstinner
    Copy link
    Member Author

    New changeset fea98bf by Victor Stinner in branch '2.7':
    [2.7] bpo-30523, bpo-30764, bpo-30776: Sync regrtest from master (bpo-2444)
    fea98bf

    @vstinner
    Copy link
    Member Author

    It seems like the change works well since I didn't see any more false alarms on the Refleaks buildbots. I close the issue. The change has been applied to 2.7, 3.5, 3.6 and master branches.

    @vstinner vstinner changed the title regrtest: change -R/--huntrleaks rule to decide if a test leaks regrtest: change -R/--huntrleaks rule to decide if a test leaks references Jun 28, 2017
    @vstinner
    Copy link
    Member Author

    New changeset beeca6e by Victor Stinner in branch 'master':
    bpo-30776: regrtest: reduce memleak false positive (bpo-2484)
    beeca6e

    @vstinner
    Copy link
    Member Author

    New changeset a3ca94d by Victor Stinner in branch '3.6':
    [3.6] bpo-29512, bpo-30776: Backport regrtest enhancements from master to 3.6 (bpo-2513)
    a3ca94d

    @vstinner
    Copy link
    Member Author

    vstinner commented Jul 3, 2017

    New changeset 0b12107 by Victor Stinner in branch '3.5':
    [3.5] bpo-29512, bpo-30764, bpo-30776: Backport regrtest enhancements from 3.6 to 3.5 (bpo-2540)
    0b12107

    @pitrou
    Copy link
    Member

    pitrou commented Aug 31, 2017

    This change suddenly made detection of reference leaks a lot less effective. Now if you have a single 0 in the deltas list, it will consider there is no reference leak, even though all other values might be > 0.

    See https://bugs.python.org/issue31317#msg301064 for an example.

    Victor, it would be nice if you could find a less brutal heuristic.

    @vstinner
    Copy link
    Member Author

    Antoine Pitrou: "Victor, it would be nice if you could find a less brutal heuristic."

    My current goal is to get green buildbots. If a buildbot is always red, it's hard to notify easily regressions.

    Since I rejected "false alarms" from refleak, the refleak buildbot already helped to identify a reference leak regression. I am talking about the Windows Refleak buildbot:

    http://buildbot.python.org/all/builders/AMD64%20Windows8.1%20Refleaks%203.x

    I'm now working on the x86 Gentoo Refleaks 3.x which raised strange issues like bpo-31217.

    I am already making regrtest stricter, but slowly, without breaking buildbots. It's a slow process. That's why I'm to "fix" dangling threads for example. By the way, I'm not done with dangling threads in master yet.

    I'm happy if anyone is interested to help me to work on buildbots, which means in practice fixing a new different bug everyday :-) Slowly, the remaining bugs are the most complex ones.

    @pitrou
    Copy link
    Member

    pitrou commented Sep 1, 2017

    Since I rejected "false alarms" from refleak, the refleak buildbot already helped to identify a reference leak regression.

    Except that now it may hide actual regressions... False negatives are not better than false positives.

    @vstinner
    Copy link
    Member Author

    vstinner commented Sep 1, 2017

    Except that now it may hide actual regressions... False negatives are not better than false positives.

    Please put this issue in its context. 6 months ago, there was no more Refleak buildbot and everything was broken.

    If you want to make regrtest -R stricer, go ahead. But if a buildbot breaks, you will have to fix it.

    I would prefer to first fix the Gentoo Refleak buildbot, since most previous builds failed.

    @vstinner
    Copy link
    Member Author

    vstinner commented Sep 1, 2017

    The commit 6c2feab of bpo-31217 fixed a major random bug in the memory block check. So maybe we can start again to look to detect more bugs by replacing all() with any() again.

    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 (EOL) end of life tests Tests in the Lib/test dir
    Projects
    None yet
    Development

    No branches or pull requests

    2 participants