classification
Title: test__xxsubinterpreters crashed on x86 Gentoo Refleaks 3.x
Type:
Stage: resolved
Components: Interpreter Core, Tests
Versions: Python 3.8
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: eric.snow
Nosy List: eric.snow, ned.deily, pablogsal, vstinner
Priority:
Keywords: patch

Created on 2018-05-23 12:25 by vstinner, last changed 2018-06-14 15:59 by vstinner. This issue is now closed.

Pull Requests
URL Status Linked
PR 7251 merged eric.snow, 2018-05-30 16:35
PR 7288 merged eric.snow, 2018-05-31 15:05
PR 7503 merged vstinner, 2018-06-08 00:18
PR 7552 merged eric.snow, 2018-06-09 00:24
Messages (25)
msg317394 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-05-23 12:25
http://buildbot.python.org/all/#/builders/1/builds/232

(...)
3:28:16 load avg: 3.67 [401/416/3] test__xxsubinterpreters crashed (Exit code -6) -- running: test_asyncio (4631 sec)
python: Modules/gcmodule.c:277: visit_decref: Assertion `_PyGCHead_REFS(gc) != 0' failed.
Fatal Python error: Aborted

Current thread 0xb748a700 (most recent call first):
  File "/buildbot/buildarea/3.x.ware-gentoo-x86.refleak/build/Lib/unittest/case.py", line 615 in run
  File "/buildbot/buildarea/3.x.ware-gentoo-x86.refleak/build/Lib/unittest/case.py", line 663 in __call__
  File "/buildbot/buildarea/3.x.ware-gentoo-x86.refleak/build/Lib/unittest/suite.py", line 122 in run
  File "/buildbot/buildarea/3.x.ware-gentoo-x86.refleak/build/Lib/unittest/suite.py", line 84 in __call__
  File "/buildbot/buildarea/3.x.ware-gentoo-x86.refleak/build/Lib/unittest/suite.py", line 122 in run
  File "/buildbot/buildarea/3.x.ware-gentoo-x86.refleak/build/Lib/unittest/suite.py", line 84 in __call__
  File "/buildbot/buildarea/3.x.ware-gentoo-x86.refleak/build/Lib/unittest/suite.py", line 122 in run
  File "/buildbot/buildarea/3.x.ware-gentoo-x86.refleak/build/Lib/unittest/suite.py", line 84 in __call__
  File "/buildbot/buildarea/3.x.ware-gentoo-x86.refleak/build/Lib/test/support/__init__.py", line 1781 in run
  File "/buildbot/buildarea/3.x.ware-gentoo-x86.refleak/build/Lib/test/support/__init__.py", line 1882 in _run_suite
  File "/buildbot/buildarea/3.x.ware-gentoo-x86.refleak/build/Lib/test/support/__init__.py", line 1972 in run_unittest
  File "/buildbot/buildarea/3.x.ware-gentoo-x86.refleak/build/Lib/test/libregrtest/runtest.py", line 175 in test_runner
  File "/buildbot/buildarea/3.x.ware-gentoo-x86.refleak/build/Lib/test/libregrtest/runtest.py", line 176 in runtest_inner
  File "/buildbot/buildarea/3.x.ware-gentoo-x86.refleak/build/Lib/test/libregrtest/runtest.py", line 140 in runtest
  File "/buildbot/buildarea/3.x.ware-gentoo-x86.refleak/build/Lib/test/libregrtest/runtest_mp.py", line 67 in run_tests_slave
  File "/buildbot/buildarea/3.x.ware-gentoo-x86.refleak/build/Lib/test/libregrtest/main.py", line 517 in _main
  File "/buildbot/buildarea/3.x.ware-gentoo-x86.refleak/build/Lib/test/libregrtest/main.py", line 510 in main
  File "/buildbot/buildarea/3.x.ware-gentoo-x86.refleak/build/Lib/test/libregrtest/main.py", line 585 in main
  File "/buildbot/buildarea/3.x.ware-gentoo-x86.refleak/build/Lib/test/regrtest.py", line 46 in _main
  File "/buildbot/buildarea/3.x.ware-gentoo-x86.refleak/build/Lib/test/regrtest.py", line 50 in <module>
  File "/buildbot/buildarea/3.x.ware-gentoo-x86.refleak/build/Lib/runpy.py", line 85 in _run_code
  File "/buildbot/buildarea/3.x.ware-gentoo-x86.refleak/build/Lib/runpy.py", line 193 in _run_module_as_main
(...)
Re-running test 'test__xxsubinterpreters' in verbose mode
(...)
test_recv_not_found (test.test__xxsubinterpreters.ChannelTests) ... ok
test_run_string_arg_resolved (test.test__xxsubinterpreters.ChannelTests) ... python: Modules/gcmodule.c:277: visit_decref: Assertion `_PyGCHead_REFS(gc) != 0' failed.
Fatal Python error: Aborted
Current thread 0xb74b9700 (most recent call first):
  File "/buildbot/buildarea/3.x.ware-gentoo-x86.refleak/build/Lib/unittest/case.py", line 615 in run
  File "/buildbot/buildarea/3.x.ware-gentoo-x86.refleak/build/Lib/unittest/case.py", line 663 in __call__
  File "/buildbot/buildarea/3.x.ware-gentoo-x86.refleak/build/Lib/unittest/suite.py", line 122 in run
  File "/buildbot/buildarea/3.x.ware-gentoo-x86.refleak/build/Lib/unittest/suite.py", line 84 in __call__
  File "/buildbot/buildarea/3.x.ware-gentoo-x86.refleak/build/Lib/unittest/suite.py", line 122 in run
  File "/buildbot/buildarea/3.x.ware-gentoo-x86.refleak/build/Lib/unittest/suite.py", line 84 in __call__
  File "/buildbot/buildarea/3.x.ware-gentoo-x86.refleak/build/Lib/unittest/suite.py", line 122 in run
  File "/buildbot/buildarea/3.x.ware-gentoo-x86.refleak/build/Lib/unittest/suite.py", line 84 in __call__
  File "/buildbot/buildarea/3.x.ware-gentoo-x86.refleak/build/Lib/unittest/runner.py", line 176 in run
  File "/buildbot/buildarea/3.x.ware-gentoo-x86.refleak/build/Lib/test/support/__init__.py", line 1882 in _run_suite
  File "/buildbot/buildarea/3.x.ware-gentoo-x86.refleak/build/Lib/test/support/__init__.py", line 1972 in run_unittest
  File "/buildbot/buildarea/3.x.ware-gentoo-x86.refleak/build/Lib/test/libregrtest/runtest.py", line 175 in test_runner
  File "/buildbot/buildarea/3.x.ware-gentoo-x86.refleak/build/Lib/test/libregrtest/runtest.py", line 176 in runtest_inner
  File "/buildbot/buildarea/3.x.ware-gentoo-x86.refleak/build/Lib/test/libregrtest/runtest.py", line 140 in runtest
  File "/buildbot/buildarea/3.x.ware-gentoo-x86.refleak/build/Lib/test/libregrtest/main.py", line 291 in rerun_failed_tests
  File "/buildbot/buildarea/3.x.ware-gentoo-x86.refleak/build/Lib/test/libregrtest/main.py", line 540 in _main
  File "/buildbot/buildarea/3.x.ware-gentoo-x86.refleak/build/Lib/test/libregrtest/main.py", line 510 in main
  File "/buildbot/buildarea/3.x.ware-gentoo-x86.refleak/build/Lib/test/libregrtest/main.py", line 585 in main
  File "/buildbot/buildarea/3.x.ware-gentoo-x86.refleak/build/Lib/test/__main__.py", line 2 in <module>
  File "/buildbot/buildarea/3.x.ware-gentoo-x86.refleak/build/Lib/runpy.py", line 85 in _run_code
  File "/buildbot/buildarea/3.x.ware-gentoo-x86.refleak/build/Lib/runpy.py", line 193 in _run_module_as_main
msg317415 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2018-05-23 15:10
I'll take a look.  Thanks!
msg317462 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2018-05-23 21:50
FTR, this started happening after the following commit:

commit 6d2cd9036c0ab78a83de43d1511befb7a7fc0ade
Author: Eric Snow <ericsnowcurrently@gmail.com>
Date:   Wed May 16 15:04:57 2018 -0400

    bpo-32604: Improve subinterpreter tests. (#6914)

    Add more tests for subinterpreters. This patch also fixes a few small defects in the channel implementation.
msg317477 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2018-05-23 23:34
There are a couple of other buildbots with the same failure:

ARMv7 Ubuntu 3.x:  http://buildbot.python.org/all/#/builders/106/builds/1066
PPC64 AIX 3.x:  http://buildbot.python.org/all/#/builders/10/builds/1005
msg317637 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-05-24 23:05
It would be nice to fix this bug before Python 3.7.0 final: either skip the test, or fix it.

Since the functions are still private, skipping a single test (until it's fixed) should be fine.
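As an illustration of the "skip a single test" option, here is a minimal sketch using the standard `unittest.skip` decorator. The class below is a hypothetical stand-in for the real test class, and the test bodies and the `run_suite` helper are placeholders rather than the actual code in Lib/test.

```python
import unittest

SKIP_REASON = "bpo-33615: triggers crashes on a few buildbots"


class ChannelTests(unittest.TestCase):
    # Hypothetical stand-in for the real test class; bodies are placeholders.

    @unittest.skip(SKIP_REASON)
    def test_run_string_arg_resolved(self):
        # Never executed while the decorator is in place.
        raise AssertionError("should not run while skipped")

    def test_recv_not_found(self):
        # An unrelated test in the same class keeps running normally.
        self.assertTrue(True)


def run_suite():
    """Run the stand-in tests and return the collected TestResult."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ChannelTests)
    result = unittest.TestResult()
    suite.run(result)
    return result


if __name__ == "__main__":
    result = run_suite()
    for test, reason in result.skipped:
        print(f"skipped {test.id()}: {reason}")
```

Skipping only the one method leaves the rest of the class reporting results, and the reason string points back to this issue so the skip can be removed once the crash is fixed.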
msg317641 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2018-05-24 23:18
Victor:

>It would be nice to fix this bug before Python 3.7.0 final: either skip the test, or fix it.

These tests (and failures) are only on master / 3.x, not 3.7, right?  If so, they have no bearing on 3.7.0.
msg317645 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2018-05-24 23:48
Correct.  These failures are only on master.
msg317646 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-05-24 23:48
Oh, I didn't know that it was a 3.8-only issue.
msg317647 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2018-05-24 23:50
no worries :)
msg317961 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-05-28 23:43
Any progress on this issue? It's still crashing the ARMv7 Ubuntu 3.x buildbot.

Eric: are you able to reproduce the issue?
msg318038 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2018-05-29 14:26
I haven't been able to reproduce the issue thus far.  From the assert in the buildbot logs, it's clear that I've decref'ed an object one too many times.  Given the relevant PR, it's probably something I changed in the channel implementation.  It could also be something that wasn't triggered until the tests added in the PR.

Any thoughts on what might be common to the 3 failing buildbots, such that they're the only ones to hit the assert?

* x86 Gentoo Refleaks 3.x
* ARMv7 Ubuntu 3.x
* PPC64 AIX 3.x
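
To illustrate why one decref too many trips this assert, here is a simplified Python model of the collector's reference-subtraction pass. It is only a sketch under stated assumptions: the name `subtract_refs` mirrors the C function in Modules/gcmodule.c, but the dictionaries and object names are hypothetical, and the real code works on PyGC_Head structures and only adjusts objects in the generation being collected.

```python
def subtract_refs(refcounts, edges):
    """Simplified model of the GC pass that calls visit_decref.

    refcounts: dict mapping object name -> refcount stored on the object.
    edges: dict mapping container name -> names of objects it references.
    Both structures are hypothetical; they stand in for real object graphs.
    """
    # Start from a copy of each object's refcount (the "gc_refs" value).
    gc_refs = dict(refcounts)
    for container, referents in edges.items():
        for target in referents:
            # Mirrors the failing check: if gc_refs is already 0 while a
            # container still points at the object, the object's refcount
            # was too small, i.e. it was decref'ed once too often.
            if gc_refs[target] == 0:
                raise AssertionError(
                    f"gc_refs of {target!r} is 0 but {container!r} "
                    "still references it"
                )
            gc_refs[target] -= 1
    return gc_refs


if __name__ == "__main__":
    # Consistent graph: "a" and "b" reference each other, plus one
    # external reference to "a".
    print(subtract_refs({"a": 2, "b": 1}, {"a": ["b"], "b": ["a"]}))

    # Inconsistent graph: "c" is referenced by two containers but its
    # refcount is only 1, as after one extra decref.
    try:
        subtract_refs({"a": 1, "b": 1, "c": 1}, {"a": ["c"], "b": ["c"]})
    except AssertionError as exc:
        print("assertion:", exc)
```

In this model the error only surfaces when the collector next walks the affected containers, which may help explain why the crash depends on when a collection happens and does not show up on every builder.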
msg318094 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2018-05-29 21:28
Buildbot AMD64 FreeBSD 10.x Shared 3.x is failing with the same problem:

Assertion failed: (_PyGCHead_REFS(gc) != 0), function visit_decref, file Modules/gcmodule.c, line 277.
Fatal Python error: Aborted
msg318332 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2018-05-31 16:17
New changeset 110bc01407ac8c75545d0386577c6e17254d97d9 by Eric Snow in branch 'master':
bpo-33615: Temporarily disable a test that is triggering crashes on a few buildbots. (gh-7288)
https://github.com/python/cpython/commit/110bc01407ac8c75545d0386577c6e17254d97d9
msg318346 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-05-31 21:52
Now the test runs but doesn't crash anymore: bpo-33724.
msg318349 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2018-05-31 22:42
FYI, I plan on closing this issue only *after* I've re-enabled the crashing test and it passes. :)
msg318478 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2018-06-02 00:45
New changeset 63799136e6c0491bb5d6f4a234d5a775db3458db by Eric Snow in branch 'master':
bpo-33615: Re-enable a subinterpreter test. (gh-7251)
https://github.com/python/cpython/commit/63799136e6c0491bb5d6f4a234d5a775db3458db
msg318653 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2018-06-04 14:43
This appears to be recurring on the "x86 Gentoo Refleaks 3.x" builder still.  I was thrown off by the success of the first run after I landed my fix:

http://buildbot.python.org/all/#/builders/1/builds/241

FYI, the other buildbots that had this issue before (e.g. "ARMv7 Ubuntu 3.x") are still passing.
msg318660 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2018-06-04 15:20
FTR, bpo-33724 is related
msg318692 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2018-06-04 20:18
Same error in AMD64 Windows10 3.x:

http://buildbot.python.org/all/#/builders/3/builds/941
msg318995 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-06-08 00:28
New changeset c4f3cb772bc2d93d91ee1750eed817262f3ed57d by Victor Stinner in branch 'master':
bpo-33615: Skip test__xxsubinterpreters (GH-7503)
https://github.com/python/cpython/commit/c4f3cb772bc2d93d91ee1750eed817262f3ed57d
msg318996 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-06-08 00:28
test__xxsubinterpreters prevents us from getting results from Gentoo Refleaks 3.x and Windows Refleaks 3.x; it also broke multiple CIs and introduced random failures. For all these reasons, I skipped the test. See the general policy for CIs:
https://mail.python.org/pipermail/python-dev/2018-May/153753.html

Eric: if you need CIs to check whether a change fixes test__xxsubinterpreters, there are ways to trigger custom builds, but I don't recall how to do that :-D You should be able to do that on buildbots at least; the devguide explains how to do it, or maybe ask Zachary Ware. Until the crash is fixed, I would prefer to leave the test skipped.
msg319067 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2018-06-08 14:21
Yeah, I did a custom build the other day.  Sorry about the delay in disabling the test again and thanks for getting it done.
msg319466 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2018-06-13 14:02
New changeset ab4a1988fd4347484a7928394b94e2cdf5f8f2a7 by Eric Snow in branch 'master':
bpo-33615: Re-enable subinterpreter tests. (#7552)
https://github.com/python/cpython/commit/ab4a1988fd4347484a7928394b94e2cdf5f8f2a7
msg319470 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2018-06-13 14:58
I've re-enabled the subinterpreter tests, but left the one problem test (ChannelTests.test_run_string_arg_resolved) disabled.  I also changed all uses of %lld to use PRId64 instead.  (Thanks, Victor, for the suggestion.)

The buildbots look good.  I'll keep an eye on "x86 Gentoo Refleaks 3.x" for the next time it runs (starts every 24 hours; 10 hours from now).  When that passes I'll close this issue.
msg319526 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-06-14 15:59
It's nice to see this issue fixed :-)
History
Date User Action Args
2018-06-14 15:59:56 vstinner set messages: + msg319526
2018-06-14 14:49:49 eric.snow set status: pending -> closed
2018-06-13 14:58:00 eric.snow set status: open -> pending
    resolution: fixed
    messages: + msg319470
    stage: patch review -> resolved
2018-06-13 14:02:50 eric.snow set messages: + msg319466
2018-06-09 00:24:05 eric.snow set pull_requests: + pull_request7184
2018-06-08 14:21:36 eric.snow set messages: + msg319067
2018-06-08 00:28:48 vstinner set messages: + msg318996
2018-06-08 00:28:31 vstinner set messages: + msg318995
2018-06-08 00:18:10 vstinner set stage: needs patch -> patch review
    pull_requests: + pull_request7131
2018-06-04 20:18:23 pablogsal set messages: + msg318692
2018-06-04 15:20:16 eric.snow set messages: + msg318660
2018-06-04 14:43:47 eric.snow set status: pending -> open
    resolution: fixed -> (no value)
    messages: + msg318653
    stage: resolved -> needs patch
2018-06-02 00:46:37 eric.snow set status: open -> pending
    resolution: fixed
    stage: patch review -> resolved
2018-06-02 00:45:23 eric.snow set messages: + msg318478
2018-05-31 22:42:47 eric.snow set messages: + msg318349
2018-05-31 21:52:21 vstinner set messages: + msg318346
2018-05-31 16:17:34 eric.snow set messages: + msg318332
2018-05-31 15:05:36 eric.snow set pull_requests: + pull_request6914
2018-05-30 16:35:01 eric.snow set keywords: + patch
    stage: patch review
    pull_requests: + pull_request6876
2018-05-29 21:28:45 pablogsal set nosy: + pablogsal
    messages: + msg318094
2018-05-29 14:26:58 eric.snow set messages: + msg318038
2018-05-28 23:43:21 vstinner set messages: + msg317961
2018-05-24 23:50:56 eric.snow set messages: + msg317647
2018-05-24 23:48:28 vstinner set messages: + msg317646
2018-05-24 23:48:01 eric.snow set messages: + msg317645
2018-05-24 23:18:44 ned.deily set priority: deferred blocker ->
    nosy: + ned.deily
    messages: + msg317641
2018-05-24 23:05:40 vstinner set priority: normal -> deferred blocker
    messages: + msg317637
2018-05-23 23:34:29 eric.snow set messages: + msg317477
2018-05-23 21:50:41 eric.snow set messages: + msg317462
2018-05-23 15:10:53 eric.snow set assignee: eric.snow
    messages: + msg317415
2018-05-23 12:25:16 vstinner create