classification
Title: AMD64 Arch Linux Asan 3.x fails: command timed out: 1200 seconds without output
Type: Stage: resolved
Components: Tests Versions: Python 3.10
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: orsenthil, pablogsal, vstinner
Priority: normal Keywords:

Created on 2021-01-20 21:18 by vstinner, last changed 2021-01-26 09:00 by vstinner. This issue is now closed.

Messages (10)
msg385373 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-01-20 21:18
The AMD64 Arch Linux Asan 3.x buildbot worker started to fail at build 262:
https://buildbot.python.org/all/#/builders/582/builds/262
-----------
...
0:39:09 load avg: 1.15 running: test_multiprocessing_forkserver (30.0 sec)
0:39:39 load avg: 1.64 running: test_multiprocessing_forkserver (1 min)
0:39:53 load avg: 1.50 [419/420] test_multiprocessing_forkserver passed (1 min 12 sec)
0:39:53 load avg: 1.50 [420/420] test_dynamicclassattribute passed

command timed out: 1200 seconds without output running (...)
process killed by signal 9
program finished with exit code -1
elapsedTime=3595.378040
-----------
msg385416 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-01-21 13:18
It seems like something changed on the buildbot, not in Python, since it also fails on 3.8 and 3.9.

AMD64 Arch Linux Asan 3.9:
https://buildbot.python.org/all/#builders/579/builds/105

AMD64 Arch Linux Asan 3.8:
https://buildbot.python.org/all/#builders/580/builds/86

IMO we should disable ASAN (handling of signals) at runtime when we trigger a crash on purpose (ex: faulthandler._sigsegv()).
msg385417 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-01-21 13:44
About SIGSEGV logs, one option is to use ASAN_OPTIONS="handle_segv=0".
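
A minimal sketch of how a test runner could add that option to a child environment without discarding other ASAN settings. The helper name asan_env is hypothetical and not part of CPython or the buildbot configuration; it only assumes that ASAN_OPTIONS is a colon-separated list of key=value flags.

```python
import os


def asan_env(base=None):
    """Return a copy of the environment with handle_segv=0 in ASAN_OPTIONS.

    Hypothetical helper for illustration only.
    """
    env = dict(os.environ if base is None else base)
    existing = env.get("ASAN_OPTIONS", "")
    # Keep every existing flag except a previous handle_segv setting.
    options = [
        opt for opt in existing.split(":")
        if opt and not opt.startswith("handle_segv=")
    ]
    options.append("handle_segv=0")
    env["ASAN_OPTIONS"] = ":".join(options)
    return env
```

The result can be passed as the env argument of subprocess.run() when launching the test suite.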
msg385419 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-01-21 13:48
Documentation of ASAN_OPTIONS:
https://github.com/google/sanitizers/wiki/SanitizerCommonFlags
https://github.com/google/sanitizers/wiki/AddressSanitizerFlags
msg385428 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2021-01-21 15:00
> IMO we should disable ASAN (handling of signals) at runtime when we trigger a crash on purpose (ex: faulthandler._sigsegv()).

> ASAN_OPTIONS="handle_segv=0".

Both sound reasonable. But I'm not sure if they will resolve this crash, though.
msg385459 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-01-22 00:24
> Both sound reasonable. But I'm not sure if they will resolve this crash, though.

Many tests do crash *on purpose*. Example from test_concurrent_futures.py:

def _crash(delay=None):
    """Induces a segfault."""
    if delay:
        time.sleep(delay)
    import faulthandler
    faulthandler.disable()
    faulthandler._sigsegv()

Internally, faulthandler._sigsegv() disables crash reports.

There is also the test.support.SuppressCrashReport context manager to disable crash reports. But I failed to find a way to disable the ASAN signal handler at runtime.

It's possible to disable the ASAN signal handler at runtime using signal.signal(SIGSEGV, signal.SIG_DFL), but that should be done before Python installs its own signal handler for that signal, i.e. before calling faulthandler.enable(). It would require injecting code in test_faulthandler to restore the default handler *before* calling faulthandler.enable(). Right now, test_faulthandler is skipped on the ASAN buildbots.
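
As a rough sketch of the reset step described above: the function name reset_sigsegv_handler is hypothetical, and whether ASAN actually lets the default handler take over depends on its own settings, which this snippet does not check.

```python
import signal


def reset_sigsegv_handler():
    """Restore the default SIGSEGV disposition (hypothetical helper).

    Must run in the main thread, and before faulthandler.enable()
    installs its own handler. Returns the previous Python-level handler,
    or None if the previous handler was installed from C code.
    """
    return signal.signal(signal.SIGSEGV, signal.SIG_DFL)
```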
msg385469 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-01-22 01:52
> About SIGSEGV logs, one option is to use ASAN_OPTIONS="handle_segv=0".

Opened https://github.com/python/buildmaster-config/pull/222
msg385487 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-01-22 10:16
For faulthandler.enable(), maybe we could reset the SIGSEGV signal handler to the default handler if __has_feature(address_sanitizer) is true:
https://clang.llvm.org/docs/AddressSanitizer.html#conditional-compilation-with-has-feature-address-sanitizer

But we cannot do that in faulthandler._sigsegv() since this function is used by test_faulthandler to check the signal handler previously installed by faulthandler.

Maybe we should add a function to test.support which resets the signal handler and then triggers a crash.

There are multiple functions which trigger crashes on purpose:

* _testcapi.crash_no_current_thread() => Py_FatalError()
* _testcapi.return_null_without_error() => Py_FatalError()
* _testcapi.return_result_with_error() => Py_FatalError()
* _testcapi.negative_refcount() => Py_FatalError()
* _testcapi.pymem_buffer_overflow() => Py_FatalError()
* _testcapi.set_nomemory(0) is used to trigger a _PyErr_NormalizeException crash => Py_FatalError()
* etc.

Py_FatalError() calls abort(), which raises the SIGABRT signal, but ASAN doesn't catch this signal.

More generally, search for support.SuppressCrashReport usage in tests.

See also faulthandler_suppress_crash_report() C function.
msg385697 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-01-26 08:54
Pablo added ASAN_OPTIONS=handle_segv=0 env var to his buildbot worker:
https://github.com/python/buildmaster-config/commit/3ae3e1b21a20a06628a225579174e2aa46830583
msg385698 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-01-26 09:00
> The AMD64 Arch Linux Asan 3.x buildbot worker started to fail at build 262:
> https://buildbot.python.org/all/#/builders/582/builds/262

It no longer fails, so I close the issue:
https://buildbot.python.org/all/#/builders/582/builds/278
History
Date User Action Args
2021-01-26 09:00:44 vstinner set status: open -> closed; resolution: fixed; messages: + msg385698; stage: resolved
2021-01-26 08:54:37 vstinner set messages: + msg385697
2021-01-22 10:16:04 vstinner set messages: + msg385487
2021-01-22 01:52:30 pablogsal set messages: + msg385469
2021-01-22 00:24:37 vstinner set messages: + msg385459
2021-01-21 15:00:55 orsenthil set nosy: + orsenthil; messages: + msg385428
2021-01-21 13:48:03 vstinner set messages: + msg385419
2021-01-21 13:44:38 vstinner set messages: + msg385417
2021-01-21 13:18:05 vstinner set messages: + msg385416
2021-01-20 21:32:05 vstinner set nosy: + pablogsal
2021-01-20 21:18:02 vstinner create