classification
Title: Signal stack overflow in faulthandler_user
Type: resource usage
Stage: resolved
Components: Extension Modules
Versions: Python 3.4
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: cheryl.sabella, schwab, vstinner
Priority: normal
Keywords:

Created on 2014-09-27 12:41 by schwab, last changed 2019-03-26 12:55 by vstinner. This issue is now closed.

Messages (5)
msg227667 - (view) Author: Andreas Schwab (schwab) * Date: 2014-09-27 12:41
test_register_chain fails on aarch64 due to a signal stack overflow when faulthandler_user re-raises the signal.  The signal stack can hold only a single signal frame, but faulthandler_user adds a second one.  _PyFaulthandler_Init should allocate twice as much stack to accommodate both signal frames.

======================================================================
FAIL: test_register_chain (test.test_faulthandler.FaultHandlerTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/abuild/rpmbuild/BUILD/Python-3.4.1/Lib/test/test_faulthandler.py", line 592, in test_register_chain
    self.check_register(chain=True)
  File "/home/abuild/rpmbuild/BUILD/Python-3.4.1/Lib/test/test_faulthandler.py", line 576, in check_register
    self.assertEqual(exitcode, 0)
AssertionError: -11 != 0

----------------------------------------------------------------------
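For readers who want to reproduce the scenario outside the test suite, the following is a minimal sketch of what test_register_chain appears to exercise. It is an approximation, not the actual test code: the names CHILD and run_chain_check are hypothetical, and it assumes a Unix platform where SIGUSR1 is available. A child process installs a Python-level handler, registers faulthandler on the same signal with chain=True, sends itself the signal, and exits 0 only if the chained handler ran.

```python
import subprocess
import sys
import textwrap

# Script run in a child process (hypothetical reproduction, not the real test).
CHILD = textwrap.dedent("""
    import faulthandler, os, signal, sys

    called = False

    def handler(signum, frame):
        global called
        called = True

    signum = signal.SIGUSR1
    signal.signal(signum, handler)      # previous handler that faulthandler chains to
    faulthandler.register(signum, file=sys.stderr, chain=True)
    os.kill(os.getpid(), signum)        # deliver the signal to ourselves
    sys.exit(0 if called else 2)        # 0 only if the chained handler ran
""")


def run_chain_check():
    """Run the child script and return (exit code, captured stderr)."""
    proc = subprocess.run(
        [sys.executable, "-c", CHILD],
        capture_output=True,
        text=True,
    )
    return proc.returncode, proc.stderr


exitcode, stderr = run_chain_check()
print("exit code:", exitcode)
```

A negative exit code from subprocess means the child was killed by that signal number, so the -11 in the failure above corresponds to signal 11 (SIGSEGV on Linux) rather than to a normal exit.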
msg227754 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-09-28 08:51
_PyFaulthandler_Init() uses sigaltstack() with a stack of SIGSTKSZ bytes. On my Linux/x86_64, SIGSTKSZ is 8 KB.

What is the value of SIGSTKSZ on aarch64? Is there a C define (#ifdef) to use a different size on this architecture? Does the test pass if you modify faulthandler.c to use "SIGSTKSZ * 2"?
msg227756 - (view) Author: Andreas Schwab (schwab) * Date: 2014-09-28 11:43
There is an open bug about MINSIGSTKSZ being too small on aarch64 <https://sourceware.org/bugzilla/show_bug.cgi?id=16850>.
How much SIGSTKSZ can guarantee about nested signals is unclear.  POSIX does not appear to give any guidance.  On aarch64 SIGSTKSZ is defined as 8192, which is the default for architectures that do not override it (both in glibc and the kernel headers).
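As a rough aid for checking the minimum signal stack size on a given machine, here is a small sketch that queries it at run time. It rests on assumptions: that the running Python and C library expose the "SC_MINSIGSTKSZ" name through os.sysconf_names, which is not the case everywhere, and the helper name min_sigstack_size is hypothetical. When the value cannot be queried, the helper returns None instead of guessing. It reports MINSIGSTKSZ only; it does not report SIGSTKSZ.

```python
import os


def min_sigstack_size():
    """Return the runtime MINSIGSTKSZ in bytes, or None if it cannot be queried."""
    name = "SC_MINSIGSTKSZ"
    # os.sysconf_names is missing on some platforms; treat that as "unknown".
    if name not in getattr(os, "sysconf_names", {}):
        return None
    try:
        value = os.sysconf(name)
    except (OSError, ValueError):
        return None
    # A non-positive result means the limit is not available.
    return value if value > 0 else None


size = min_sigstack_size()
print("MINSIGSTKSZ:", size)
```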
msg338733 - (view) Author: Cheryl Sabella (cheryl.sabella) * (Python committer) Date: 2019-03-24 13:37
The bug ticket linked by @schwab was closed in 2015.  Is this ticket still an issue on aarch64?

Other tickets with same error on other platforms: Issue35484, Issue21131
msg338883 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-03-26 12:55
Python 3 is built frequently on the Fedora infra on AArch64 and the test_faulthandler test doesn't fail there. Recent example of build:
https://koji.fedoraproject.org/koji/buildinfo?buildID=1236594

Direct link to AArch64 build logs (build.log):
https://kojipkgs.fedoraproject.org//packages/python3/3.7.2/5.fc29/data/logs/aarch64/build.log

Extract:

0:02:37 load avg: 5.99 [177/414] test_faulthandler passed (41 sec 925 ms) -- running: test_concurrent_futures (1 min 30 sec)
...
0:03:10 load avg: 10.21 [190/414] test_faulthandler passed (1 min 2 sec) -- running: test_gdb (44 sec 52 ms), test_concurrent_futures (1 min 56 sec)

The test is run on a Python compiled in release mode, then on a Python compiled in debug mode. The test passes in both cases.

I am closing the issue. It seems the bug has been fixed indirectly since it was reported.

Thanks for your bug report Andreas Schwab :-)
History
Date | User | Action | Args
2019-03-26 12:55:39 | vstinner | set | status: open -> closed; resolution: fixed; messages: + msg338883; stage: resolved
2019-03-24 13:37:45 | cheryl.sabella | set | nosy: + cheryl.sabella; messages: + msg338733
2014-09-28 11:43:22 | schwab | set | messages: + msg227756
2014-09-28 08:51:05 | vstinner | set | messages: + msg227754
2014-09-27 16:01:33 | r.david.murray | set | nosy: + vstinner
2014-09-27 12:41:37 | schwab | create |