This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Insufficient sigaltstack size used by CPython prevents extensions from using new ISA
Type: behavior Stage: resolved
Components: Extension Modules Versions: Python 3.11, Python 3.10, Python 3.9
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: lkollar, lukasz.langa, miss-islington, oleksandr-pavlyk, pablogsal, vstinner
Priority: Keywords: patch

Created on 2022-03-09 19:07 by oleksandr-pavlyk, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 31789 merged oleksandr-pavlyk, 2022-03-09 19:29
PR 31824 merged vstinner, 2022-03-11 16:58
PR 31830 merged vstinner, 2022-03-11 22:22
PR 31831 merged vstinner, 2022-03-11 23:13
PR 31961 merged pablogsal, 2022-03-17 14:25
PR 31974 merged pablogsal, 2022-03-18 12:38
PR 31975 merged pablogsal, 2022-03-18 12:39
Messages (21)
msg414809 - (view) Author: Oleksandr Pavlyk (oleksandr-pavlyk) * Date: 2022-03-09 19:07
The following snippet illustrates request by an extension to use AMX registers:

```
// no increase sigaltstack size fix will not wark till we fix python
void enable_amx_no_fix()
{

    unsigned long bitmask;
    long rc;

    rc = syscall(SYS_arch_prctl, ARCH_REQ_XCOMP_PERM, XFEATURE_XTILEDATA);
    if (rc) {
        printf("The kernel rejects the AMX use.\n");
        printf("errno %d\n",errno);
    } else {
        printf("The kernel allows to use AMX.\n");
    }


    rc = syscall(SYS_arch_prctl, ARCH_GET_XCOMP_PERM, &bitmask);

    if (rc) {
        printf("rc error\n");
    } else  {
        if (( bitmask & XFEATURE_MASK_XTILEDATA) == 0){
            printf("verify AMX permission faild bitmask %ld\n",bitmask);
        }

    }
}
```

This request fails on the account of too small a size for sigaltstack used by CPython allocated in Modules/faulthandler.c

The stack size used is 2*SIGSTKSZ, and does not take hardware capabilities into account.

Linux kernel 5.14 adds support to query minimum size of sigaltstack dynamically via getauxval(AT_MINSIGSTKSZ).

AMX support is added in Linux kernel 5.16 

CPython should make use of this when built against more recent Linux kernels.
msg414938 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-03-11 22:01
New changeset dc374ac7b0fabaed49461a2044c220765f48d229 by Victor Stinner in branch 'main':
bpo-46968: Add os.sysconf_names['SC_MINSIGSTKSZ'] (GH-31824)
https://github.com/python/cpython/commit/dc374ac7b0fabaed49461a2044c220765f48d229
msg414939 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-03-11 22:19
New changeset 3b128c054885fe881c3b57a5978de3ea89c81a9c by Oleksandr Pavlyk in branch 'main':
bpo-46968: Fix faulthandler for Sapphire Rapids Xeon (GH-31789)
https://github.com/python/cpython/commit/3b128c054885fe881c3b57a5978de3ea89c81a9c
msg414941 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-03-11 22:30
The latest faulthandler bug on sigaltstack() was bpo-21131 about FPU state stored by "XSAVE" and a Linux kernel change to fix a security vulnerability:
https://bugs.python.org/issue21131#msg349688
msg414943 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-03-11 23:04
New changeset 393e2bf6bc6effbfe821f051a230978f0edd70df by Victor Stinner in branch '3.10':
bpo-46968: Fix faulthandler for Sapphire Rapids Xeon (GH-31789) (GH-31830)
https://github.com/python/cpython/commit/393e2bf6bc6effbfe821f051a230978f0edd70df
msg414949 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-03-11 23:37
New changeset ba2b7956fa3932769a5c0aa2575de5c8d7e7ba4b by Victor Stinner in branch '3.9':
bpo-46968: Fix faulthandler for Sapphire Rapids Xeon (GH-31789) (GH-31831)
https://github.com/python/cpython/commit/ba2b7956fa3932769a5c0aa2575de5c8d7e7ba4b
msg414950 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-03-11 23:38
Thanks Oleksandr Pavlyk for your nice enhancement!

I merged your change to 3.9, 3.10 and main branches! I also added os.sysconf('SC_MINSIGSTKSZ') to Python 3.11.
msg414957 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-03-12 00:47
On my x86-64 Fedora 35 with the CPU "Intel(R) Core(TM) i7-6820HQ CPU @ 2.70GHz", I get:

* SIGSTKSZ = 8192
* getauxval(AT_MINSIGSTKSZ) = 2032
* faulthandler stack.ss_size = 10224
* os.sysconf('SC_MINSIGSTKSZ') = 2032

In C, sysconf(_SC_MINSIGSTKSZ) is similar to getauxval(AT_MINSIGSTKSZ), but the glibc sysconf(_SC_MINSIGSTKSZ) is more generic, whereas IMO getauxval(AT_MINSIGSTKSZ) is more the low-level Linux API.
msg414960 - (view) Author: Oleksandr Pavlyk (oleksandr-pavlyk) * Date: 2022-03-12 01:10
So where getauxval(AT_MINSIGSTKSZ) < SIGSTKSZ the merged changes actually resulted in decrease of the allocated signal deliver stack. 

On Sapphire Rapids due to tile registers size we have getauxval(AT_MINSIGSTKSZ) > SIGSTKSZ.

This is why the initial proposal was to use SIGSTKSZ + max(getauxval(AT_MINSIGSTKSZ), SIGSTKSZ).

I suppose, ultimately I am saying that we should check that bpo-21131 does not regress.
msg414965 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-03-12 01:31
> So where getauxval(AT_MINSIGSTKSZ) < SIGSTKSZ the merged changes actually resulted in decrease of the allocated signal deliver stack. 

faulthandler stack is only used in the least likely case: on a fatal error. It should reduce its memory footprint, so it's good that it uses less memory.

To fix bpo-21131, I chose to use SIGSTKSZ * 2 because:

* the fix is trivial: add "* 2"
* it "just works"
* there was no API to query how many bytes Linux uses on the stack

Previously, faulthandler had between 0 and SIGSTKSZ bytes depending on which CPU extension was used or not on the process.

If I understood correctly, on Linux 5.14 with the change, faulthandler now always has SIGSTKSZ bytes for its own usage, and the other bytes are used by the Linux kernel. So it's more reliable for faulthandler, to always have the same amount of memory for its personal use.
msg415402 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2022-03-17 12:48
Commit 393e2bf6bc6effbfe821f051a230978f0edd70df has broken CPython in RedHat 6:


[2022-03-16T18:49:20.608Z] 2022/03/16 14:48:55 ERROR /tmp/python3.10-3.10.3-0/Modules/faulthandler.c:28:12: fatal error: sys/auxv.h: No such file or directory
#  include <sys/auxv.h>
^~~~~~~~~~~~
compilation terminated.
msg415403 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2022-03-17 12:49
This is problematic because this has been backported to stable releases.
msg415404 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2022-03-17 12:49
We may need to revert these commits and do another release.... sigh :(
msg415405 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2022-03-17 12:54
The configure file is checking for "linux/auxvec.h"

checking linux/auxvec.h usability... yes

but the code is including "sys/auxvec.h"
msg415406 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2022-03-17 12:55
The code is assuming that if linux/auxvec.h then sys/auxv.h will be available, which is wrong.
msg415407 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2022-03-17 13:02
This may be quite bad, because this means that 3.10 and 3.9 doesn't build in CentOS 6, which is used for manylinux2010 wheels
msg415483 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-03-18 10:33
> Commit 393e2bf6bc6effbfe821f051a230978f0edd70df has broken CPython in RedHat 6

Too bad that we want to support RHEL 6 but have no buildbot for that.
msg415487 - (view) Author: miss-islington (miss-islington) Date: 2022-03-18 12:03
New changeset 8e3fde728f547f1d32bde8adf62b4c50bb877b9d by Pablo Galindo Salgado in branch 'main':
bpo-46968: Check for 'sys/auxv.h' in the configure script (GH-31961)
https://github.com/python/cpython/commit/8e3fde728f547f1d32bde8adf62b4c50bb877b9d
msg415496 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2022-03-18 13:33
New changeset a12ef81231d65da5efbef4fa1434716270a19af6 by Pablo Galindo Salgado in branch '3.9':
[3.9] bpo-46968: Check for 'sys/auxv.h' in the configure script (GH-31961). (GH-31975)
https://github.com/python/cpython/commit/a12ef81231d65da5efbef4fa1434716270a19af6
msg415497 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2022-03-18 13:33
New changeset 6fd9737373f2bed03f409440b4fd50b9f8f121cb by Pablo Galindo Salgado in branch '3.10':
[3.10] bpo-46968: Check for 'sys/auxv.h' in the configure script (GH-31961). (GH-31974)
https://github.com/python/cpython/commit/6fd9737373f2bed03f409440b4fd50b9f8f121cb
msg415500 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-03-18 14:15
Thanks for the fix Pablo.
History
Date User Action Args
2022-04-11 14:59:57adminsetgithub: 91124
2022-03-18 14:15:32vstinnersetpriority: release blocker ->

messages: + msg415500
2022-03-18 13:33:30pablogsalsetstatus: open -> closed
resolution: fixed
stage: patch review -> resolved
2022-03-18 13:33:14pablogsalsetmessages: + msg415497
2022-03-18 13:33:11pablogsalsetmessages: + msg415496
2022-03-18 12:39:42pablogsalsetpull_requests: + pull_request30065
2022-03-18 12:38:00pablogsalsetpull_requests: + pull_request30064
2022-03-18 12:05:08lkollarsetnosy: + lkollar
2022-03-18 12:03:28miss-islingtonsetnosy: + miss-islington
messages: + msg415487
2022-03-18 10:33:55vstinnersetmessages: + msg415483
2022-03-17 14:25:27pablogsalsetstage: resolved -> patch review
pull_requests: + pull_request30049
2022-03-17 13:02:03pablogsalsetmessages: + msg415407
2022-03-17 12:56:21pablogsalsetpriority: normal -> release blocker
2022-03-17 12:55:19pablogsalsetmessages: + msg415406
2022-03-17 12:54:32pablogsalsetmessages: + msg415405
2022-03-17 12:49:59pablogsalsetmessages: + msg415404
2022-03-17 12:49:44pablogsalsetnosy: + lukasz.langa
messages: + msg415403
2022-03-17 12:48:15pablogsalsetstatus: closed -> open

nosy: + pablogsal
messages: + msg415402

resolution: fixed -> (no value)
2022-03-12 01:31:53vstinnersetmessages: + msg414965
2022-03-12 01:10:23oleksandr-pavlyksetmessages: + msg414960
2022-03-12 00:47:37vstinnersetmessages: + msg414957
2022-03-11 23:38:59vstinnersetstatus: open -> closed
versions: - Python 3.7, Python 3.8
messages: + msg414950

resolution: fixed
stage: patch review -> resolved
2022-03-11 23:37:38vstinnersetmessages: + msg414949
2022-03-11 23:13:06vstinnersetpull_requests: + pull_request29928
2022-03-11 23:04:29vstinnersetmessages: + msg414943
2022-03-11 22:30:19vstinnersetmessages: + msg414941
2022-03-11 22:22:48vstinnersetpull_requests: + pull_request29927
2022-03-11 22:19:43vstinnersetmessages: + msg414939
2022-03-11 22:01:44vstinnersetmessages: + msg414938
2022-03-11 16:58:08vstinnersetpull_requests: + pull_request29921
2022-03-10 06:32:44kumaradityasetnosy: + vstinner
2022-03-09 19:29:32oleksandr-pavlyksetkeywords: + patch
stage: patch review
pull_requests: + pull_request29893
2022-03-09 19:07:02oleksandr-pavlykcreate