classification
Title: python3 resource.setrlimit strange behaviour under macOS
Type: behavior Stage: resolved
Components: Library (Lib), macOS Versions: Python 3.9, Python 3.8, Python 3.7, Python 3.6
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: inada.naoki, lukasz.langa, marche147, miss-islington, ned.deily, ronaldoussoren, v2m, yan12125
Priority: Keywords: patch

Created on 2018-09-07 10:34 by marche147, last changed 2019-07-02 22:34 by ned.deily. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 13011 merged ned.deily, 2019-04-29 18:48
PR 13013 merged miss-islington, 2019-04-29 19:08
PR 13014 merged miss-islington, 2019-04-29 19:32
PR 14546 merged ned.deily, 2019-07-02 06:55
PR 14547 merged miss-islington, 2019-07-02 07:12
PR 14548 merged miss-islington, 2019-07-02 07:12
PR 14549 merged miss-islington, 2019-07-02 07:13
Messages (16)
msg324731 - (view) Author: marche147 (marche147) Date: 2018-09-07 10:34
Consider the following code:

```
import resource
s, h = resource.getrlimit(resource.RLIMIT_STACK)
resource.setrlimit(resource.RLIMIT_STACK, (h, h))
```

Running this under macOS with python 3.6.5 gives the following exception:

```
bash-3.2$ uname -a
Darwin arch-osx 17.7.0 Darwin Kernel Version 17.7.0: Thu Jun 21 22:53:14 PDT 2018; root:xnu-4570.71.2~1/RELEASE_X86_64 x86_64
bash-3.2$ cat test.py
import resource
s, h = resource.getrlimit(resource.RLIMIT_STACK)
resource.setrlimit(resource.RLIMIT_STACK, (h, h))
bash-3.2$ python3 test.py
Traceback (most recent call last):
  File "test.py", line 3, in <module>
    resource.setrlimit(resource.RLIMIT_STACK, (h, h))
ValueError: current limit exceeds maximum limit
```

Nevertheless, with Python 2.7.10 in the same environment, this code runs without raising an exception. Additionally, neither of the following operations fails under the same circumstances:

```
bash-3.2$ cat test.c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
#include <sys/resource.h>

int main() {
  struct rlimit rl;
  if(getrlimit(RLIMIT_STACK, &rl) < 0) {
    perror("getrlimit");
    exit(1);
  }

  rl.rlim_cur = rl.rlim_max;
  if(setrlimit(RLIMIT_STACK, &rl) < 0) {
    perror("setrlimit");
    exit(1);
  }
  return 0;
}
bash-3.2$ gcc -o test test.c
bash-3.2$ ./test
```

```
bash-3.2$ ulimit -s -H
65532
bash-3.2$ ulimit -s
8192
bash-3.2$ ulimit -s 65532
bash-3.2$ ulimit -s
65532
```

I have also run the above Python script on Linux, where it raises no exception on either Python 2 (2.7.10) or Python 3 (3.6.5).
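For code that must keep running on an affected build, the call can be guarded. This is a minimal sketch, not the fix that was eventually merged; the helper name `try_raise_stack_soft_limit` is hypothetical. It relies only on `resource.setrlimit` raising `ValueError` or `OSError` when the request is rejected.

```python
import resource


def try_raise_stack_soft_limit():
    """Try to raise the soft RLIMIT_STACK to the hard limit.

    Returns the (soft, hard) pair in effect afterwards.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
    try:
        resource.setrlimit(resource.RLIMIT_STACK, (hard, hard))
    except (ValueError, OSError) as exc:
        # On the affected macOS builds this is where
        # "current limit exceeds maximum limit" surfaces.
        print(f"could not raise stack limit: {exc}")
    return resource.getrlimit(resource.RLIMIT_STACK)


if __name__ == "__main__":
    print(try_raise_stack_soft_limit())
```

If the request is rejected, the limits are left as they were and the caller can decide whether the smaller stack is acceptable.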
msg324737 - (view) Author: Ronald Oussoren (ronaldoussoren) * (Python committer) Date: 2018-09-07 12:43
I get the same error, also with Python 3.7, both with a Homebrew build and with a python.org installer.
msg324818 - (view) Author: Vladimir Matveev (v2m) * Date: 2018-09-08 02:56
I can reproduce it with the given sample file when it is linked with an explicit stack size:
```
vladima-mbp $ cat test.c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
#include <sys/resource.h>

int main() {
  struct rlimit rl;
  if(getrlimit(RLIMIT_STACK, &rl) < 0) {
    perror("getrlimit");
    exit(1);
  }

  rl.rlim_cur = rl.rlim_max;
  if(setrlimit(RLIMIT_STACK, &rl) < 0) {
    perror("setrlimit");
    exit(1);
  }
  return 0;
}
vladima-mbp $ gcc -Wl,-stack_size,1000000 -o test test.c
vladima-mbp $ ./test
setrlimit: Invalid argument
```
Similar settings were added to Python in https://github.com/python/cpython/commit/335ab5b66f4
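The value given to `-stack_size` is read by the macOS linker as hexadecimal, so `1000000` means 16 MiB. A small sketch for decoding such a flag and for checking whether the running interpreter was built with one follows. The helper name `linked_stack_size` is hypothetical, and which build variable (if any) carries the flag is an assumption, so the loop simply scans all string-valued variables.

```python
import re
import sysconfig


def linked_stack_size(flags):
    """Return the -stack_size value in bytes found in a flag string, or None."""
    match = re.search(r"-stack_size,(?:0x)?([0-9a-fA-F]+)", flags)
    # The macOS linker interprets the value as a hexadecimal number.
    return int(match.group(1), 16) if match else None


print(linked_stack_size("-Wl,-stack_size,1000000"))  # 16777216 (16 MiB)

# Report any build variable of the running interpreter that carries the flag.
for name, value in sysconfig.get_config_vars().items():
    if isinstance(value, str) and linked_stack_size(value) is not None:
        print(name, hex(linked_stack_size(value)))
```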
msg324824 - (view) Author: marche147 (marche147) Date: 2018-09-08 09:05
Thanks for the repro! It helped pinpoint the issue.

So I took a little spare time and dug into the XNU kernel code. Here is my hypothesis based on what I found. (N.B.: it comes from a simple experiment and a skim of the source code lasting 15 minutes or less, so it could be seriously wrong; I'm not an expert on the XNU kernel, and I currently don't have the time to build and debug it.)

In bsd/kern/kern_resource.c, there's a function `dosetrlimit` which handles the `setrlimit` request, and here's part of it:

```
  case RLIMIT_STACK:
    // ...
    if (limp->rlim_cur > alimp->rlim_cur) {
      user_addr_t addr;
      user_size_t size;

      /* grow stack */
      size = round_page_64(limp->rlim_cur);
      size -= round_page_64(alimp->rlim_cur);

      addr = p->user_stack - round_page_64(limp->rlim_cur);
      kr = mach_vm_protect(current_map(),
                           addr, size,
                           FALSE, VM_PROT_DEFAULT);
      if (kr != KERN_SUCCESS) {
        error = EINVAL;
        goto out;
      }
    } // ...
```

As we can see, the kernel tries to change the protection of the memory just below the current stack to `VM_PROT_DEFAULT` (presumably read and write), in the manner of `mprotect`. I then used `vmmap` to compare two binaries compiled with different commands. The results are:

1. Binary compiled without an explicit stack size (no `-stack_size` linker option):

```
- Before calling setrlimit

...
STACK GUARD            00007ffee76d9000-00007ffeeaed9000 [ 56.0M     0K     0K     0K] ---/rwx SM=NUL          stack guard for thread 0
...
Stack                  00007ffeeaed9000-00007ffeeb6d9000 [ 8192K    20K    20K     0K] rw-/rwx SM=PRV          thread 0
...
                                VIRTUAL RESIDENT    DIRTY  SWAPPED VOLATILE   NONVOL    EMPTY   REGION
REGION TYPE                        SIZE     SIZE     SIZE     SIZE     SIZE     SIZE     SIZE    COUNT (non-coalesced)
===========                     ======= ========    =====  ======= ========   ======    =====  =======
Kernel Alloc Once                    8K       4K       4K       0K       0K       0K       0K        2
MALLOC guard page                   16K       0K       0K       0K       0K       0K       0K        5
MALLOC metadata                     60K      60K      60K       0K       0K       0K       0K        6
MALLOC_SMALL                      16.0M      16K      16K       0K       0K       0K       0K        3         see MALLOC ZONE table below
MALLOC_TINY                       2048K      32K      32K       0K       0K       0K       0K        3         see MALLOC ZONE table below
STACK GUARD                       56.0M       0K       0K       0K       0K       0K       0K        2
Stack                             8192K      20K      20K       0K       0K       0K       0K        2
__DATA                            2324K    1192K     208K       0K       0K       0K       0K       43
__LINKEDIT                       192.7M    21.7M       0K       0K       0K       0K       0K        4
__TEXT                            9448K    8224K       0K       0K       0K       0K       0K       48
shared memory                        8K       8K       8K       0K       0K       0K       0K        3
===========                     ======= ========    =====  ======= ========   ======    =====  =======
TOTAL                            286.3M    31.0M     348K       0K       0K       0K       0K      110
...

- After calling setrlimit

...
STACK GUARD            00007ffee76d9000-00007ffee76da000 [    4K     0K     0K     0K] ---/rwx SM=NUL          stack guard for thread 0
...
Stack                  00007ffee76da000-00007ffeeaed9000 [ 56.0M     0K     0K     0K] rw-/rwx SM=NUL          thread 0
Stack                  00007ffeeaed9000-00007ffeeb6d9000 [ 8192K    20K    20K     0K] rw-/rwx SM=PRV          thread 0
...
                                VIRTUAL RESIDENT    DIRTY  SWAPPED VOLATILE   NONVOL    EMPTY   REGION
REGION TYPE                        SIZE     SIZE     SIZE     SIZE     SIZE     SIZE     SIZE    COUNT (non-coalesced)
===========                     ======= ========    =====  ======= ========   ======    =====  =======
Kernel Alloc Once                    8K       4K       4K       0K       0K       0K       0K        2
MALLOC guard page                   16K       0K       0K       0K       0K       0K       0K        5
MALLOC metadata                     60K      60K      60K       0K       0K       0K       0K        6
MALLOC_SMALL                      16.0M      20K      20K       0K       0K       0K       0K        3         see MALLOC ZONE table below
MALLOC_TINY                       2048K      32K      32K       0K       0K       0K       0K        3         see MALLOC ZONE table below
STACK GUARD                          4K       0K       0K       0K       0K       0K       0K        2
Stack                             64.0M      20K      20K       0K       0K       0K       0K        3
__DATA                            2324K    1192K     208K       0K       0K       0K       0K       43
__LINKEDIT                       192.7M    21.7M       0K       0K       0K       0K       0K        4
__TEXT                            9448K    8224K       0K       0K       0K       0K       0K       48
shared memory                        8K       8K       8K       0K       0K       0K       0K        3
===========                     ======= ========    =====  ======= ========   ======    =====  =======
TOTAL                            286.3M    31.0M     352K       0K       0K       0K       0K      111
...
```

2. Binary compiled with an explicit stack size (`-Wl,-stack_size,1000000`):

```
- Before calling setrlimit

...
STACK GUARD            00007ffee09c2000-00007ffee09c3000 [    4K     0K     0K     0K] ---/rwx SM=NUL          stack guard for thread 0
...
Stack                  00007ffee09c3000-00007ffee19c3000 [ 16.0M    20K    20K     0K] rw-/rwx SM=PRV          thread 0
...
                                VIRTUAL RESIDENT    DIRTY  SWAPPED VOLATILE   NONVOL    EMPTY   REGION
REGION TYPE                        SIZE     SIZE     SIZE     SIZE     SIZE     SIZE     SIZE    COUNT (non-coalesced)
===========                     ======= ========    =====  ======= ========   ======    =====  =======
Kernel Alloc Once                    8K       4K       4K       0K       0K       0K       0K        2
MALLOC guard page                   16K       0K       0K       0K       0K       0K       0K        5
MALLOC metadata                     60K      60K      60K       0K       0K       0K       0K        6
MALLOC_SMALL                      8192K      12K      12K       0K       0K       0K       0K        2         see MALLOC ZONE table below
MALLOC_TINY                       1024K      20K      20K       0K       0K       0K       0K        2         see MALLOC ZONE table below
STACK GUARD                          4K       0K       0K       0K       0K       0K       0K        2
Stack                             16.0M      20K      20K       0K       0K       0K       0K        2
__DATA                            2324K    1192K     208K       0K       0K       0K       0K       43
__LINKEDIT                       192.7M    22.3M       0K       0K       0K       0K       0K        4
__TEXT                            9448K    8232K       0K       0K       0K       0K       0K       48
shared memory                        8K       8K       8K       0K       0K       0K       0K        3
===========                     ======= ========    =====  ======= ========   ======    =====  =======
TOTAL                            229.3M    31.7M     332K       0K       0K       0K       0K      108
```
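The arithmetic in the quoted `dosetrlimit` excerpt can be checked against the `vmmap` addresses above. The sketch below is only an illustration under stated assumptions: 4 KiB pages, `p->user_stack` taken to be the upper end of the `Stack` region, and the 8192 KiB soft and 65532 KiB hard limits reported by `ulimit`. The names `round_page` and `grow_range` are hypothetical.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def round_page(n):
    """Round n up to a page boundary, like round_page_64 in the excerpt."""
    return (n + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1)


def grow_range(user_stack, old_cur, new_cur):
    """Return the (addr, size) pair the excerpt passes to mach_vm_protect."""
    size = round_page(new_cur) - round_page(old_cur)
    addr = user_stack - round_page(new_cur)
    return addr, size


SOFT = 8192 * 1024   # ulimit -s, in bytes
HARD = 65532 * 1024  # ulimit -s -H, in bytes

# Case 1: binary without an explicit stack size (stack top 0x7ffeeb6d9000).
addr, size = grow_range(0x7ffeeb6d9000, SOFT, HARD)
print(hex(addr), hex(addr + size))  # 0x7ffee76da000 0x7ffeeaed9000

# Case 2: binary with an explicit stack size (stack top 0x7ffee19c3000).
addr, _ = grow_range(0x7ffee19c3000, SOFT, HARD)
print(hex(addr))  # 0x7ffedd9c4000
```

In case 1 the computed range is exactly the new 56.0M `Stack` region that appears after the call, carved out of the former guard region. In case 2 the range would have to start roughly 48 MiB below the single guard page at 0x7ffee09c2000, in memory that was presumably never reserved for the stack, which would be consistent with `mach_vm_protect` failing and `EINVAL` being returned.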

As we can see, it seems that the kernel tried to `mprotect` (or, we could say, allocate) memory from the "STACK GUARD" region. So where does this "STACK GUARD" region come from? Let's look at this:

In bsd/kern/kern_exec.c, in the `create_unix_stack` function (where the kernel creates the stack for a new task, I assume):

```
...
#define unix_stack_size(p)  (p->p_rlimit[RLIMIT_STACK].rlim_cur)
...
    if (load_result->user_stack_size == 0) {
      load_result->user_stack_size = unix_stack_size(p);
      prot_size = mach_vm_trunc_page(size - load_result->user_stack_size);
    } else {
      prot_size = PAGE_SIZE;
    }

    prot_addr = addr;
    kr = mach_vm_protect(map,
             prot_addr,
             prot_size,
             FALSE,
             VM_PROT_NONE);
  ...
```

So here is my conclusion: if the binary specifies a stack size, `load_result->user_stack_size` is nonzero (it is presumably set somewhere inside the Mach-O parser/loader), so the kernel protects only a single page as the "STACK GUARD" region. Otherwise the kernel uses the current soft stack limit (inherited from the parent) as `user_stack_size` and computes a `prot_size` that should be (rlim_max - rlim_cur). The python3 binary was built with an explicit stack size, so the kernel does not provide a "STACK GUARD" region large enough for the `setrlimit` syscall to grow the stack beyond the size it was built with.
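The guard-size branch can be mirrored in a few lines to compare it with the `vmmap` figures. This is a sketch under assumptions: 4 KiB pages, and a total reserved region (`size` in the excerpt) of 64 MiB, which is inferred from the 56.0M guard plus 8192K stack shown by `vmmap` rather than read from the kernel source. The name `guard_size` is hypothetical.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages
MIB = 1024 * 1024


def trunc_page(n):
    """Round n down to a page boundary, like mach_vm_trunc_page."""
    return n & ~(PAGE_SIZE - 1)


def guard_size(user_stack_size, rlim_cur, reserved_size):
    """Mirror the prot_size branch quoted from create_unix_stack."""
    if user_stack_size == 0:
        # No stack size in the binary: everything below the soft limit is guard.
        return trunc_page(reserved_size - rlim_cur)
    # Stack size given by the binary: a single guard page.
    return PAGE_SIZE


print(guard_size(0, 8 * MIB, 64 * MIB) // MIB)  # 56
print(guard_size(16 * MIB, 8 * MIB, 64 * MIB))  # 4096
```

Both results match the "STACK GUARD" sizes in the two `vmmap` listings (56.0M and 4K).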
msg338868 - (view) Author: Inada Naoki (inada.naoki) * (Python committer) Date: 2019-03-26 09:39
https://bugs.python.org/issue34602 may be related to this.
msg339197 - (view) Author: Chih-Hsuan Yen (yan12125) * Date: 2019-03-30 13:50
I guess Inada Naoki meant https://bugs.python.org/issue36432 in the last comment.
msg341112 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2019-04-29 19:07
New changeset 883dfc668f9730b00928730035b5dbd24b9da2a0 by Ned Deily in branch 'master':
bpo-34602: Avoid failures setting macOS stack resource limit (GH-13011)
https://github.com/python/cpython/commit/883dfc668f9730b00928730035b5dbd24b9da2a0
msg341114 - (view) Author: miss-islington (miss-islington) Date: 2019-04-29 19:27
New changeset 52a5b71063af68c42b048095c4e555e93257f151 by Miss Islington (bot) in branch '3.7':
bpo-34602: Avoid failures setting macOS stack resource limit (GH-13011)
https://github.com/python/cpython/commit/52a5b71063af68c42b048095c4e555e93257f151
msg341118 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2019-04-29 19:57
New changeset fbe2a1394bf52f5a4455681e1b1f705a31559585 by Ned Deily (Miss Islington (bot)) in branch '3.6':
bpo-34602: Avoid failures setting macOS stack resource limit (GH-13011) (GH-13014)
https://github.com/python/cpython/commit/fbe2a1394bf52f5a4455681e1b1f705a31559585
msg341120 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2019-04-29 20:12
Thanks for the analyses, everyone!  Also see the discussion in duplicate Issue36432.  I was never able to reproduce the failure on earlier versions of macOS, but then it seemed to become a hard failure with the release of 10.14.4.  I haven't gone back and tried running the tests on all supported older versions with this reversion in place, but those I have tried did not exhibit any new failures.  So we should keep an eye open for reports of segfaults running tests as originally reported in Issue18075.  But better that than not being able to run any tests.  "Fixed" in 3.8.0a4, 3.7.4, and 3.6.9 (to allow tests to be run on macOS) by reverting the change for Issue18075.
msg347110 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2019-07-02 06:37
> So we should keep an eye open for reports of segfaults running tests as originally reported in Issue18075.

Now there are such reports for 3.7.4rc1, at least. Release builds seem to be OK but at least four tests now fail on macOS 10.14.5 when built --with-pydebug: test_exceptions, test_json, test_logging, test_sys, test_traceback.  Plus, changing the interpreter's stack size can inhibit use of dtrace.

There are really two attempts at dealing with macOS's small default stack size.  The original workaround, dating back to 2002-12-02 (bb48465273d2aa98fc7669e99b0d5fb1c57962de !!), added the runtime calls in regrtest to change RLIMIT_STACK. Years later, a different approach was taken in the original change for Issue18075 (335ab5b66f432ae3713840ed2403a11c368f5406) by increasing the default stack size when building the interpreter rather than at runtime.  So, I *think* that means that the original regrtest runtime workaround is really no longer needed.  So I think we should take the opposite approach to what I originally merged back in April, that is, we should go back to building the interpreter with the increased stack size and remove the ancient regrtest workaround attempt: that's what was failing here anyway.  The only place in the source base that resource.RLIMIT_STACK is used, outside of its functional test, is the old regrtest workaround.  It's still a bit of a mystery as to what has changed in 3.8 that seems to not hit the stack size limit, but these kinds of test failures on macOS have always been dependent on other factors, including compiler version and options.  I'd like to get this into 3.8.0b2 and 3.7.4rc2 so I'm temporarily making it a "release blocker"; the PR will follow momentarily.
msg347113 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2019-07-02 07:12
New changeset 5bbbc733e6cc0804f19b071944af8d4719e26ae6 by Ned Deily in branch 'master':
bpo-34602: Avoid failures setting macOS stack resource limit (GH-14546)
https://github.com/python/cpython/commit/5bbbc733e6cc0804f19b071944af8d4719e26ae6
msg347115 - (view) Author: miss-islington (miss-islington) Date: 2019-07-02 07:31
New changeset bd92b94da93198c8385c06ca908407f172c7e8b2 by Miss Islington (bot) in branch '3.8':
bpo-34602: Avoid failures setting macOS stack resource limit (GH-14546)
https://github.com/python/cpython/commit/bd92b94da93198c8385c06ca908407f172c7e8b2
msg347117 - (view) Author: miss-islington (miss-islington) Date: 2019-07-02 07:38
New changeset bf82cd3124df94935c6e3190c7c40b76918d2174 by Miss Islington (bot) in branch '3.7':
bpo-34602: Avoid failures setting macOS stack resource limit (GH-14546)
https://github.com/python/cpython/commit/bf82cd3124df94935c6e3190c7c40b76918d2174
msg347119 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2019-07-02 07:49
New changeset 782854f90ad5f73f787f68693d535f2b05514e13 by Ned Deily (Miss Islington (bot)) in branch '3.6':
bpo-34602: Avoid failures setting macOS stack resource limit (GH-14546) (GH-14549)
https://github.com/python/cpython/commit/782854f90ad5f73f787f68693d535f2b05514e13
msg347167 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2019-07-02 22:34
New changeset dcc0eb379613f279864af61023ea44c94aa0535c by Ned Deily (Miss Islington (bot)) in branch '3.7':
bpo-34602: Avoid failures setting macOS stack resource limit (GH-14546)
https://github.com/python/cpython/commit/dcc0eb379613f279864af61023ea44c94aa0535c
History
Date User Action Args
2019-07-02 22:34:04 ned.deily set messages: + msg347167
2019-07-02 07:50:14 ned.deily set priority: release blocker ->
    status: open -> closed
    resolution: fixed
    stage: patch review -> resolved
2019-07-02 07:49:02 ned.deily set messages: + msg347119
2019-07-02 07:38:41 miss-islington set messages: + msg347117
2019-07-02 07:31:13 miss-islington set messages: + msg347115
2019-07-02 07:13:05 miss-islington set pull_requests: + pull_request14365
2019-07-02 07:12:52 miss-islington set pull_requests: + pull_request14363
2019-07-02 07:12:43 miss-islington set pull_requests: + pull_request14361
2019-07-02 07:12:37 ned.deily set messages: + msg347113
2019-07-02 06:55:43 ned.deily set stage: needs patch -> patch review
    pull_requests: + pull_request14359
2019-07-02 06:37:53 ned.deily set status: closed -> open
    priority: normal -> release blocker
    versions: + Python 3.9
    nosy: + lukasz.langa
    messages: + msg347110
    resolution: fixed -> (no value)
    stage: resolved -> needs patch
2019-04-29 20:12:07 ned.deily set status: open -> closed
    versions: + Python 3.8
    messages: + msg341120
    resolution: fixed
    stage: patch review -> resolved
2019-04-29 19:57:20 ned.deily set messages: + msg341118
2019-04-29 19:53:24 ned.deily link issue36432 superseder
2019-04-29 19:32:07 miss-islington set pull_requests: + pull_request12936
2019-04-29 19:27:39 miss-islington set nosy: + miss-islington
    messages: + msg341114
2019-04-29 19:08:00 miss-islington set pull_requests: + pull_request12934
2019-04-29 19:07:44 ned.deily set messages: + msg341112
2019-04-29 18:48:20 ned.deily set keywords: + patch
    stage: patch review
    pull_requests: + pull_request12932
2019-03-30 13:50:15 yan12125 set nosy: + yan12125
    messages: + msg339197
2019-03-26 09:39:36 inada.naoki set nosy: + inada.naoki
    messages: + msg338868
2018-09-08 09:05:28 marche147 set messages: + msg324824
2018-09-08 02:56:34 v2m set nosy: + v2m
    messages: + msg324818
2018-09-07 12:43:06 ronaldoussoren set versions: + Python 3.7
    nosy: + ned.deily, ronaldoussoren
    messages: + msg324737
    components: + macOS
2018-09-07 10:34:46 marche147 create