classification
Title: Re-enable threading test on macOS
Type: behavior Stage: resolved
Components: macOS Versions: Python 3.9, Python 3.8
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: ronaldoussoren Nosy List: Michael.Felt, aeros, matrixise, miss-islington, ned.deily, ronaldoussoren, steve.dower, vstinner, xtreak
Priority: normal Keywords: needs review, patch

Created on 2013-05-24 14:27 by ronaldoussoren, last changed 2019-09-06 14:30 by ronaldoussoren. This issue is now closed.

Files
File name Uploaded Description Edit
reenable-threading-test.txt ronaldoussoren, 2013-05-24 14:27 review
Pull Requests
URL Status Linked Edit
PR 14748 merged ronaldoussoren, 2019-07-13 14:34
PR 15065 merged miss-islington, 2019-08-01 05:43
PR 15075 closed ronaldoussoren, 2019-08-02 05:36
PR 15081 merged Michael.Felt, 2019-08-02 17:01
PR 15089 closed miss-islington, 2019-08-03 07:18
PR 15572 merged miss-islington, 2019-08-29 05:17
Messages (30)
msg189912 - (view) Author: Ronald Oussoren (ronaldoussoren) * (Python committer) Date: 2013-05-24 14:27
I recently notied that test.test_threading.ThreadingExceptionTests.test_recursion_limit is disabled for debug builds. 

The attached patch re-enables this test case and fixes the crash be increasing the stack size for new threads. 

The disadvantage of the patch is that is increases the default stack size for all new threads on OSX from 5 to 8 megabytes (for threads created by Python) and that could affect program behavior as fewer threads can be created before running out of memory.

The patch could be updated to check for Py_DEBUG and only increase the stack size for --with-pydebug builds (which are the ones using -O0), as those builds need the increased stack size and debug builds already consume more memory that normal builds. That's not 100% reliable though, as users could also build the with CFLAGS set to -O0 without using --with-pydebug. 

I haven't found a way to detect the optimization level with clang or gcc using the preprocessor.
msg192389 - (view) Author: Ronald Oussoren (ronaldoussoren) * (Python committer) Date: 2013-07-06 08:39
I intend to apply this patch to default (and only that branch) after running the testsuite and with a small change: the stack size will be 0x1000000 (double it current value) to sync it with the stack size for the main thread, which was recently set to that value in the default branch.
msg270766 - (view) Author: Stéphane Wirtel (matrixise) * (Python committer) Date: 2016-07-18 15:50
for my part, I think you should use a #ifdef PY_DEBUG in your code and increase the default size of the stack for the threads, and only if you are on APPLE and PY_DEBUG mode.
msg271071 - (view) Author: Ronald Oussoren (ronaldoussoren) * (Python committer) Date: 2016-07-23 10:56
The patch increases the stack size for threads to slightly less than size as is used for the main thread, see this fragment in configure.ac:

   # Issue #18075: the default maximum stack size (8MBytes) is too
   # small for the default recursion limit. Increase the stack size
   # to ensure that tests don't crash
   LINKFORSHARED="-Wl,-stack_size,1000000 $LINKFORSHARED"

Maybe the patch should be updated to use the same size? 

The disadvantage of any increase of size is that this reduces the number of threads that can be used, in particular for 32-bit builds (64-bit builds have more than enough address space)
msg272116 - (view) Author: Stéphane Wirtel (matrixise) * (Python committer) Date: 2016-08-07 09:38
yes, maybe, could you provide a patch ?
msg323757 - (view) Author: Ronald Oussoren (ronaldoussoren) * (Python committer) Date: 2018-08-19 15:15
I have a patch for making the stack sizes the same in Issue34264.
msg347830 - (view) Author: Ronald Oussoren (ronaldoussoren) * (Python committer) Date: 2019-07-13 14:38
I've converted the patch to a pull request (and closed to other issue because that has a patch that is less complete).

I'm not sure if it is acceptable to backport this patch to 3.8 at this time.
msg347868 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2019-07-14 07:44
I believe this is acceptable to include in 3.8, provided it makes sense for 3.9, so I added the backport tag on the PR.

That said, I'm not making any comment on whether it's appropriate at all :)

For Windows we found some ways to reduce the stack usage of debug builds (IIRC, refactoring some large local variables into their own function) and improved recursion depth that way. Without any data on whether worker threads generally go deeper than the main thread, I would hesitate to say that they ought to be the same, as there is a definite advantage to having a smaller stack by default when you have many worker threads.

But if someone else confidently approves the change, it can go into 3.8.
msg347888 - (view) Author: Ronald Oussoren (ronaldoussoren) * (Python committer) Date: 2019-07-14 08:38
I've done some more testing, and the thread stack size could be smaller than 16MB, but definitely not as small as the value that's currently in thread_pthread.h.

I prefer to have a slightly too large value here because IIRC the relation between the recursion limit and C stack usage is a heuristic. 

BTW. It should be possible to implement PyOS_CheckStack() on macOS. There are (undocumented) APIs in the public header files to get the stack base and size for the current thread. These APIs also work for the main thread:

# ---

#include <pthread.h>
#include <stdio.h>

int main(void)
{
        printf("%#lx\n", (long)pthread_get_stacksize_np(pthread_self()));
        printf("%p\n", pthread_get_stackaddr_np(pthread_self()));
}

# ---
msg347901 - (view) Author: Ronald Oussoren (ronaldoussoren) * (Python committer) Date: 2019-07-14 09:27
W.r.t. PyOS_CheckStack: I had that idea last year as well, and someone contributed a pull request, see issue33955.
msg348839 - (view) Author: Ronald Oussoren (ronaldoussoren) * (Python committer) Date: 2019-08-01 05:43
New changeset 1a057bab0f18d6ad843ce321d1d77a4819497ae4 by Ronald Oussoren in branch 'master':
bpo-18049: Sync thread stack size to main thread size on macOS (GH-14748)
https://github.com/python/cpython/commit/1a057bab0f18d6ad843ce321d1d77a4819497ae4
msg348847 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-08-01 09:16
I'm not sure if it's related, but test_threading.test_recursion_limit() started to crash on two AIX buildbot workers after commit 1a057bab0f18d6ad843ce321d1d77a4819497ae4:
https://bugs.python.org/issue36273#msg348845
msg348860 - (view) Author: miss-islington (miss-islington) Date: 2019-08-01 14:39
New changeset 8399641c34d8136c3151fda6461cc4727a20b28e by Miss Islington (bot) in branch '3.8':
bpo-18049: Sync thread stack size to main thread size on macOS (GH-14748)
https://github.com/python/cpython/commit/8399641c34d8136c3151fda6461cc4727a20b28e
msg348875 - (view) Author: Ronald Oussoren (ronaldoussoren) * (Python committer) Date: 2019-08-01 20:00
That is related, my patch removes a guard that should enable this test only on macOS.

This fragment is the issue:
- @unittest.skipUnless(sys.platform == 'darwin' and test.support.python_is_optimized(),
- 	                         'test macosx problem')

I'll create a pull request for this.

BTW. This does point to an issue with AIX though, this means it is possible to crash threads by deep recursion on AIX. But that's separate from this issue.
msg348876 - (view) Author: Ronald Oussoren (ronaldoussoren) * (Python committer) Date: 2019-08-01 20:08
The patch to fix this should be:

diff --git a/Lib/test/test_threading.py b/Lib/test/test_threading.py
index 1466d25e94..658bc1c7a0 100644
--- a/Lib/test/test_threading.py
+++ b/Lib/test/test_threading.py
@@ -1057,6 +1057,7 @@ class ThreadingExceptionTests(BaseTestCase):
         lock = threading.Lock()
         self.assertRaises(RuntimeError, lock.release)
 
+    @unittest.skipUnless(sys.platform == 'darwin', 'test macos problem')
     def test_recursion_limit(self):
         # Issue 9670
         # test that excessive recursion within a non-main thread causes

---

PR will follow after I've run tests locally, I'll probably create the PR in the morning.
msg348887 - (view) Author: Ronald Oussoren (ronaldoussoren) * (Python committer) Date: 2019-08-02 05:51
PR 15075 should fix this issue. As mentioned in that PR I do have a question w.r.t. the exact shape of that patch: I currently reintroduce a "skipUnless" decorator that only runs this test on macOS, I could also introduce one that just disables the test on AIX as that's where the test fails.

P.S. According to this old developer works blog AIX has a very small default stack size for threads (like more BSD systems) and also has an API to change the size of those stacks.  It's probably worthwhile to add some code to thread_pthread.h to do this.

The blog: https://www.ibm.com/developerworks/community/blogs/5894415f-be62-4bc0-81c5-3956e82276f3/entry/don_t_forget_about_stack_size_limits_for_multi_threaded_applications_on_aix?lang=en
msg348891 - (view) Author: Michael Felt (Michael.Felt) * Date: 2019-08-02 07:47
@Ronald - thanks.

Gives me something to work from. Would not have found this so easily!

But - where to put it... :)
msg348894 - (view) Author: Ronald Oussoren (ronaldoussoren) * (Python committer) Date: 2019-08-02 08:52
@Michael: The relevant code is in Python/thread_pthread.h:  https://github.com/python/cpython/blob/bf8162c8c45338470bbe487c8769bba20bde66c2/Python/thread_pthread.h#L34

There are already blocks for macOS and FreeBSD, adding a block for AIX should be easy enough. I don't know what a good size for the stack would be though, the macOS size is large enough to not cause crashes with the default recursion limit in debug builds and was experimentally determined.
msg348895 - (view) Author: Michael Felt (Michael.Felt) * Date: 2019-08-02 09:09
*** 
Looking in ./Python/thread_pthread.h"

  +252  #if defined(THREAD_STACK_SIZE)
  +253      PyThreadState *tstate = _PyThreadState_GET();
  +254      size_t stacksize = tstate ? tstate->interp->pythread_stacksize : 0;
  +255      tss = (stacksize != 0) ? stacksize : THREAD_STACK_SIZE;
  +256      if (tss != 0) {
  +257          if (pthread_attr_setstacksize(&attrs, tss) != 0) {
  +258              pthread_attr_destroy(&attrs);
  +259              return PYTHREAD_INVALID_THREAD_ID;
  +260          }
  +261      }
  +262  #endif

It appears asif the call needed (for AIX) - thread_attr_setstacksize(&attrs, tss) should be occurring.

Can you help me with a quick program that reports back the actual stack size an AIX thread has?

What may be at the heart of this (assuming the call above is working as expected) - the default memory size for AIX 32-bit is 256MB. Assuming 16 to 32MB is already in use - if I understand this "recurse" logic - after about 12-14 recursions the 256MB is consumed.

In short (work calls) - it seems the API mentioned is already in place and what is happening in terms of "exception catching" is different.

Looking forward to hints on how to dig further.
msg348896 - (view) Author: Ronald Oussoren (ronaldoussoren) * (Python committer) Date: 2019-08-02 09:48
That code is only called if THREAD_STACK_SIZE is defined. The block I mention defines it for macOS and FreeBSD, but not for other platforms. I therefore expect that this code is not used on AIX (which would be easy enough to verify by adding an #error to the block you mention.
msg348897 - (view) Author: Michael Felt (Michael.Felt) * Date: 2019-08-02 09:52
Going to take a stab in the dark - the the issue lies here:

"./Python/errors.c"
#ifndef Py_NORMALIZE_RECURSION_LIMIT
#define Py_NORMALIZE_RECURSION_LIMIT 32
#endif

As there is not enough memory for this to run in the default memory model.

However, 32 does not seem to be a counter limit:

With this "hack" I get success:

        script = """if True:
            import threading

            def recurse(loop=0):
                loop = loop+1
                if loop < 128+32+8+3:
                    return recurse(loop)
                else:
                    return

            def outer():
                try:
                    recurse()
                except RecursionError:
                    pass

            w = threading.Thread(target=outer)
            w.start()
            w.join()
            print('end of main thread')
            """

So, I hope this helps point at where we need to look to find why RecursionError is not being returned when (I assume) memory runs out.
msg348899 - (view) Author: Michael Felt (Michael.Felt) * Date: 2019-08-02 09:57
On 02/08/2019 11:48, Ronald Oussoren wrote:
> Ronald Oussoren <ronaldoussoren@mac.com> added the comment:
>
> That code is only called if THREAD_STACK_SIZE is defined. The block I mention defines it for macOS and FreeBSD, but not for other platforms. I therefore expect that this code is not used on AIX (which would be easy enough to verify by adding an #error to the block you mention.
Should have been watching mail - I'll look at this asap.
>
> ----------
>
> _______________________________________
> Python tracker <report@bugs.python.org>
> <https://bugs.python.org/issue18049>
> _______________________________________
>
msg348900 - (view) Author: Michael Felt (Michael.Felt) * Date: 2019-08-02 10:17
On 02/08/2019 11:57, Michael Felt wrote:
> On 02/08/2019 11:48, Ronald Oussoren wrote:
>> Ronald Oussoren <ronaldoussoren@mac.com> added the comment:
>>
>> That code is only called if THREAD_STACK_SIZE is defined. The block I mention defines it for macOS and FreeBSD, but not for other platforms. I therefore expect that this code is not used on AIX (which would be easy enough to verify by adding an #error to the block you mention.
> Should have been watching mail - I'll look at this asap.

Actually, it is defined for "most" platforms...

root@x066:[/data/prj/python/python3-3.9]make
xlc_r -c  -DNDEBUG -O -I/opt/include -O2 -qmaxmem=-1    
-I../git/python3-3.9/Include/internal -IObjects -IInclude -IPython -I.
-I../git/python3-3.9/Include -I/opt/include   -DPy_BUILD_CORE -o
Python/thread.o ../git/python3-3.9/Python/thread.c
"../git/python3-3.9/Python/thread_pthread.h", line 255.2: 1506-205 (S)
#error "Not expected"
make: *** [Makefile:1706: Python/thread.o] Error 1

At the top of the file:

/* The POSIX spec requires that use of pthread_attr_setstacksize
   be conditional on _POSIX_THREAD_ATTR_STACKSIZE being defined. */
#ifdef _POSIX_THREAD_ATTR_STACKSIZE
#ifndef THREAD_STACK_SIZE
#define THREAD_STACK_SIZE       0       /* use default stack size */
#endif

/* The default stack size for new threads on OSX and BSD is small enough
that
 * we'll get hard crashes instead of 'maximum recursion depth exceeded'
 * exceptions.
 *
 * The default stack sizes below are the empirically determined minimal
stack
 * sizes where a simple recursive function doesn't cause a hard crash.
 */

So, the block is executed - but the size is 0 aka default.

As a first attempt - I'll take the FreeBSD size going:

michael@x071:[/data/prj/python/git/python3-3.9]git diff
diff --git a/Python/thread_pthread.h b/Python/thread_pthread.h
index 994e35b..83b7e77 100644
--- a/Python/thread_pthread.h
+++ b/Python/thread_pthread.h
@@ -47,6 +47,10 @@
 #undef  THREAD_STACK_SIZE
 #define THREAD_STACK_SIZE       0x400000
 #endif
+#if defined(_AIX) && defined(THREAD_STACK_SIZE) && THREAD_STACK_SIZE == 0
+#undef  THREAD_STACK_SIZE
+#define THREAD_STACK_SIZE       0x400000
+#endif
 /* for safety, ensure a viable minimum stacksize */
 #define THREAD_STACK_MIN        0x8000  /* 32 KiB */
 #else  /* !_POSIX_THREAD_ATTR_STACKSIZE */

And with this - the test passes. :)
msg348945 - (view) Author: Ronald Oussoren (ronaldoussoren) * (Python committer) Date: 2019-08-03 06:12
New changeset 9670ce76b83bde950020f8d89c4d27168aaaf912 by Ronald Oussoren (Michael Felt) in branch 'master':
bpo-18049: Define THREAD_STACK_SIZE for AIX to pass default recursion limit test (GH-15081)
https://github.com/python/cpython/commit/9670ce76b83bde950020f8d89c4d27168aaaf912
msg350083 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-08-21 13:06
The test does now crash on FreeBSD: see bpo-37906.
msg350718 - (view) Author: miss-islington (miss-islington) Date: 2019-08-29 05:35
New changeset f92bb6ed336c49cabf30717c3b741b1b4284f491 by Miss Islington (bot) in branch '3.8':
bpo-18049: Define THREAD_STACK_SIZE for AIX to pass default recursion limit test (GH-15081)
https://github.com/python/cpython/commit/f92bb6ed336c49cabf30717c3b741b1b4284f491
msg351180 - (view) Author: Kyle Stanley (aeros) * (Python triager) Date: 2019-09-05 06:28
Ronald, is it feasible that the changes made in https://github.com/python/cpython/pull/14748/files to THREAD_STACK_SIZE in Python/thread_pthread.h could be causing intermittent failures for the Azure macOS PR tests?

In a recent PR (https://github.com/python/cpython/pull/15651), test_pending_calls_race (test.test_concurrent_futures.ThreadPoolWaitTests) seemed to be timing out. See the build log for more details:  https://dev.azure.com/Python/cpython/_build/results?buildId=49786&view=logs&j=18d1a34d-6940-5fc1-f55b-405e2fba32b1.

It doesn't appear to be causing consistent failures for Azure, but the latest commit to PR-15651 was only involving a formatting change to the news entry, so it should not have had any influence on the test that timed out. Prior to that commit, there were no CI failures.
msg351185 - (view) Author: Karthikeyan Singaravelan (xtreak) * (Python triager) Date: 2019-09-05 07:26
@aeros167 The build log failures due to timeout in the linked Azure pipeline failure seem to be similar to report at issue37245
msg351195 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-09-05 11:14
> Ronald, is it feasible that the changes made in https://github.com/python/cpython/pull/14748/files to THREAD_STACK_SIZE in Python/thread_pthread.h could be causing intermittent failures for the Azure macOS PR tests?

See also bpo-37245 ("Azure Pipeline 3.8 CI: multiple tests hung and timed out on macOS 10.13").
msg351256 - (view) Author: Ronald Oussoren (ronaldoussoren) * (Python committer) Date: 2019-09-06 14:30
I think it is unlikely that the timeouts on Azure CI are related to this change.
History
Date User Action Args
2019-09-06 14:30:57ronaldoussorensetmessages: + msg351256
2019-09-05 11:14:02vstinnersetmessages: + msg351195
2019-09-05 07:26:02xtreaksetnosy: + xtreak
messages: + msg351185
2019-09-05 06:28:56aerossetnosy: + aeros
messages: + msg351180
2019-08-29 05:35:44miss-islingtonsetmessages: + msg350718
2019-08-29 05:17:25miss-islingtonsetpull_requests: + pull_request15248
2019-08-21 13:06:38vstinnersetmessages: + msg350083
2019-08-03 07:20:18ronaldoussorensetstatus: open -> closed
resolution: fixed
stage: patch review -> resolved
2019-08-03 07:18:56miss-islingtonsetpull_requests: + pull_request14834
2019-08-03 06:12:31ronaldoussorensetmessages: + msg348945
2019-08-02 17:01:39Michael.Feltsetpull_requests: + pull_request14827
2019-08-02 10:17:47Michael.Feltsetmessages: + msg348900
2019-08-02 09:57:34Michael.Feltsetmessages: + msg348899
2019-08-02 09:52:26Michael.Feltsetmessages: + msg348897
2019-08-02 09:48:23ronaldoussorensetmessages: + msg348896
2019-08-02 09:09:46Michael.Feltsetmessages: + msg348895
2019-08-02 08:52:26ronaldoussorensetmessages: + msg348894
2019-08-02 07:47:32Michael.Feltsetnosy: + Michael.Felt
messages: + msg348891
2019-08-02 05:51:54ronaldoussorensetmessages: + msg348887
2019-08-02 05:36:39ronaldoussorensetpull_requests: + pull_request14822
2019-08-01 20:08:21ronaldoussorensetmessages: + msg348876
2019-08-01 20:00:20ronaldoussorensetmessages: + msg348875
2019-08-01 14:39:00miss-islingtonsetnosy: + miss-islington
messages: + msg348860
2019-08-01 09:16:44vstinnersetnosy: + vstinner
messages: + msg348847
2019-08-01 05:43:22miss-islingtonsetpull_requests: + pull_request14814
2019-08-01 05:43:10ronaldoussorensetmessages: + msg348839
2019-07-14 09:27:43ronaldoussorensetmessages: + msg347901
2019-07-14 08:38:58ronaldoussorensetmessages: + msg347888
2019-07-14 07:44:03steve.dowersetnosy: + steve.dower
messages: + msg347868
2019-07-13 14:38:17ronaldoussorensettitle: Re-enable threading test on OSX -> Re-enable threading test on macOS
messages: + msg347830
versions: + Python 3.8, Python 3.9, - Python 3.4
2019-07-13 14:34:45ronaldoussorensetpull_requests: + pull_request14543
2019-07-13 14:13:58ronaldoussorenlinkissue34264 superseder
2018-08-19 15:15:32ronaldoussorensetmessages: + msg323757
2016-08-07 09:38:51matrixisesetmessages: + msg272116
2016-07-23 10:56:32ronaldoussorensetmessages: + msg271071
2016-07-18 15:50:52matrixisesetnosy: + matrixise
messages: + msg270766
2013-07-06 08:39:00ronaldoussorensetkeywords: + patch, needs review

messages: + msg192389
versions: + Python 3.4
2013-05-24 14:27:46ronaldoussorencreate