New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement PyOS_CheckStack on macOS using pthread_get_stack*_np #78136
Comments
The (non-portable) pthread APIs "pthread_get_stacksize_np()" and "pthread_get_stackaddr_np()" can be used to implement PyOS_CheckStack on macOS, which would give nicer behavior than crashing when the recursion limit is too high for the stack size of the current thread. Creating a patch should be fairly easy, but does require C knowledge. |
Hi, Ronald Can I take a look this issue? |
Sure. I'm not working on this at the moment. |
I wrote a first proto type PR and it works well on my MacBook Pro. I will take a look more deeply at this issue. Thanks, |
Have you looked at benchmark results for this patch? In particular, PyOS_CheckStack is called quite often and I wonder how those calls affect performance. If there is a clear performance impact it would be useful to store some information in the python thread state. In what versions of macOS are there problems with pthread_get_stacksize_np() on the main thread? |
Thanks! |
W.r.t. benchmarks: The full benchmark suite that's often used to assess the performance impact on real world code is: http://pyperformance.readthedocs.io Running this takes some time, and the interesting bit is comparing the performance with and without your patch. You could also test with a small script that uses the timeit module to run some code that performs a lot of function calls. |
I updated the PR and we need some discussion. baseline(master): 21.330535484 So my choice is just believe pthread_get_stacksize_np() like rust do. (https://github.com/rust-lang/rust/blob/master/src/libstd/sys/unix/thread.rs#L246) code: def fib(n):
if n == 0:
return 0
if n == 1:
return 1
return fib(n-2) + fib(n-1)
if __name__ == '__main__':
import timeit
print(timeit.timeit("fib(10)", setup="from __main__ import fib")) |
I updated the PR Please take a look. |
benchmark result: 22.639114022999998 secs master (+0%) def fib(n):
if n == 0:
return 0
if n == 1:
return 1
return fib(n-2) + fib(n-1)
if __name__ == '__main__':
import timeit
print(timeit.timeit("fib(10)", setup="from __main__ import fib")) |
Sorry about the slow response. The patch looks OK, although I'm entirely happy yet about the workaround for the stack size on the main thread. Primarily because framework builds use a non-standard stack size as the default one is too small for the recursion limit (IIRC especially with debug builds). BTW. My own use case for this feature is primarily the use of python on threads created by the system (such as threads created by Cocoa APIs and libdispatch), for threads created in Python we can (and do) create them with a large enough stack (although this guard is helpful there too). My next step will be to test the patch locally. This might take a while because I will be traveling the rest of this month. Worst case is that I'll report back during EuroPython 2018 (starting at 23th of July) |
FYI, I've updated the patch to use rlim_cur() on the main thread when pthread_get_stacksize_np() is not accurate. This way works pretty well on my local machine. |
It seems that this issue is a duplicate of bpo-25518. |
@ronaldoussoren |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: