classification
Title: Make thread stack size runtime tunable
Type: Stage:
Components: Interpreter Core Versions: Python 2.5
process
Status: closed Resolution: accepted
Dependencies: Superseder:
Assigned To: aimacintyre Nosy List: aimacintyre, hyeshik.chang, loewis, sf-robot, tim.peters
Priority: normal Keywords: patch

Created on 2006-03-20 12:37 by aimacintyre, last changed 2006-06-29 02:29 by sf-robot. This issue is now closed.

Files
File name Uploaded Description Edit
tunable_thread_stack_size.patch aimacintyre, 2006-03-20 12:37 tunable thread stack size implementation
tunable_thread_stack_size.patch.v2 aimacintyre, 2006-04-14 12:51 tunable thread stack size implementation, v2
tunable_thread_stack_size.patch.v3 aimacintyre, 2006-05-25 14:47 tunable thread stack size implementation, v3
Messages (13)
msg49767 - (view) Author: Andrew I MacIntyre (aimacintyre) * (Python triager) Date: 2006-03-20 12:37
Platform default thread stack sizes vary considerably.
Some are very generous (Win32: usually 1MB; Linux: 1MB,
sometimes 8MB).  Others are not (FreeBSD: 64k).

Some platforms have restricted virtual address space
OS/2: 512M less overhead) which makes hard coding a
generous default thread stack size problematic.  Some
platforms thread commit stack address space, even
though the memory backing it may not be committed
(Windows, OS/2 at least).

Some applications have a thirst for stack space in
threads (Zope). Some programmers want to be able to use
lots of threads, even in the face of sound advice about
the lack of wisdom in this approach.

The current approach to stack space management in
threads in Python uses a hard coded strategy, relying
on the platform having a useful default or relying on
the system administrator or distribution builder
over-riding the default at compile time.

This patch is intended to allow developers some control
over managing this resource from within Python code by
way of a function in the thread module.  As written, it
is not intended to provide unlimited flexibility; that
would probably require exposing the underlying
mechanism as an option on the creation of each thread.

An alternative approach to providing the functionality
would be to use an environment variable to provide the
information to the thread module.  This has its pros
and cons, in terms of flexibility and ease of use, and
could be complementary to the approach implemented.

The patch has been tested on OS/2 and FreeBSD 4.8.  I
have no means of testing the code on Win32 or Linux,
though Linux is a pthread environment as is FreeBSD. 
Code base is SVN head from a few hours ago. A doc 
update is included.

While I would like to see this functionality in Python
2.5, it is not a critical issue.

Critique of the approach and implementation welcome. 
Something not addressed is the issue of tests,
primarily because I haven't been able to think of a
viable testing strategy - I'm all ears to suggestions
for this.
msg49768 - (view) Author: Hyeshik Chang (hyeshik.chang) * (Python committer) Date: 2006-03-20 13:58
Logged In: YES 
user_id=55188

I'm all for this!  The FreeBSD port have maintained a local
patch to bump THREAD_STACK_SIZE.  The patch will lighten
FreeBSD users' burden around thread stack size.

BTW, the naming, "thread.stack_size" seems to miss a verb
while all the other functions on the thread module have it.
 How about set_stack_size() or set_stacksize()?  Or, how
about in sys module?
msg49769 - (view) Author: Andrew I MacIntyre (aimacintyre) * (Python triager) Date: 2006-03-22 08:28
Logged In: YES 
user_id=250749

Thanks for the comments.

As implemented, the function is both a getter and
(optionally) a setter which makes attempting to use a
"get"/"set" prefix 
awkward.

I chose this approach to make it a little simpler to support
temporary changes.  I did consider using a module
attribute/variable, but it is slightly more unwieldy for
this case:

old_size = thread.stack_size(new_size)
...
thread.stack_size(old_size)

vs

old_size = thread.stack_size
thread.stack_size = new_size
...
thread.stack_size = old_size

or (using get/set accessors)

old_size = thread.get_stacksize()
thread.set_stacksize(new_size)
...
thread.set_stacksize(old_size)

I think an argument can be made for passing on the
"get"/"set" naming consistency based on the guidelines in
PEP 8.  While I have a preference for what I've implemented,
I'm more interested in getting the functionality in than
debating its decor.  If there's a strong view about these 
issues, I'm prepared to revise the patch accordingly.

I don't believe that the functionality belongs anywhere else
than the thread module, except possibly shadowing it in the
threading module, as it is highly specific to thread
support.  The sys module seems more appropriate for general 
knobs, and only for specific knobs when there is no other
choice IMO.  Doing it outside the thread module also
complicates the implementation, which I was trying to keep
as simple as I could.
msg49770 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2006-04-10 13:41
Logged In: YES 
user_id=21627

Usage of pthread_attr_setstacksize should be conditional on
the definition of _POSIX_THREAD_ATTR_STACKSIZE, according to
POSIX. Errors from pthread_attr_setstacksize should be
reported (POSIX lists EINVAL as a possible error).

I think PTHREAD_STACK_MIN should be considered. 

The documentation should list availibility of the feature,
currently Win32, OS/2, and POSIX threads (with the TSS
option, to be precise). If some platforms have specific
additional requirements on the possible values (eg. must be
a multiple of the page size), these should be documented, as
well.

Apart from that, the patch looks fine.
msg49771 - (view) Author: Andrew I MacIntyre (aimacintyre) * (Python triager) Date: 2006-04-10 14:45
Logged In: YES 
user_id=250749

1) wrt _POSIX_THREAD_ATTR_STACKSIZE, I'll look at that
(though I note its absence from the existing code...)

2) PTHREAD_STACK_MIN on FreeBSD is 1k, which seemed grossly
inadequate for Python (my impression is that 20-32k is a
fairly safe minimum for Python).  In principle I don't have
a problem 
with relying on PTHREAD_STACK_MIN, except for trying to play
it safe.  Any further thoughts on this?

I'm also putting together an environment variable only
version of the patch, with a view to getting that in first,
and reworking this patch to work on top of that.

Thanks for the comments.
msg49772 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2006-04-10 15:09
Logged In: YES 
user_id=21627

re 1) Currently, the usage of the stacksize attribute is
depending on the definition of a THREAD_STACK_SIZE macro. I
don't know where that comes from, but I guess whoever
defines it knows what he is doing, so that the stacksize
attribute is defined on such a system.

re 2) I can accept that Python enforces a minimum above
PTHREAD_STACK_MIN; it shouldn't be possible to set the stack
size below PTHREAD_STACK_MIN, since that *will* fail when a
thread is created.

-1 for an environment variable version. What problem would
that solve? If this patch gets implemented, applications can
define their own environment variables if they think it
helps, and users/admins can put something in
sitecustomize.py if they think there should be an
environment variable controlling the stack size for all
Python applications on the system.
msg49773 - (view) Author: Andrew I MacIntyre (aimacintyre) * (Python triager) Date: 2006-04-14 12:51
Logged In: YES 
user_id=250749

I have updated the patch along the lines Martin suggested.

I have omitted OS/2 from the list of supported platforms in
the doc patch as I haven't added OS/2 to anywhere else in
the docs.  My thinging has been that OS/2 is a 2nd tier
platform, and I have kept an extensive port README file in
the build directory (PC/os2emx) documenting port specific
behaviour.

The idea with the environment variable version was that it
would be less "intrusive" a change from the user POV.
msg49774 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2006-04-22 05:46
Logged In: YES 
user_id=31435

The patch applies cleanly on WinXP, "and works" (I checked
this by setting various stack sizes, spawning a thread doing
nothing but a raw_input(), and looking at the VM size under
Task Manager while the thread was paused waiting for input
-- the VM size went up each time roughly by the stack-size
increase; finally set stack_size to 0 again, and all the
"extra" VM went away).

Note that Python C style for defining functions puts the
function name in the first column.  For example,

"""
static int
_pythread_nt_set_stacksize(size_t size)
"""

instead of

"""
static int _pythread_nt_set_stacksize(size_t size)
"""

The patch isn't consistent about this, and perhaps it's
errenously ;-) aping bad style in surrounding function
definitions.

This should really be exposed via threading.py.  `thread` is
increasingly "just an implementation detail" of `threading`,
and it actually felt weird to me to write a test program
that had to import `thread`.
msg49775 - (view) Author: Andrew I MacIntyre (aimacintyre) * (Python triager) Date: 2006-04-23 05:35
Logged In: YES 
user_id=250749

Thanks Tim.

My default action is to try and match the prevailing style,
but  cut'n'paste propagated the flaw.  thread_pthread.h was
clean AFAICS, so I'll do a style normalisation (as a
separate checkin) on thread_nt.py and thread_os2.h when
commit time comes.

As an "implementation detail", I hadn't considered that
exposing it via threading was appropriate.

I can see 2 approaches:
- a simple shadow of the function as a module level function;
or
- a classmethod of the Thread class.

Any hints on which would be the more preferable or natural
approach?
msg49776 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2006-04-23 20:25
Logged In: YES 
user_id=31435

Right, this one: "a simple shadow of the function as a
module level function".  If it affects all threads (which it
does), then a module function is a natural place for it.  If
I a saw a method on the Thread class, the most natural (to
me ;-)) assumption is that a_thread.stack_size(N) would set
the stack size for the specific thread `a_thread`, but not
affect other threads.  Part of what makes that "the most
natural" assumption is that Thread has no class or static
methods today.  As a module-level function, no such
confusion is sanely possible.

Sticking "stack_size" in threading.__all__, and adding

from thread import stack_size

to threading.py is all I'm looking for here.  Well, plus
docs and a test case ;-)
msg49777 - (view) Author: Andrew I MacIntyre (aimacintyre) * (Python triager) Date: 2006-05-25 14:47
Logged In: YES 
user_id=250749

Ok, v3 includes the additions to the threading module, tests
in both test_thread and test_threading and docs in both
thread and threading modules (duplicated as I don't know how
to do the LaTex linking).

If there are no other issues needing to be addressed, I
propose to check these changes in sometime on the weekend of
June 3-4 or thereabouts to get in a bit before the beta release.
msg49778 - (view) Author: Andrew I MacIntyre (aimacintyre) * (Python triager) Date: 2006-06-14 02:05
Logged In: YES 
user_id=250749

Checked in to trunk after further revision in r46919, with
minor test tweaks in r46925 & r46929.
msg49779 - (view) Author: SourceForge Robot (sf-robot) Date: 2006-06-29 02:29
Logged In: YES 
user_id=1312539

This Tracker item was closed automatically by the system. It was
previously set to a Pending status, and the original submitter
did not respond within 14 days (the time period specified by
the administrator of this Tracker).
History
Date User Action Args
2006-03-20 12:37:35aimacintyrecreate