classification
Title: Crash in PyObject_Malloc
Type: crash
Stage:
Components: Interpreter Core
Versions: Python 2.5
process
Status: closed
Resolution: not a bug
Dependencies:
Superseder:
Assigned To:
Nosy List: Rhamphoryncus, fdirosa, grahamd, nnorwitz, terry.reedy, timbishop, vslavik
Priority: normal
Keywords:

Created on 2007-07-21 17:12 by timbishop, last changed 2010-08-04 01:15 by terry.reedy. This issue is now closed.

Files
File name        Uploaded                      Description
gdb.out          timbishop, 2007-07-21 17:12   GDB Output
valgrind.log.gz  vslavik, 2008-05-21 17:53     Valgrind log
Messages (33)
msg32533 - (view) Author: Tim Bishop (timbishop) Date: 2007-07-21 17:12
I'm running the following on Solaris 9 SPARC:

python 2.5
apache 2.2
mod_python 3.3.1
subversion 1.4.4
trac 0.11dev

Trac is a web application that's written in python and is running through apache using mod_python. It also uses the subversion python libraries.

After an indeterminate number of clicks (usually on the order of a minute or two of randomly clicking around) the apache child process dies:

[Sat Jul 21 17:47:27 2007] [error] [client myip] mod_python (pid=15138, interpreter='my.site.com', phase='PythonHandler', handler='trac.web.modpython_frontend'): Application error, referer: http://my.site.com/
[Sat Jul 21 17:47:27 2007] [error] [client myip] ServerName: 'my.site.com', referer: http://my.site.com/
[Sat Jul 21 17:47:27 2007] [error] [client myip] DocumentRoot: '/path/to/docroot', referer: http://my.site.com/
[Sat Jul 21 17:47:27 2007] [error] [client myip] URI: '/trac/', referer: http://my.site.com/
[Sat Jul 21 17:47:27 2007] [error] [client myip] Location: '/trac', referer: http://my.site.com/
[Sat Jul 21 17:47:27 2007] [error] [client myip] Directory: None, referer: http://my.site.com/
[Sat Jul 21 17:47:27 2007] [error] [client myip] Filename: '/path/to/docroot', referer: http://my.site.com/
[Sat Jul 21 17:47:27 2007] [error] [client myip] PathInfo: '/trac/', referer: http://my.site.com/

It's dumped a core file. Examining that with gdb shows a Bus error here:

Core was generated by `/usr/local/sbin/httpd -DLocalConfig -k start'.
Program terminated with signal 10, Bus error.
#0  PyObject_Malloc (nbytes=16) at Objects/obmalloc.c:747
747                             if ((pool->freeblock = *(block **)bp) != NULL) {
(gdb) l
742                              * Pick up the head block of its free list.
743                              */
744                             ++pool->ref.count;
745                             bp = pool->freeblock;
746                             assert(bp != NULL);
747                             if ((pool->freeblock = *(block **)bp) != NULL) {
748                                     UNLOCK();
749                                     return (void *)bp;
750                             }
751                             /*
(gdb) 

Full gdb output is attached.

I've tried disabling pymalloc when building python, but the problem just moves elsewhere. However, with pymalloc enabled it's consistently on this line.

Do you have advice on how to debug this further?
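
For readers following the gdb listing: the failing line pops the head of a pool's free list, where each free block stores the address of the next free block in its first word. Below is a toy Python model of that idea. It is not pymalloc itself; the `Pool` class, the addresses, and the dict standing in for memory are illustrative assumptions. It sketches why an earlier stray write into a freed block may only surface at a later, unrelated allocation:

```python
class Pool:
    """Toy stand-in for a pymalloc pool: free blocks form a singly linked list."""

    def __init__(self, addresses):
        # memory maps a block address to the "next free" pointer stored in it.
        self.memory = {}
        for addr, nxt in zip(addresses, addresses[1:] + [None]):
            self.memory[addr] = nxt
        self.freeblock = addresses[0]

    def malloc(self):
        bp = self.freeblock
        # Rough equivalent of: pool->freeblock = *(block **)bp
        # A KeyError here plays the role of the fault on a bad pointer.
        self.freeblock = self.memory[bp]
        return bp


pool = Pool([0x1000, 0x1010, 0x1020])
first = pool.malloc()            # pops the head of the free list

# Simulate buggy code scribbling over a block that is still free.
pool.memory[0x1010] = 0x1233     # garbage "next" pointer

second = pool.malloc()           # succeeds, but the list head is now garbage
try:
    pool.malloc()                # follows the garbage pointer
    crashed = False
except KeyError:
    crashed = True
```

In this model the fault appears one allocation after the corrupted block is handed out, which is consistent with the crash site being far from whatever code did the damage.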
msg32534 - (view) Author: Neal Norwitz (nnorwitz) * (Python committer) Date: 2007-07-21 21:47
Firstly, take a deep breath.  This probably isn't going to be easy.  This seems like memory corruption, possibly due to threads.  That's a guess, since I don't know if you are running with threads.

The first (and probably faster) alternative is to build python and all the C extension modules --with-pydebug (passed when you run ./configure).  That might catch the error slightly earlier.  It won't necessarily point to the cause of the problem, but may help narrow down the potential causes.

The second alternative is to try running this under valgrind or purify or with some memory debugger.  It seems like there is memory corruption (I'm guessing from mod_python).  This will be more robust at finding the problem nearer to the root cause.  Unfortunately, it might prove too slow to be useful or find the problem.

If you can narrow down the test case that would help a lot.  For example, could you eliminate subversion (presumably using the C extension module) or mod_python?  If you could eliminate one and remove the problem, that would help a lot.
msg32535 - (view) Author: Tim Bishop (timbishop) Date: 2007-07-23 12:46
Ok, I was guessing it was memory corruption, or something else nasty.

I've managed to eliminate subversion, so this just leaves apache, mod_python and python. I've rebuilt with --with-pydebug and it's made the crash happen earlier. It appears to die now in some checking code that only happens when pydebug is used:

Program terminated with signal 6, Aborted.
#0  0xfefa0218 in _lwp_kill () from /usr/lib/libc.so.1
(gdb) bt
#0  0xfefa0218 in _lwp_kill () from /usr/lib/libc.so.1
#1  0xfef50c88 in raise () from /usr/lib/libc.so.1
#2  0xfef36e60 in abort () from /usr/lib/libc.so.1
#3  0xfe2af8d8 in Py_FatalError (msg=0xfe309618 "Invalid thread state for this thread") at Python/pythonrun.c:1559
#4  0xfe2a8990 in PyThreadState_Swap (newts=0x1e8b78) at Python/pystate.c:320
#5  0xfe2593b0 in PyEval_AcquireThread (tstate=0x1e8b78) at Python/ceval.c:252
#6  0xfe644b78 in get_interpreter () from /usr/local/libexec/apache2/mod_python.so
#7  0xfe648f28 in python_cleanup_handler () from /usr/local/libexec/apache2/mod_python.so
#8  0xff2a3e48 in run_cleanups () from /usr/local/lib/libapr-1.so.0
#9  0xff2a4674 in apr_pool_destroy () from /usr/local/lib/libapr-1.so.0
#10 0x0004581c in ap_process_http_connection ()
#11 0x000415e4 in ap_run_process_connection ()
#12 0x0004c7f8 in child_main ()
#13 0x0004cb18 in make_child ()
#14 0x0004cc14 in startup_children ()
#15 0x0004d778 in ap_mpm_run ()
#16 0x00027110 in main ()
(gdb) f 5
#5  0xfe2593b0 in PyEval_AcquireThread (tstate=0x1e8b78) at Python/ceval.c:252
252             if (PyThreadState_Swap(tstate) != NULL)
(gdb) l
247             if (tstate == NULL)
248                     Py_FatalError("PyEval_AcquireThread: NULL new thread state");
249             /* Check someone has called PyEval_InitThreads() to create the lock */
250             assert(interpreter_lock);
251             PyThread_acquire_lock(interpreter_lock, 1);
252             if (PyThreadState_Swap(tstate) != NULL)
253                     Py_FatalError(
254                             "PyEval_AcquireThread: non-NULL old thread state");
255     }
256     
(gdb) f 4
#4  0xfe2a8990 in PyThreadState_Swap (newts=0x1e8b78) at Python/pystate.c:320
320                             Py_FatalError("Invalid thread state for this thread");
(gdb) l
315                        to it, we need to ensure errno doesn't change.
316                     */
317                     int err = errno;
318                     PyThreadState *check = PyGILState_GetThisThreadState();
319                     if (check && check->interp == newts->interp && check != newts)
320                             Py_FatalError("Invalid thread state for this thread");
321                     errno = err;
322             }
323     #endif
324             return oldts;
(gdb) f 3
#3  0xfe2af8d8 in Py_FatalError (msg=0xfe309618 "Invalid thread state for this thread") at Python/pythonrun.c:1559
1559            abort();
(gdb) l
1554            OutputDebugString("\n");
1555    #ifdef _DEBUG
1556            DebugBreak();
1557    #endif
1558    #endif /* MS_WINDOWS */
1559            abort();
1560    }
1561    
1562    /* Clean up and exit */
1563    
(gdb) 

So your guess about threads might well be right. I am compiling with --with-threads. I guess the next step would be to disable threads?

Tim.
msg32536 - (view) Author: Tim Bishop (timbishop) Date: 2007-07-23 13:09
Further info about what's causing the "Invalid thread state for this thread".

#0  0xfefa0218 in _lwp_kill () from /usr/lib/libc.so.1
(gdb) f 4
#4  0xfe2a8990 in PyThreadState_Swap (newts=0x296c58) at Python/pystate.c:320
320                             Py_FatalError("Invalid thread state for this thread");
(gdb) l
315                        to it, we need to ensure errno doesn't change.
316                     */
317                     int err = errno;
318                     PyThreadState *check = PyGILState_GetThisThreadState();
319                     if (check && check->interp == newts->interp && check != newts)
320                             Py_FatalError("Invalid thread state for this thread");
321                     errno = err;
322             }
323     #endif
324             return oldts;
(gdb) p check
$1 = (PyThreadState *) 0x19c880
(gdb) p *check
$2 = {next = 0x0, interp = 0x199f90, frame = 0x0, recursion_depth = 0, tracing = 0, use_tracing = 0, c_profilefunc = 0, c_tracefunc = 0, 
  c_profileobj = 0x0, c_traceobj = 0x0, curexc_type = 0x0, curexc_value = 0x0, curexc_traceback = 0x0, exc_type = 0xfe3457c4, 
  exc_value = 0x0, exc_traceback = 0x0, dict = 0x0, tick_counter = 40, gilstate_counter = 1, async_exc = 0x0, thread_id = 1}
(gdb) p newts
$3 = (PyThreadState *) 0x296c58
(gdb) p *newts
$4 = {next = 0x19c880, interp = 0x199f90, frame = 0x0, recursion_depth = 0, tracing = 0, use_tracing = 0, c_profilefunc = 0, 
  c_tracefunc = 0, c_profileobj = 0x0, c_traceobj = 0x0, curexc_type = 0x0, curexc_value = 0x0, curexc_traceback = 0x0, exc_type = 0x0, 
  exc_value = 0x0, exc_traceback = 0x0, dict = 0x0, tick_counter = 0, gilstate_counter = 1, async_exc = 0x0, thread_id = 1}
(gdb) p oldts
$5 = (PyThreadState *) 0x0
(gdb) 
msg32537 - (view) Author: Tim Bishop (timbishop) Date: 2007-07-23 17:21
And another followup. It fails with the below error on the most basic mod_python test cases:

http://www.modpython.org/live/current/doc-html/inst-testing.html
http://www.modpython.org/live/current/doc-html/inst-trouble.html

I've also removed a large proportion of apache modules.

So it seems there's very little left externally that could be causing this.
msg32538 - (view) Author: Neal Norwitz (nnorwitz) * (Python committer) Date: 2007-07-24 05:29
This gives me an idea.  Are you using threads and generators, perhaps?  Could this be related to Bug #1579370?  http://python.org/sf/1579370  You can find a short discussion on it from python-dev around Jan 22-23.  Martin checked in rev 53531 on trunk.  This change might be in 2.5.1, I don't remember when it came out.  Try rebuilding Python with this patch installed and see if it fixes your problem.

msg32539 - (view) Author: Tim Bishop (timbishop) Date: 2007-07-24 13:01
Hi,

That patch is included in 2.5.1, so I don't think it's that causing the problem.

Tim.
msg32540 - (view) Author: Tim Bishop (timbishop) Date: 2007-07-25 10:25
I'm wondering if this Abort is a red herring. It's caused by the following section in PyThreadState_Swap in Python/pystate.c:

#if defined(Py_DEBUG) && defined(WITH_THREAD)
        if (newts) {
                /* This can be called from PyEval_RestoreThread(). Similar
                   to it, we need to ensure errno doesn't change.
                */
                int err = errno;
                PyThreadState *check = PyGILState_GetThisThreadState();
                if (check && check->interp == newts->interp && check != newts)
                        Py_FatalError("Invalid thread state for this thread");
                errno = err;
        }
#endif  

Specifically this test is true:

check->interp == newts->interp

I'm not convinced this is right, and a friend who's a bit more clued up than me isn't sure either.

Could someone look at it to check?
msg67163 - (view) Author: Vaclav Slavik (vslavik) * Date: 2008-05-21 17:53
I'm able to reliably reproduce this bug (using Apache 2.2.8, otherwise
same as above), although not with mod_python's simple tests, but only
with Trac (apparently, Trac creates some threads while processing the
request).

How to reproduce: configure two Trac/mod_python locations in Apache
config and set them to use different Python interpreters:

    <Location /trac1>
       SetHandler mod_python
       PythonHandler trac.web.modpython_frontend 
       PythonOption TracEnv /srv/bakefile/trac
       PythonOption TracUriRoot /
       PythonInterpreter trac1
    </Location>
    <Location /trac2>
       SetHandler mod_python
       PythonHandler trac.web.modpython_frontend 
       PythonOption TracEnv /srv/bakefile/trac
       PythonOption TracUriRoot /
       PythonInterpreter trac2
    </Location>

(As far as this bug is concerned, this is the same as having two virtual
hosts that both run Trac -- mod_python's default interpreter has the
same name as the (virtual) host.)

Then run Apache as "apache2 -X" to ensure that requests are handled
serially by a single handler, and run:

  $ curl http://your-server/trac1/wiki
  $ curl http://your-server/trac2/wiki

The second command crashes Apache.

If you change mod_python configuration to use the same interpreter names
for both Trac instances, the crash doesn't happen (but of course, that
prevents you from using different versions of Python modules in both
vhosts).

I'm attaching a Valgrind log, but it's not very useful -- the stack
traces aren't deep enough and my server doesn't have enough memory for
a high enough value of --num-callers.
msg67459 - (view) Author: Franco DiRosa (fdirosa) Date: 2008-05-28 16:55
The documentation states that thread states are supported within a 
single interpreter and not supported across other interpreters 
(specifically for the GIL functions which are just wrapper functions 
around the  PyEval_ functions).

So I would have to conclude then that the condition should check to see 
if the swapping thread is within the current interpreter state 
otherwise "fatal error", as such...

The condition: check->interp == newts->interp

should be: check->interp != newts->interp

In other words, if there is a previous thread state and its interpreter 
is NOT the same as the one being swapped in, then do the fatal error.

Just my opinion.  I ran into this problem when using the 
PyThreadState_Swap function directly (low level) to do the thread 
handling within a single interpreter state (Debug mode only).
msg67677 - (view) Author: Adam Olsen (Rhamphoryncus) Date: 2008-06-03 22:44
Does the PythonInterpreter option create multiple interpreters within a
single process, rather than spawning separate processes?

IMO, that API should be ripped out.  They aren't truly isolated
interpreters and nobody I've asked has yet provided a use case for it.
msg67678 - (view) Author: Vaclav Slavik (vslavik) * Date: 2008-06-03 22:58
> Does the PythonInterpreter option create multiple interpreters
> within a single process

Yes.

> They aren't truly isolated interpreters and nobody I've asked has yet 
> provided a use case for it.

If you ignore mod_python and mod_wsgi, then maybe, but mod_python is
*the* use case for this. Running separate process for every web app
and/or every virtual host on your server is expensive in terms of RAM
usage (and this matters if you use virtual server - 256MB or less is not
unusual). On the other hand, you need isolation for independent apps --
some modules may use globals, or you may want to be able to run both
production and testing versions of the same app (i.e. different versions
of the same Python module).
msg67679 - (view) Author: Adam Olsen (Rhamphoryncus) Date: 2008-06-03 23:58
Right, so it's only the python modules loaded as part of the app that
need to be isolated.  You don't need the stdlib or any other part of the
interpreter to be isolated.

This could be done either by not using the normal import mechanism
(build your own on top of exec()) or by some magic to generate a
different root package for each "interpreter" (so you show up in
sys.modules as '_mypkg183.somemodule'.)
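
As a rough illustration of the second idea (a different root package per "interpreter"), here is a minimal sketch that registers the same source under two generated root names in `sys.modules`. The helper `load_under_root` and the names `_mypkg1` and `_mypkg2` are hypothetical, and the sketch covers only a single in-memory module, not a real package on disk or its third-party dependencies:

```python
import sys
import types


def load_under_root(root, source):
    """Register `source` as <root>.somemodule in sys.modules (illustrative only)."""
    package = types.ModuleType(root)
    package.__path__ = []                      # marks the object as a package
    module = types.ModuleType(root + ".somemodule")
    exec(source, module.__dict__)              # run the module body in its own namespace
    package.somemodule = module
    sys.modules[root] = package
    sys.modules[module.__name__] = module
    return module


SOURCE = "settings = {}\n"

app_a = load_under_root("_mypkg1", SOURCE)
app_b = load_under_root("_mypkg2", SOURCE)

# Each copy has its own module-level globals.
app_a.settings["env"] = "production"
app_b.settings["env"] = "testing"
```

The two copies keep separate globals because they are distinct module objects. Anything they import by its ordinary name is still shared.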
msg67685 - (view) Author: Vaclav Slavik (vslavik) * Date: 2008-06-04 06:59
> This could be done either by not using the normal import mechanism

This is a completely unrealistic suggestion. People use libraries and
frameworks in their code; you're in effect suggesting that no library
that could possibly be used in a webapp should use (standard) import.

I may be wrong, but I strongly suggest that you do talk to
mod_python/wsgi people (who know much better than me) about what
real-life uses of isolated interpreters are, before entertaining the
idea of getting rid of the feature.
msg69439 - (view) Author: Adam Olsen (Rhamphoryncus) Date: 2008-07-08 19:56
Apparently modwsgi uses subinterpreters because some third-party
packages aren't sufficiently thread-safe - modwsgi can't fix those
packages, so subinterpreters are the next best thing.

http://groups.google.com/group/modwsgi/browse_frm/thread/988bf560a1ae8147/2f97271930870989

This is a weak argument for language design.  Subinterpreters should be
deprecated, the problems with third-party packages found and fixed, and
ultimately subinterpreters ripped out.

If you wish to improve the situation, I suggest you help fix the
problems in the third-party packages.  For example,
http://code.google.com/p/modwsgi/wiki/IntegrationWithTrac implies trac
is configured with environment variables - clearly not thread-safe.
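
To make the environment-variable concern concrete, here is a small sketch of two applications sharing one interpreter in one process and both storing their configuration in `os.environ`. The `configure` and `current_env` helpers and the `/srv/site-a` and `/srv/site-b` paths are hypothetical; `TRAC_ENV` is used only as an example variable name:

```python
import os


def configure(app_name, env_path):
    """Hypothetical app start-up that stores its config in the environment."""
    os.environ["TRAC_ENV"] = env_path
    return app_name


def current_env():
    """What any code in the same interpreter sees when it reads the setting back."""
    return os.environ["TRAC_ENV"]


configure("site-a", "/srv/site-a/trac")
seen_by_a_before = current_env()

configure("site-b", "/srv/site-b/trac")   # second app in the same interpreter
seen_by_a_after = current_env()           # site-a now reads site-b's path
```

Because the setting is one shared slot, whichever application writes last wins, regardless of whether the two run in different threads or one after the other.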
msg69440 - (view) Author: Vaclav Slavik (vslavik) * Date: 2008-07-08 20:11
I'm sorry, did you actually read my comments? Once again, this has
nothing to do with threads and everything to do with isolation of
independent Python apps running in the same *process*. Hope it got
through this time :-/
msg69442 - (view) Author: Adam Olsen (Rhamphoryncus) Date: 2008-07-08 21:10
Ahh, I did miss that bit, but it doesn't really matter.

Tell modwsgi to only use the main interpreter ("PythonInterpreter
main_interpreter"), and if you want multiple modules of the same name
put them in different packages.  Any other problems (trac using env vars
for configuration) should be fixed directly.

(My previous comment about building your own import mechanism was
overkill.  Writing a package that uses relative imports is enough - in
fact, that's what relative imports are for.)
msg69444 - (view) Author: Franco DiRosa (fdirosa) Date: 2008-07-08 22:04
I believe the PyThreadState_Swap function in pystate.c has a bug, as I stated 
earlier.  However, I have not seen it included in the latest patches so 
now I wonder...

The following line in PyThreadState_Swap...
if (check && check->interp == newts->interp && check != newts)

should read as follows...
if (check && check->interp != newts->interp && check != newts)

since this condition, if true, raises an error.  Why should it raise an 
error if the interpreters are equal across multiple thread states?  
If we have one interpreter with multiple thread states (i.e. a 
multi-threaded application), this function will fail when switching between 
the thread states within the same interpreter (in DEBUG compile mode 
only, since this code is compiled out otherwise).  The forums 
describe using thread states to handle multiple Python threads 
running simultaneously with only one interpreter (the main interpreter), 
not with multiple interpreters.

Also, the interpreters have to be equal, because the documentation for 
the GIL functions says they don't support multiple interpreters.  I 
think this is a typo/bug in the code.
msg69445 - (view) Author: Adam Olsen (Rhamphoryncus) Date: 2008-07-08 22:25
Franco, you need to look at the line above that check:

	PyThreadState *check = PyGILState_GetThisThreadState();
	if (check && check->interp == newts->interp && check != newts)
		Py_FatalError("Invalid thread state for this thread");

PyGILState_GetThisThreadState returns the original tstate *for that
thread*.  What it's asserting is that, if there's a second tstate *in
that thread*, it must be in a different subinterpreter.

It doesn't prevent your second and third tstate from sharing the same
subinterpreter, but it probably should, as this check implies it's an
invariant.
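
The condition can be modelled in a few lines of Python to see which combinations trip it. This is only a sketch of the three-part test quoted above, with stand-in classes rather than the real C structures. The names `swap_is_fatal`, `InterpreterState` and `ThreadState` are made up for illustration:

```python
class InterpreterState:
    """Stand-in for PyInterpreterState; only object identity matters here."""


class ThreadState:
    """Stand-in for PyThreadState with only the field the check reads."""

    def __init__(self, interp):
        self.interp = interp


def swap_is_fatal(check, newts):
    """Model of the Py_DEBUG condition that guards Py_FatalError.

    `check` is the tstate already registered for the calling thread
    (or None), `newts` is the tstate being swapped in.
    """
    return (check is not None
            and check.interp is newts.interp
            and check is not newts)


interp_a = InterpreterState()
interp_b = InterpreterState()

registered = ThreadState(interp_a)     # what PyGILState_GetThisThreadState() would return
second_same = ThreadState(interp_a)    # second tstate, same thread, same interpreter
second_other = ThreadState(interp_b)   # second tstate, same thread, other interpreter
```

Under this model only `swap_is_fatal(registered, second_same)` is true, which matches the gdb session earlier in the thread where `check` and `newts` were different objects with the same `interp`.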
msg69457 - (view) Author: Franco DiRosa (fdirosa) Date: 2008-07-09 04:21
Thanks Adam

but....

I'm still confused because...

There is a new rule in version 2.3.5: one interpreter with many thread 
states is supported for the GIL functions.  So this code breaks that rule, 
since this if statement checks whether the interpreters are different for 
the current GIL state and the new ts, which they can't be (i.e. 
unsupported).  See this email that points to the Python documentation for 
2.3.5 regarding this "new" rule...

http://mail.python.org/pipermail/python-dev/2005-May/053840.html

Here is the extract of the email pertaining to this issue...

The documentation (http://docs.python.org/api/threads.html) states
"Note that the PyGILState_*() functions assume there is only one
global interpreter (created automatically by Py_Initialize()). Python
still supports the creation of additional interpreters (using
Py_NewInterpreter()), but mixing multiple interpreters and the
PyGILState_*() API is unsupported. ", so it looks like that using the
PyGilState_XXX functions in the core threadmodule.c means the
Py_NewInterpreter() call (i.e. multiple interpreters) is no longer
supported when threads are involved.

So regardless if we use the GIL functions or the lower level functions it 
all eventually boils down to this Swap function which has this condition 
that doesn't match what the documentation is stating. So which way is it? 
Can't have it both ways.

My guess is that since 2.3.5 they don't want you to use multiple 
interpreters when threading is involved.

- Franco

msg69458 - (view) Author: Franco DiRosa (fdirosa) Date: 2008-07-09 04:23
Thanks Adam

but....

I'm still confused because...

There is a new rule in version 2.3.5: one interpreter with many thread 
states is supported for the GIL functions.  So this code breaks that 
rule, since this if statement checks whether the interpreters are 
different for the current GIL state and the new ts, which they can't 
be (i.e. unsupported).  See this email that points to the Python 
documentation for 2.3.5 regarding this "new" rule...

http://mail.python.org/pipermail/python-dev/2005-May/053840.html

Here is the extract of the email pertaining to this issue...

The documentation (http://docs.python.org/api/threads.html) states
"Note that the PyGILState_*() functions assume there is only one
global interpreter (created automatically by Py_Initialize()). Python
still supports the creation of additional interpreters (using
Py_NewInterpreter()), but mixing multiple interpreters and the
PyGILState_*() API is unsupported. ", so it looks like that using the
PyGilState_XXX functions in the core threadmodule.c means the
Py_NewInterpreter() call (i.e. multiple interpreters) is no longer
supported when threads are involved.

So regardless if we use the GIL functions or the lower level functions 
it all eventually boils down to this Swap function which has this 
condition that doesn't match what the documentation is stating. So 
which way is it? Can't have it both ways.

My guess is that since 2.3.5 they don't want you to use multiple 
interpreters when threading is involved.
msg69459 - (view) Author: Adam Olsen (Rhamphoryncus) Date: 2008-07-09 05:26
It's only checking that the original tstate *for the current thread* and
the new tstate have a different subinterpreter.  A subinterpreter can
have multiple tstates, so long as they're all in different threads.

The documentation is referring specifically to the PyGILState_Ensure and
PyGILState_Release functions.  Calling these says "I want a tstate, and
I don't know if I had one already".  The problem is that, with
subinterpreters, you may not get a tstate with the subinterpreter you
want.  Subinterpreter references saved in globals may lead to obscure
crashes or other errors - some of these have been fixed over the years,
but I doubt they all have.
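
A toy model may help show why "I want a tstate, and I don't know if I had one already" clashes with subinterpreters. The sketch below is not the CPython implementation. `gilstate_ensure`, the `_auto_tstates` dict and the interpreter labels are hypothetical stand-ins for the per-thread lookup, and all locking and reference counting is omitted:

```python
import threading

MAIN_INTERPRETER = "main"
_auto_tstates = {}   # thread ident -> tstate, standing in for thread-local storage


def gilstate_ensure():
    """Return this thread's tstate, creating one on first use.

    The caller cannot choose an interpreter: in this model a new tstate
    is always bound to the main interpreter.
    """
    ident = threading.get_ident()
    tstate = _auto_tstates.get(ident)
    if tstate is None:
        tstate = {"interp": MAIN_INTERPRETER, "thread": ident}
        _auto_tstates[ident] = tstate
    return tstate


# Code that believes it is running inside a subinterpreter named "trac2".
wanted_interp = "trac2"
first = gilstate_ensure()
again = gilstate_ensure()
```

Repeated calls hand back the same per-thread object, and nothing in the call lets the caller ask for `wanted_interp`, so the result is tied to the main interpreter either way.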
msg69471 - (view) Author: Franco DiRosa (fdirosa) Date: 2008-07-09 14:13
OK,

I think I found my problem.  I was using the main interpreter state 
(the one created by Py_Initialize) to create new thread states with.  
It seems that this interpreter state is reserved for GIL functions so 
one will need to create a new interpreter state with 
PyInterpreterState_New and use that interpreter state when creating the 
cooperating threads.  So now this check makes sense.  If one is 
swapping in a ts belonging to the main interpreter state, it best be 
the GIL thread state.
msg70109 - (view) Author: Graham Dumpleton (grahamd) Date: 2008-07-21 14:03
I know the discussion more or less says this, but I want to add some 
additional information.

For the record, the reason that mod_python crashes with 'Invalid thread state 
for this thread' when Py_DEBUG is defined in part relates to:

  http://issues.apache.org/jira/browse/MODPYTHON-217

Also, that Py_DEBUG check effectively says that if you use simplified GIL API 
for a particular thread against the first interpreter, you are prohibited from 
creating additional thread states for that thread. I haven't checked the 
documentation lately, but I am not sure it is really clear on that specific 
point and so in some respects the documentation may be at fault here. Someone 
might like to point to exact part of documentation which states this 
requirement.

The problem thus is that code which worked prior to Python 2.3 would still work 
with Python 2.3 and later, up to the point that some code decided to use the 
simplified GIL API. At that point Python would create its own internal thread 
state for that thread even if user code had already created one. The same 
conflict arises in the converse case, where the simplified GIL API is used 
against the thread first and user code then tries to create an additional 
thread state for that thread against the first interpreter.

With Py_DEBUG defined, this scenario causes the assertion failure and the above 
error. Without Py_DEBUG defined, the code can quite happily run fine, at least 
until the point where code which left Python using a user thread state object 
attempts to reenter Python by using simplified GIL API. At that point it would 
deadlock.

Now, as I said, that one was effectively forced to use simplified GIL API for 
first interpreter with Python 2.3 probably wasn't at all clear and so 
mod_python was never updated to meet that requirement. As per the JIRA issue 
referenced above it is a known problem that code isn't meeting this 
requirement, but not much development has been done on mod_python for quite a 
while.

I have though recently made changes to personal copy of mod_python code such 
that it uses simplified GIL API for all access against first interpreter and it 
no longer suffers that assertion failure when Py_DEBUG defined. The code also 
should work for any modules which use simplified GIL API, such as SWIG 
generated bindings for Xapian. You do have to force the application using such 
modules to run under first interpreter.

The code for mod_wsgi uses simplified GIL API for first interpreter as well and 
works with SWIG generated bindings, but it is possible that it may still fail 
that assertion when Py_DEBUG is defined. This is because in order to allow 
mod_python and mod_wsgi to be used in Apache at the same time, mod_wsgi had to 
incorporate some hacks to workaround the fact that mod_python was not using 
simplified GIL API for first interpreter, but also because mod_python wasn't 
releasing the GIL for a critical section between when it was initialised and 
Apache child processes were created. It was in this section that mod_wsgi has 
to initialise itself and so it had to fiddle the thread states to be able to do 
its things. This workaround may have been enough to create additional thread 
state of a thread for first interpreter, thus later triggering the assertion.

It would have been nice to have mod_wsgi do the correct thing from the start, 
but that would have barred it being used at same time as mod_python and so 
people may have baulked at trying mod_wsgi as a result. Now that mod_wsgi has 
got some traction, in mod_wsgi version 3.0 it will be changed to remove the 
mod_python fiddle. This will mean that mod_wsgi 3.0 will not be usable at same 
time as current mod_python versions and would only be usable with the 
mod_python version (maybe 3.4) which I have made modifications for to also use 
simplified GIL APIs properly.

So that is the state of play as I see and understand it.

As to Adam's comments about use cases for multiple interpreters, we have had 
that discussion before and despite that many people rely on that feature in 
both mod_python and mod_wsgi he still continues to dismiss it outright and 
instead calls for complete removal of the feature.

Also Adam's comments that multiple interpreters were used in mod_wsgi only to 
support buggy third party software, that is untrue. Multiple interpreter 
support exists in mod_wsgi because mod_python provided a similar feature and 
mod_python existed before many of the Python web applications which are claimed 
to be the reason that sub interpreters are used in the first place. So, 
mod_python and use of sub interpreters came first, and not really the other way 
around. Where Python web applications do rely on os.environ it is historically 
because that is how things were done in CGI. Many such as Trac may still 
support that means of configuration as a fall back, but Trac now also supports 
other ways which are thread safe.

All up, use of distinct sub interpreters in mod_python and mod_wsgi is a valid 
concept in web hosting, one reason being because of the fact that it is simpler 
to use one Apache instance than many. Despite claims that sub interpreter 
support is completely broken it works fine for that use case because of the 
nature of web applications and the transient nature of individual requests and 
how sub interpreters persist for the life of the process and are not destroyed 
within the lifetime of the process.

Unfortunately some are quite blinkered to this and haven't taken the time to 
understand how it is being used. This is quite unfortunate, and rather than 
ripping out multiple interpreters, time might be better spent on improving it 
to work better as necessary if certain areas are still a problem.
msg70111 - (view) Author: Franco DiRosa (fdirosa) Date: 2008-07-21 15:39
"Also, that Py_DEBUG check effectively says that if you use simplified 
GIL API for a particular thread against the first interpreter, you are 
prohibited from creating additional thread states for that thread."

I found that you cannot create additional thread states against the 
first interpreter and swap between them w/o this assertion occurring.  
I didn't use the GIL functions at all and had this issue in debug.  
Py_Initialize initializes the GIL and hijacks the main interpreter.  We 
always call Py_Initialize, so does that mean we can only use the GIL 
functions with the main interpreter, and nothing else, when 
locking/unlocking the global lock, as you seem to imply?  Does that mean 
there is a backward compatibility issue here for those who used the 
main interpreter only and created thread states from it to handle 
multi-threading, like I did (through the use of PyEval_Acquire/Release & 
PyThreadState_Swap)?
msg70112 - (view) Author: Franco DiRosa (fdirosa) Date: 2008-07-21 15:42
By the way, I switched to using the GIL functions on the main 
interpreter and everything works great now.  It is a better solution 
because I previously needed my own code to prevent the deadlock that 
occurs when a python script calls back into the extension module and 
that ends up calling PyEval_Acquire again, even though it is the same 
thread.  Now with the GIL functions I don't need that code.  It is a 
good feature but it broke my previous implementation and it is not 
obvious why.
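
Editor's sketch of the re-entrancy point above: a toy, single-threaded model of why a counting "ensure/release" pair can be nested on one thread while a plain acquire cannot. All names here (toy_thread, toy_ensure, toy_release, toy_plain_acquire) are hypothetical and none of this is CPython source; it only assumes, as PyGILState_Ensure is documented to do, that the ensure call reports whether the caller already held the lock so that the matching release can restore that state.

```c
#include <stdbool.h>

/* Hypothetical toy model, NOT CPython code. */
typedef enum { TOY_LOCKED, TOY_UNLOCKED } toy_gilstate;

typedef struct {
    bool holds_lock; /* does this thread currently hold the single lock? */
    int  depth;      /* number of outstanding toy_ensure() calls */
} toy_thread;

/* Take the lock only if this thread does not already hold it, and
   return the state on entry so that toy_release() can restore it. */
toy_gilstate toy_ensure(toy_thread *t) {
    toy_gilstate prev = t->holds_lock ? TOY_LOCKED : TOY_UNLOCKED;
    if (!t->holds_lock) {
        t->holds_lock = true; /* a real lock would block here if contended */
    }
    t->depth++;
    return prev;
}

/* Undo one toy_ensure(); only the outermost call gives the lock up. */
void toy_release(toy_thread *t, toy_gilstate prev) {
    t->depth--;
    if (prev == TOY_UNLOCKED) {
        t->holds_lock = false;
    }
}

/* A plain, non-counting acquire. Returns false in the situation where
   a real non-recursive mutex would deadlock waiting on itself. */
bool toy_plain_acquire(toy_thread *t) {
    if (t->holds_lock) {
        return false;
    }
    t->holds_lock = true;
    return true;
}
```

In this model a callback that runs while the lock is held can call toy_ensure() again safely, whereas toy_plain_acquire() reports the self-deadlock case that otherwise needs hand-written guard code.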
msg70113 - (view) Author: Adam Olsen (Rhamphoryncus) Date: 2008-07-21 17:22
Graham, I appreciate the history of sub-interpreters and how entrenched
they are.  Changing those practises requires a significant investment. 
This is an important factor to consider.

The other factor is the continuing maintenance and development cost. 
Subinterpreters add substantial complexity, which I can personally vouch
for.  This is exhibited in the GIL API not supporting them properly and
in the various bugs that have been found over the years.

Imagine, for a moment, that the situation were reversed; that everything
were built on threading.  Would you consider even for a moment adding
sub-interpreters?  How could you justify it?

It's not a decision to be taken lightly, but my preference is clear:
bite the bullet, make the change.  It's easier in the long run.
msg70187 - (view) Author: Graham Dumpleton (grahamd) Date: 2008-07-24 02:15
Franco, you said 'I found that you cannot create additional thread 
states against the first interpreter and swap between them w/o this 
assertion occurring. ...'

Since the Py_DEBUG check compares against the thread state object 
created by the simplified GIL state API, technically you could have a 
thread with multiple thread states; that thread just can't ever use, or 
have used, the simplified GIL state API.

Take for example a system where threads are actually foreign threads and 
not created within Python. In this case the simplified GIL state API 
thread state object would never have been created for that thread. For 
those you could have multiple thread states and not trip the test.

In other words, multiple thread states are only blocked if one of them 
is the internal one created by the simplified GIL state API. This is 
getting hard to avoid though.

In summary, the simplified GIL state API is basically viral in nature.
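
Editor's sketch of the condition described above, written as a small predicate. The types and the function name (toy_interp, toy_tstate, toy_swap_trips_check) are hypothetical stand-ins, not the CPython structs, and the "same interpreter" clause is an assumption taken from the quoted remark about the first interpreter rather than from the Python sources.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the CPython structs. */
typedef struct { int id; } toy_interp;
typedef struct { toy_interp *interp; } toy_tstate;

/* Returns true when swapping in `newts` would trip the debug check.
   `auto_ts` is the thread state that the simplified GIL state API has
   recorded for the calling thread, or NULL if that API has never been
   used on this thread (for example, a purely foreign thread). */
bool toy_swap_trips_check(const toy_tstate *auto_ts, const toy_tstate *newts) {
    if (newts == NULL) {
        return false; /* swapping to "no thread state" is never checked */
    }
    return auto_ts != NULL                    /* API was used on this thread */
        && auto_ts->interp == newts->interp   /* same interpreter (assumed) */
        && auto_ts != newts;                  /* but a different thread state */
}
```

Read this way, a thread only fails the test once the simplified API has created a thread state for it, which is the sense in which the API is described as viral.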
msg70201 - (view) Author: Franco DiRosa (fdirosa) Date: 2008-07-24 12:30
I'm unsure if you are understanding what I'm doing so here is the 
story...

I stepped through Py_Initialize and this function takes the main 
interpreter and its initial thread state and makes that the GIL thread 
state.

The following code in Py_Initialize hijacks the main interpreter and 
thread state for GIL use...

/* auto-thread-state API, if available */
#ifdef WITH_THREAD
    _PyGILState_Init(interp, tstate);
#endif /* WITH_THREAD */

WITH_THREAD is defined since I'm using multithreading in my application.

So now if you create thread states from the main interpreter and use 
PyEval_Acquire/Release and PyThreadState_Swap, you will get the 
assertion when compiled with the DEBUG option. If you use the 
PyGILState_Ensure and PyGILState_Release functions, you don't. 

What I'm doing is that I have a Windows application with embedded 
python.  The application spawns multiple threads, each running a python 
script.  Each application thread has its own unique PyThreadState 
created from the main interpreter, because I wanted all the modules 
loaded only once to conserve resources (thus only one interpreter).  I 
used PyEval_Acquire/Release and PyThreadState_Swap to swap in each 
application thread's thread state whenever that thread used the python 
API.  This worked great in RELEASE compilation but in DEBUG it 
asserted.  Now that I use the GIL functions it works well, and not only 
that, I removed the code I had put in myself to handle python callbacks 
into the application and to avoid deadlocks from a thread calling 
PyEval_Acquire while it already holds the lock (the lock is a mutex 
that does no reference counting, so the thread could deadlock waiting 
on itself).
msg70206 - (view) Author: Graham Dumpleton (grahamd) Date: 2008-07-24 15:11
I do understand.

The initial thread, which is effectively a foreign thread to Python to 
begin with, is treated in a special way when it is used to initialise 
Python, ie., to call Py_Initialize(): as a side effect, that call 
initialises the GIL internal thread state for it. This is as you say. 
But this is the only foreign thread for which this occurs implicitly, 
which is why the main thread is a bit special.

If you were to create additional foreign threads outside of Python, ie., 
in addition to the main thread which initialised it, those later threads 
should not fail the Py_DEBUG test unless the code they execute 
explicitly calls the simplified API and by doing so implicitly causes 
an internal thread state for that thread to be created.

Hope this makes sense. Sorry, in a bit of a hurry.
msg112642 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2010-08-03 18:12
Is this still an issue for 2.7 or 3.x?
Is it actually a Python issue or should it be closed?
msg112731 - (view) Author: Graham Dumpleton (grahamd) Date: 2010-08-04 00:43
The actual reported problem was likely because of known issues with running subversion Python wrappers in a sub interpreter.

The rest of the conversation was for a completely different issue which relates to mod_python not using thread APIs in Python in the required manner.

In both cases it is a package distinct from Python itself.

The only fault in Python is the inadequate documentation describing the intricacies of using threading APIs in Python, especially in relation to sub interpreters and to user created thread state objects against the main interpreter.

As far as the original issue is concerned, however, it should be able to be closed.
msg112732 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2010-08-04 01:15
OK. If Graham or anyone has concrete suggestions for improving the current 3.2a1 doc for threading, open a fresh issue.
History
Date User Action Args
2010-08-04 01:15:34 terry.reedy set status: open -> closed; resolution: not a bug; messages: + msg112732
2010-08-04 00:43:38 grahamd set messages: + msg112731
2010-08-03 18:12:23 terry.reedy set nosy: + terry.reedy; messages: + msg112642
2008-07-24 15:11:56 grahamd set messages: + msg70206
2008-07-24 12:30:48 fdirosa set messages: + msg70201
2008-07-24 02:15:03 grahamd set messages: + msg70187
2008-07-21 17:22:54 Rhamphoryncus set messages: + msg70113
2008-07-21 15:42:09 fdirosa set messages: + msg70112
2008-07-21 15:39:23 fdirosa set messages: + msg70111
2008-07-21 14:03:29 grahamd set nosy: + grahamd; messages: + msg70109
2008-07-09 14:13:12 fdirosa set messages: + msg69471
2008-07-09 05:26:03 Rhamphoryncus set messages: + msg69459
2008-07-09 04:23:27 fdirosa set messages: + msg69458
2008-07-09 04:21:21 fdirosa set messages: + msg69457
2008-07-08 22:25:27 Rhamphoryncus set messages: + msg69445
2008-07-08 22:05:00 fdirosa set messages: + msg69444
2008-07-08 21:10:10 Rhamphoryncus set messages: + msg69442
2008-07-08 20:11:05 vslavik set messages: + msg69440
2008-07-08 19:56:46 Rhamphoryncus set messages: + msg69439
2008-06-04 06:59:32 vslavik set messages: + msg67685
2008-06-04 00:50:15 benjamin.peterson set type: crash
2008-06-03 23:58:29 Rhamphoryncus set messages: + msg67679
2008-06-03 22:58:31 vslavik set messages: + msg67678
2008-06-03 22:44:47 Rhamphoryncus set nosy: + Rhamphoryncus; messages: + msg67677
2008-05-28 16:55:22 fdirosa set nosy: + fdirosa; messages: + msg67459
2008-05-21 17:53:08 vslavik set files: + valgrind.log.gz; nosy: + vslavik; messages: + msg67163
2007-07-21 17:12:25 timbishop create