Issue6763
Created on 2009-08-22 21:26 by santagada, last changed 2022-04-11 14:56 by admin. This issue is now closed.
Files | ||
---|---|---|
File name | Uploaded | Description |
threadboom.py | santagada, 2009-08-23 02:49 | |
Messages (9) | |||
---|---|---|---|
msg91876 - (view) | Author: Leonardo Santagada (santagada) | Date: 2009-08-22 21:26 | |
Python 2.6.2 (and the maint branch, if using the old mimetypes.py) crashes with a bus error on Mac OS X (10.5.7 & 10.5.8) with the file I posted. The problem appears to be in the allocation of memory by the GC. I call mimetypes.guess_type in more than one thread at the same time, and I guess that what is happening is this:

1. The first thread to run notices that mimetypes.inited is false, so it calls its init function.
2. Somehow the first thread loses the GIL while still executing init.
3. Another thread tries to execute guess_type; as the module is already marked as inited, guess_type calls itself, in vain, because init still hasn't exchanged it for the new function, so it goes into recursion.
4. Somehow the allocator fails during the recursion.

Here are the final pieces of my stack trace (it's a very long sequence of recursions into guess_type):

Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_PROTECTION_FAILURE at address: 0xb0000ffc
[Switching to process 61544 thread 0x117]
0x96912122 in szone_malloc ()
#0 0x96912122 in szone_malloc ()
#1 0x969120d8 in malloc_zone_malloc ()
#2 0x9691206c in malloc ()
#3 0x0006f32c in PyObject_Malloc (nbytes=376) at Objects/obmalloc.c:913
913 return (void *)malloc(nbytes);
#4 0x0006fe61 in _PyObject_DebugMalloc (nbytes=360) at Objects/obmalloc.c:1347
1347 p = (uchar *)PyObject_Malloc(total);
#5 0x00149b13 in _PyObject_GC_Malloc (basicsize=344) at Modules/gcmodule.c:1351
1351 g = (PyGC_Head *)PyObject_MALLOC(
#6 0x00149c24 in _PyObject_GC_NewVar (tp=0x193500, nitems=5) at Modules/gcmodule.c:1383
1383 PyVarObject *op = (PyVarObject *) _PyObject_GC_Malloc(size);
#7 0x00048a06 in PyFrame_New (tstate=0x33df30, code=0x473148, globals=0x48e380, locals=0x0) at Objects/frameobject.c:642
642 f = PyObject_GC_NewVar(PyFrameObject, &PyFrame_Type,
#8 0x00100816 in PyEval_EvalCodeEx (co=0x473148, globals=0x48e380, locals=0x0, args=0x374fb4, argcount=2, kws=0x374fbc, kwcount=0, defs=0x4a6f9c, defcount=1, closure=0x0) at Python/ceval.c:2755
2755 f = PyFrame_New(tstate, co, globals, locals); |
|||
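The mechanism guessed at in msg91876 can be modelled in a few lines. This is only a toy sketch under stated assumptions, not the actual mimetypes source: `lookup`, `init`, `_real_lookup` and `call_during_init_window` are hypothetical names, and the race window is simulated in a single thread by setting the flag without rebinding the function. It is written for Python 3, where hitting the recursion limit raises `RecursionError`.

```python
inited = False


def _real_lookup(name):
    """Stand-in for the fully initialised lookup (hypothetical)."""
    return "text/plain" if name.endswith(".txt") else None


def lookup(name):
    """Module-level stub: initialise on first use, then re-dispatch by name."""
    if not inited:
        init()
    # Resolves the global name again; safe only once init() has rebound it.
    return lookup(name)


_stub = lookup  # keep a handle on the original stub


def init():
    """Set the flag, then swap the stub for the real function."""
    global inited, lookup
    inited = True
    # In the scenario described above, file IO would happen here and could
    # release the GIL before the rebinding on the next line.
    lookup = _real_lookup


def call_during_init_window():
    """Simulate a second thread calling the stub after the flag is set
    but before the stub has been replaced."""
    global inited, lookup
    saved = (inited, lookup)
    inited, lookup = True, _stub  # flag set, stub not yet replaced
    try:
        _stub("notes.txt")
    except RecursionError:
        return "RecursionError"
    finally:
        inited, lookup = saved  # restore the module state
    return "ok"
```

Calling `call_during_init_window()` exercises the window deterministically; an ordinary first call to `lookup("notes.txt")` outside that window initialises and dispatches normally.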
msg91883 - (view) | Author: R. David Murray (r.david.murray) * | Date: 2009-08-23 02:15 | |
Seems like issue 6626 could be helpful here. |
|||
msg91885 - (view) | Author: Leonardo Santagada (santagada) | Date: 2009-08-23 02:49 | |
Well, the mimetypes module from the 2.6 maintenance branch makes this problem not show up with mimetypes.guess_type, but I still think this is a bug, because pure Python code should not crash the interpreter, right? I'm attaching the file I mentioned; I hope this counts as a test (it needs the mimetypes module from Python 2.6.2). I can probably extract just the needed functions from the old mimetypes module if requested. |
|||
msg93828 - (view) | Author: Nick Coghlan (ncoghlan) * | Date: 2009-10-10 13:24 | |
The thread safety problem comes from the fact that performing file IO, as mimetypes.init() does, will release the GIL. If you want to ensure thread safety in that context, you have to do your own locking. mimetypes ignores this thread synchronisation problem completely, with unhelpful results.

An attempt was made to address the race condition in r72045 by eliminating the infinite recursion. Instead, you just get init() being invoked multiple times on different MimeTypes instances, with the last one "winning" and being kept as the _db module global (which does eliminate the crash, but has problems of its own).

This crash probably involves hitting the recursion limit, and it's always a bit dicey as to whether we actually manage to trap that before the C stack goes boom. This case is an infinite recursion in an __init__() method, leading to an unbounded number of objects of a given type being allocated. That's a bit special, since it raises the prospect of running out of heap memory (although that's unlikely with the default recursion limit unless there are an awful lot of threads involved).

To check the simple failure mechanism, I tried threading out the following:

    class Broken():
        def __init__(self):
            break_it()

    def break_it():
        Broken()

    from threading import Thread
    threads = [Thread(target=break_it) for x in range(100)]
    for t in threads:
        t.start()

On my machine, the threads fail with "RuntimeError: maximum recursion depth exceeded" for a recursion limit of 1000 or 10000, but segfault at 100,000. However, the 100k recursion limit segfaults even if I only use a single thread (i.e. call break_it() directly without involving the threading module at all).

For the OP: What value do you get for sys.getrecursionlimit()? Do you still get the segfault if you use sys.setrecursionlimit() to lower the maximum allowed level of recursion (e.g. limit it to 200 or 500 recursions)? |
|||
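The "do your own locking" advice in msg93828 could look roughly like the sketch below. It is one possible pattern, not the fix that was applied to mimetypes: `get_db`, `guess`, `_load_db`, `stats` and `run_concurrently` are hypothetical names, and the short sleep merely stands in for file IO that releases the GIL. The key point is that the shared global is only assigned once the table is fully built, and the build happens under a lock.

```python
import threading
import time

_init_lock = threading.Lock()
_db = None                      # published only once fully built
stats = {"loads": 0}            # counts how often the slow load ran


def _load_db():
    """Pretend to read type tables from disk (hypothetical data)."""
    stats["loads"] += 1
    time.sleep(0.01)            # stands in for file IO that releases the GIL
    return {".txt": "text/plain", ".html": "text/html"}


def get_db():
    """Return the shared table, building it at most once."""
    global _db
    if _db is None:                     # fast path once initialised
        with _init_lock:
            if _db is None:             # re-check while holding the lock
                _db = _load_db()
    return _db


def guess(name):
    """Look up a type by file suffix; returns None if unknown."""
    for suffix, mime in get_db().items():
        if name.endswith(suffix):
            return mime
    return None


def run_concurrently(n_threads):
    """Call guess() from several threads at once and collect the results."""
    results = []

    def worker():
        results.append(guess("notes.txt"))

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

With this shape, threads that arrive while the load is in progress block on the lock instead of re-entering the initialisation or observing a half-initialised module.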
msg93830 - (view) | Author: Leonardo Santagada (santagada) | Date: 2009-10-10 15:46 | |
I'm on OS X 10.6, where threadboom.py doesn't segfault anymore, at least on the system-provided Python. The problem that I see is that it shouldn't be segfaulting on Mac OS X 10.5 with the default recursion limit (I think it is 1000) with 2 threads. IIRC, in a simple recursive function (not in an object's __init__ like you did) I could set 10000 or more as the recursion limit and still get a traceback, so I thought that 2 threads, each with a recursion limit of 1000, should not be using the whole stack. Also, I think I did try to raise the stack limit with ulimit, but I could be wrong.

Nick Coghlan, did you do your experiments on OS X 10.5? Can you try threadboom.py on a Python from before the corrected mimetypes lib landed (somewhere between 2.6.2 and 2.6.3), or with an old version of the mimetypes lib? I got the same errors on both my old white MacBook (Core Duo) and a MacBook Pro (Core 2 Duo), and with both Python 2.6.2 and 2.6.3 using the old mimetypes lib.

I was worried about this bug because I guessed that maybe there is a race condition of some sort in object creation in Python 2.6. If someone can reproduce the bug, understand it, and tell me it is a stack size problem, I will rest my case and be happy with the segfault :).

PS: I will try to compile Python 2.6.2 here and reproduce the errors; if I can, I will reply with more info. |
|||
msg93833 - (view) | Author: Nick Coghlan (ncoghlan) * | Date: 2009-10-10 16:41 | |
Knew I forgot to mention something - I'm not on OS X at all (Linux, Ubuntu 8.04). I was only looking at this bug because RDM cross-linked it to the mimetypes patch I was reviewing this evening. Running the threadboom code, it passes fine for me on all of SVN head, the 2.6 maintenance branch and the system Python (2.5.2). I've added the OS X maintainer to the nosy list. |
|||
msg105081 - (view) | Author: Ronald Oussoren (ronaldoussoren) * | Date: 2010-05-05 19:47 | |
The script works fine for me (OSX 10.6.3, /usr/bin/python2.5, /usr/bin/python2.6, a recent build of 2.6.x, a recent build of 3.2 and the trunk).

The break_it example in msg93828 works in 64-bit binaries, and fails on 32-bit ones. This is almost certainly a stack overrun: I can remove the crash by increasing the stack size using thread.stack_size(N) for a sufficiently large value of N. (I don't mention a value for N because I don't know yet what the minimum size is to avoid the crash.)

There are three possible actions w.r.t. this:

1) Ignore the issue (users can call thread.stack_size when they want deep recursion in threads)
2) Reduce the recursion limit on OSX to something that fits in the default stack size of pthread
3) Increase the default stack size for new threads.

I have a slight preference for the first choice, although the last choice would be fine too. Reducing the recursion limit would also harm code that uses deep recursion on the main thread, which is why I'd be -1 on that. |
|||
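The first option in msg105081, letting callers enlarge the stack themselves, might be used along these lines. This is a minimal sketch for Python 3, where the function is spelled `threading.stack_size()`; the 64 MiB figure is an arbitrary assumption rather than a measured minimum, and `depth_reached` and `run_deep` are hypothetical helper names. On platforms that reject the requested size, `threading.stack_size()` raises `ValueError` or `RuntimeError`.

```python
import sys
import threading


def depth_reached(n):
    """Recurse n levels deep and return the depth actually reached."""
    return 1 if n <= 1 else 1 + depth_reached(n - 1)


def run_deep(n, stack_bytes=64 * 1024 * 1024):
    """Run depth_reached(n) in a new thread that has an enlarged stack."""
    result = {}

    def worker():
        try:
            result["depth"] = depth_reached(n)
        except RecursionError as exc:
            result["error"] = exc

    old_limit = sys.getrecursionlimit()
    old_stack = threading.stack_size()      # 0 means "platform default"
    threading.stack_size(stack_bytes)       # applies to threads created later
    sys.setrecursionlimit(max(old_limit, n + 100))
    try:
        t = threading.Thread(target=worker)
        t.start()
        t.join()
    finally:
        # Restore the global settings for the rest of the program.
        sys.setrecursionlimit(old_limit)
        threading.stack_size(old_stack)
    return result
```

The stack size has to be set before the thread is created, and raising the recursion limit without also raising the stack size is what risks the hard crash discussed in this issue.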
msg114792 - (view) | Author: R. David Murray (r.david.murray) * | Date: 2010-08-24 14:41 | |
Issue1454481, which introduced the ability to set the thread stack size, indicates that the FreeBSD port maintainers were bumping the default limit higher. So I think (3) is probably the correct solution. |
|||
msg119211 - (view) | Author: Ronald Oussoren (ronaldoussoren) * | Date: 2010-10-20 14:56 | |
I'm closing this as a duplicate of #9670, that is: too deep recursion in a thread doesn't trigger the appropriate exception but causes a hard crash instead. I have attached a patch to that issue (but haven't applied it yet; I'd like someone else to look at the patch as well). BTW, I don't think this issue is serious enough to warrant a backport to 2.6. |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:56:52 | admin | set | github: 51012 |
2010-10-20 14:56:28 | ronaldoussoren | set | status: open -> closed resolution: duplicate messages: + msg119211 |
2010-08-24 14:41:39 | r.david.murray | set | messages: + msg114792 |
2010-05-05 19:47:55 | ronaldoussoren | set | messages: + msg105081 |
2010-05-03 06:23:55 | belopolsky | set | assignee: ronaldoussoren components: + macOS nosy: ronaldoussoren, ncoghlan, r.david.murray, jab, jrus, santagada |
2009-10-10 16:41:14 | ncoghlan | set | nosy: + ronaldoussoren messages: + msg93833 |
2009-10-10 15:46:56 | santagada | set | messages: + msg93830 |
2009-10-10 13:24:49 | ncoghlan | set | nosy: + ncoghlan messages: + msg93828 |
2009-08-23 02:49:23 | santagada | set | files: + threadboom.py messages: + msg91885 |
2009-08-23 02:17:03 | r.david.murray | set | nosy: + jrus |
2009-08-23 02:15:54 | r.david.murray | set | priority: high nosy: + r.david.murray messages: + msg91883 stage: test needed |
2009-08-23 00:46:28 | jab | set | nosy: + jab |
2009-08-22 21:26:34 | santagada | create |