classification
Title: Object allocation stress leads to segfault on RHEL
Type: crash
Stage:
Components: Interpreter Core
Versions: Python 2.4, Python 2.6, Python 2.5
process
Status: closed
Resolution: not a bug
Dependencies:
Superseder:
Assigned To:
Nosy List: ajg, ebfe, farshad, pitrou
Priority: normal
Keywords:

Created on 2008-12-23 20:12 by ajg, last changed 2009-02-05 16:55 by farshad. This issue is now closed.

Files
File name Uploaded Description
python_memtest.tbz ajg, 2008-12-23 20:12 memory stress test
Messages (9)
msg78249 - Author: Andrew (ajg) Date: 2008-12-23 20:12
Allocating large numbers of string objects has been causing Python to
segfault on RHEL.  Originally detected when sending large data
structures over XMLRPC, but it also happens when appending large numbers
of small strings to a list and calling join on the list.

-- Crash is always a segmentation fault when accessing the free list to
allocate a python object (obmalloc.c) or an invalid pointer when
accessing a string object during a join operation (stringobject.c)
-- Happens on RHEL with latest updates and no modified RPMs.
-- Also happens with Python built from source on RHEL machine.
-- Crash happens on RHEL machines running on different hardware.
-- Reproducible with Python 2.4, 2.5, 2.6 and RHEL 5.1 and 5.2.
-- Problem not seen on FreeBSD 6.2 or 7.0.

Attached to the bug report is a script that is capable of recreating the
problem on various RHEL and Python versions.  If running three instances
of the script, usually one of them will segfault within 30 minutes.
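
The attached script is not reproduced in this thread. As a rough
illustration only, the allocation pattern described above (append many
small strings to a list, then join them) might be sketched as follows.
Everything here is an assumption rather than the contents of
python_memtest.tbz: the function names, the 17-character piece size, the
counts, and the use of threads (suggested only by the t_bootstrap frame in
the backtrace below). It is written so that it also runs on current Python
versions.

```python
import threading


def build_and_join(n_strings, piece="x" * 17):
    """Append many small strings to a list, then join them in one call."""
    parts = []
    for i in range(n_strings):
        # Build a fresh small string each iteration instead of reusing
        # one object, so that the object allocator is actually exercised.
        parts.append(piece + str(i % 10))
    return "".join(parts)


def stress(n_threads=3, n_strings=100000, rounds=1):
    """Run build_and_join concurrently; return the joined lengths."""
    lengths = []

    def worker():
        for _ in range(rounds):
            lengths.append(len(build_and_join(n_strings)))

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return lengths


if __name__ == "__main__":
    # Hypothetical parameters; the original report ran for up to 30 minutes.
    print(stress(n_threads=3, n_strings=100000, rounds=10))
```

On an unaffected system this simply completes; it is not claimed to
reproduce the crash.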

Below is a backtrace from the crash.  This is in a call to join(),
concatenating a very long sequence of strings.  While iterating over the
sequence of string objects to get the memory needed, one of the string
objects appears to be located in invalid memory, resulting in a segfault.

#0  string_join (self=0xb7ee3098, orig=0xb08696c)
    at Objects/stringobject.c:1776
#1  0x008d3749 in PyEval_EvalFrameEx (f=0xb29499c, throwflag=0)
    at Python/ceval.c:3561
#2  0x008d4922 in PyEval_EvalCodeEx (co=0xb7c690b0, globals=0xb7edcdfc,
    locals=0x0, args=0x9dc29dc, argcount=2, kws=0x9dc29e4, kwcount=0,
    defs=0xb7c6cb78, defcount=1, closure=0x0) at Python/ceval.c:2836
#3  0x008d284c in PyEval_EvalFrameEx (f=0x9dc2884, throwflag=0)
    at Python/ceval.c:3669
#4  0x008d32bd in PyEval_EvalFrameEx (f=0xaef0414, throwflag=0)
    at Python/ceval.c:3659
#5  0x008d4922 in PyEval_EvalCodeEx (co=0xa2f2920, globals=0xa2d90b4,
    locals=0x0, args=0xab29710, argcount=1, kws=0xab29714, kwcount=3,
    defs=0xa2f3420, defcount=4, closure=0x0) at Python/ceval.c:2836
#6  0x008d284c in PyEval_EvalFrameEx (f=0xab295ac, throwflag=0)
    at Python/ceval.c:3669
#7  0x008d32bd in PyEval_EvalFrameEx (f=0xa936f614, throwflag=0)
    at Python/ceval.c:3659
#8  0x008d4922 in PyEval_EvalCodeEx (co=0x8f1b218, globals=0x8f1546c,
    locals=0x0, args=0xb347338, argcount=1, kws=0xb34733c, kwcount=0,
    defs=0x8f17c98, defcount=1, closure=0x0) at Python/ceval.c:2836
#9  0x008d284c in PyEval_EvalFrameEx (f=0xb3471ec, throwflag=0)
    at Python/ceval.c:3669
#10 0x008d32bd in PyEval_EvalFrameEx (f=0xaa322eac, throwflag=0)
    at Python/ceval.c:3659
#11 0x008d32bd in PyEval_EvalFrameEx (f=0xad9c8b4, throwflag=0)
    at Python/ceval.c:3659
#12 0x008d32bd in PyEval_EvalFrameEx (f=0xaa3e62c, throwflag=0)
    at Python/ceval.c:3659
#13 0x008d32bd in PyEval_EvalFrameEx (f=0xa93ef14, throwflag=0)
    at Python/ceval.c:3659
#14 0x008d32bd in PyEval_EvalFrameEx (f=0xa6422c4, throwflag=0)
    at Python/ceval.c:3659
#15 0x008d4922 in PyEval_EvalCodeEx (co=0x8f28e30, globals=0x8f23604,
    locals=0x0, args=0xa528cd8, argcount=1, kws=0x0, kwcount=0, defs=0x0,
    defcount=0, closure=0x0) at Python/ceval.c:2836
#16 0x0087145a in function_call (func=0x8f2bf7c, arg=0xa528ccc, kw=0x0)
    at Objects/funcobject.c:517
#17 0x0084e917 in PyObject_Call (func=0x13e01, arg=0xa528ccc, kw=0x0)
    at Objects/abstract.c:1861
#18 0x008561a5 in instancemethod_call (func=0x9694284, arg=0xa528ccc,
    kw=0x0) at Objects/classobject.c:2519
#19 0x0084e917 in PyObject_Call (func=0x13e01, arg=0xb7ee302c, kw=0x0)
    at Objects/abstract.c:1861
#20 0x008cc67c in PyEval_CallObjectWithKeywords (func=0x9694284,
    arg=0xb7ee302c, kw=0x0) at Python/ceval.c:3442
#21 0x00903394 in t_bootstrap (boot_raw=0xaaedf98)
    at ./Modules/threadmodule.c:424
#22 0x00cf745b in start_thread () from /lib/libpthread.so.0
#23 0x00c2fc4e in clone () from /lib/libc.so.6 

Here is the top of another backtrace that occurs when accessing a free
list to allocate a Python object:

#0  0x0808825b in PyObject_Malloc (nbytes=41) at Objects/obmalloc.c:747
#1  0x0808d998 in PyString_FromStringAndSize (str=0x0, size=17)
    at Objects/stringobject.c:75 
...
msg78384 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-12-27 20:26
I can't reproduce it under Mandriva Linux on an x86-64 machine.
msg78386 - Author: Lukas Lueg (ebfe) Date: 2008-12-27 20:56
I can't reproduce the problem here.

Python 2.5.2 running on Linux lueg-desktop 2.6.24-22-generic #1 SMP Mon
Nov 24 18:32:42 UTC 2008 i686 GNU/Linux
msg78438 - Author: Andrew (ajg) Date: 2008-12-29 04:11
Cannot reproduce this on RHEL 4.  So far only RHEL 5.x seems to be affected.
msg78450 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-12-29 12:22
I think you should report the bug to Red Hat and see what they have to
say about it. It may be a bug in the libc of that particular version. In
any case I think it is highly unlikely to be a bug in Python itself.
msg78623 - Author: Andrew (ajg) Date: 2008-12-31 16:24
This problem appears to be specific to RHEL 5, and is not a Python
problem.  Linking against Google malloc (libtcmalloc) fixes the issue.

This bug should be closed.
msg78624 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-12-31 16:25
Ok, thanks for the investigation!
msg81218 - Author: Farshad Khoshkhui (farshad) Date: 2009-02-05 16:52
This happens for me on several Debian and Ubuntu machines with Python
2.5 as well as 2.6, as I reported in #571885. I'll try your script and
linking with tcmalloc, and get back with the results.
msg81219 - Author: Farshad Khoshkhui (farshad) Date: 2009-02-05 16:55
Sorry, wrong issue number. The correct one is #4358.
History
Date User Action Args
2009-02-05 16:55:15 farshad set messages: + msg81219
2009-02-05 16:52:39 farshad set nosy: + farshad; messages: + msg81218
2008-12-31 16:25:12 pitrou set status: open -> closed; resolution: not a bug; messages: + msg78624
2008-12-31 16:24:19 ajg set messages: + msg78623
2008-12-29 12:22:26 pitrou set messages: + msg78450
2008-12-29 04:11:39 ajg set messages: + msg78438
2008-12-27 20:56:21 ebfe set nosy: + ebfe; messages: + msg78386
2008-12-27 20:26:15 pitrou set nosy: + pitrou; messages: + msg78384
2008-12-23 20:12:31 ajg create