This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: str.replace causes segfault for long strings
Type: crash Stage: resolved
Components: Interpreter Core Versions: Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: christian.heimes, python-dev, ronaldoussoren, serhiy.storchaka, tobiasmarschall
Priority: normal Keywords: needs review, patch

Created on 2013-07-11 09:40 by tobiasmarschall, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
issue-18427-py27.txt ronaldoussoren, 2013-07-11 10:50
Messages (12)
msg192855 - (view) Author: Tobias Marschall (tobiasmarschall) Date: 2013-07-11 09:40
Running the following two lines of code causes a segfault:

s = 'A' * 3142044943 
t = s.replace('\n','')

My setup:
Python 2.7.5 (default, Jul 11 2013, 11:17:50) 
[GCC 4.6.3] on linux2
Linux 3.2.0-45-generic #70-Ubuntu SMP Wed May 29 20:12:06 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

I could reproduce this behavior on Python 2.6.8 and 2.5.6.

Please let me know if I can provide additional information.
msg192858 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2013-07-11 10:20
#0  memchr () at ../sysdeps/x86_64/memchr.S:155
#1  0x0000000000467ca4 in countchar (maxcount=<optimized out>, c=10 '\n', target_len=-1152922353, target=0x7fff3b196034 'A' <repeats 200 times>...)
    at Objects/stringobject.c:2341
#2  replace_delete_single_character (maxcount=<optimized out>, from_c=10 '\n', self=0x7fff3b196010) at Objects/stringobject.c:2427
#3  replace (maxcount=<optimized out>, to_len=<optimized out>, to_s=<optimized out>, from_len=1, from_s=<optimized out>, self=0x7fff3b196010)
    at Objects/stringobject.c:2780
#4  string_replace (self=0x7fff3b196010, args=<optimized out>) at Objects/stringobject.c:2854

target_len=-1152922353 looks fishy. I think countchar()'s target_len argument has a wrong argument type. It should be Py_ssize_t instead of int.

-countchar(const char *target, int target_len, char c, Py_ssize_t maxcount)
+countchar(const char *target, Py_ssize_t target_len, char c, Py_ssize_t maxcount)
msg192859 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2013-07-11 10:21
PS: The test case no longer segfaults with Py_ssize_t.
msg192861 - (view) Author: Ronald Oussoren (ronaldoussoren) * (Python committer) Date: 2013-07-11 10:39
I agree: target_len should have type Py_ssize_t.

It's probably worthwhile to check other functions as well.
msg192862 - (view) Author: Ronald Oussoren (ronaldoussoren) * (Python committer) Date: 2013-07-11 10:45
There is also this one in string_print:

        fwrite(data, 1, (int)size, fp);

"size" is a Py_ssize_t variable with the length of the string, and is casted to int which looses information. The exected argument to fwrite is size_t...

These two appear to be the only suspect uses of 'int' in stringobject.h.
msg192863 - (view) Author: Ronald Oussoren (ronaldoussoren) * (Python committer) Date: 2013-07-11 10:50
The attached patch should fix both issues (but doesn't contain a test, not sure if its worthwhile to add a testcase that uses more than 4 GByte of memory)

FWIW: the corresponding code in Objects/bytesobject.c in the 3.3 and default branches is already correct.
msg192865 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2013-07-11 11:05
LGTM!
msg192866 - (view) Author: Ronald Oussoren (ronaldoussoren) * (Python committer) Date: 2013-07-11 11:16
Great. I'll apply after running the full testsuite, and after rebooting my machine as it doesn't really like using that much additional memory and pushed some applications to disk :-(
msg192867 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2013-07-11 11:35
New changeset 2921f6c2009e by Ronald Oussoren in branch '2.7':
Issue #18427: str.replace could crash the interpreter with huge strings.
http://hg.python.org/cpython/rev/2921f6c2009e
msg192892 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-07-11 19:47
I think NEWS entry for this issue should be in the "Core and Builtins" section.
msg193117 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2013-07-15 16:35
New changeset 06d9f106c57e by Ronald Oussoren in branch '2.7':
Move entry from #18427 to the right section in the NEWS file
http://hg.python.org/cpython/rev/06d9f106c57e
msg193118 - (view) Author: Ronald Oussoren (ronaldoussoren) * (Python committer) Date: 2013-07-15 16:36
Serhiy: you're right. I've moved the message to the correct section.
History
Date User Action Args
2022-04-11 14:57:47adminsetgithub: 62627
2013-07-15 16:36:35ronaldoussorensetmessages: + msg193118
2013-07-15 16:35:57python-devsetmessages: + msg193117
2013-07-11 19:47:24serhiy.storchakasetmessages: + msg192892
2013-07-11 11:35:53ronaldoussorensetstatus: open -> closed
resolution: fixed
stage: patch review -> resolved
2013-07-11 11:35:28python-devsetnosy: + python-dev
messages: + msg192867
2013-07-11 11:16:52ronaldoussorensetmessages: + msg192866
2013-07-11 11:05:31christian.heimessetmessages: + msg192865
2013-07-11 10:50:11ronaldoussorensetkeywords: + patch, needs review
files: + issue-18427-py27.txt
messages: + msg192863

stage: patch review
2013-07-11 10:45:39ronaldoussorensetmessages: + msg192862
2013-07-11 10:39:30ronaldoussorensetnosy: + ronaldoussoren
messages: + msg192861
2013-07-11 10:21:23christian.heimessetmessages: + msg192859
2013-07-11 10:20:30christian.heimessetnosy: + christian.heimes
messages: + msg192858
2013-07-11 10:06:09serhiy.storchakasetnosy: + serhiy.storchaka
2013-07-11 09:40:39tobiasmarschallcreate