This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: list(xrange(1e9)) --> seg fault
Type: Stage:
Components: Interpreter Core Versions:
process
Status: closed Resolution: accepted
Dependencies: Superseder:
Assigned To: jlt63 Nosy List: holdenweb, jlt63, nnorwitz, rhettinger, tim.peters
Priority: normal Keywords:

Created on 2002-05-14 16:41 by rhettinger, last changed 2022-04-10 16:05 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
list-patch nnorwitz, 2002-05-17 21:03 listobject.c patch to raise an exception if try to resize too much
test_nresize.diff rhettinger, 2002-05-18 00:12 patch to regression test bank
list-patch nnorwitz, 2002-05-20 15:43 updated patch for fix & test
test_b1.py.diff jlt63, 2002-08-12 11:26
Messages (26)
msg10797 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2002-05-14 16:41
From c.lang.py:

'''
 Python 2.2.1 (#2, Apr 21 2002, 22:22:55) 
[GCC 2.96 20000731 (Mandrake Linux 8.1 2.96-0.62mdk)] 
on linux2
Type "help", "copyright", "credits" or "license" for 
more information.
>>> list(xrange(1e9))
Segmentation fault
'''

I've reproduced the same fault on Windows ME using 
Py2.2.0 and Py2.3a.
msg10798 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2002-05-14 22:15
Logged In: YES 
user_id=31435

Heh.  This is an instance of a general flaw:  the 
PyMem_RESIZE macro doesn't check for int overflow in its

    (n) * sizeof(type)

subexpression. The basic deal is that 1000000000 fits in an 
int, but 4 times that (silently) overflows.  In more 
detail, for this specific test case, listobject.c'.s 
roundupsize rounds 1e9 up to 0x40000000, which silently 
underflows to 0!() when PyMem_RESIZE multiplies it by 4.

Hard to know how to fix this in general; PyMem_RESIZE 
callers don't generally worry about overflow now.
msg10799 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2002-05-15 00:44
Logged In: YES 
user_id=80475

Would there be some merit to converting PyMem_RESIZE to a 
function with an overflow check?  

I would guess that the time spent on a realloc dwarfs the 
overhead of a short wrapper function.

OTOH, I'll bet this core dump only occurs in toy examples 
and never in real code.

msg10800 - (view) Author: Neal Norwitz (nnorwitz) * (Python committer) Date: 2002-05-17 20:22
Logged In: YES 
user_id=33168

Note, this problem is even more generic in CVS version:

>>> range(1.1)
Segmentation fault (core dumped)
>>> xrange(1.1)
Segmentation fault (core dumped)

[x]xrange(float) work fine in 2.2.1.
msg10801 - (view) Author: Neal Norwitz (nnorwitz) * (Python committer) Date: 2002-05-17 20:25
Logged In: YES 
user_id=33168

Oops, sorry about that last comment.  That was something I
was playing with.  The CVS version is fine for [x]range(float).
msg10802 - (view) Author: Neal Norwitz (nnorwitz) * (Python committer) Date: 2002-05-17 21:03
Logged In: YES 
user_id=33168

Ok, this time I have a patch.  The patch only fixes listobject.

I looked over the other uses of PyMem_R{ESIZE,ALLOC}() and
they don't appear to be nearly as problematic as list.  For
example, if the grammar has 1e9 nodes, there are going to be
other problems well before then (ie, memory will probably be
gone).
msg10803 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2002-05-18 00:12
Logged In: YES 
user_id=80475

Added a patch to add this bug to the testbank.

Shallow code review:  Patch compiles okay (applied to 
Py2.3a0). Fixes the original problem.  Passes the smell 
test.  Macro style good (only the "var" operand is used 
more than once; no side-effects except setting "var" to 
zero upon a resize error).  Passes the standard regression 
tests.  Passes regression testing versus my personal (real 
code) testbank.

Will give it a deeper look this week-end.

One other thought: should the ValueError be replaced with a 
MemoryError to make the message consistent with PyList_New?

msg10804 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2002-05-20 04:48
Logged In: YES 
user_id=31435

The code patch has some problems:  you can't assume any 
relation between a size_t and an unsigned long; C simply 
doesn't define how big size_t is, and relative sizes do 
vary on 64-bit platforms.  However that gets fixed, if you 
decide it's "too big", var should be set to NULL (not 0 -- 
this is a "Guido thing" <wink>), and no exception should be 
set.  It's the caller's responsibility to check var for 
NULL after the macro is invoked, and set an appropriate 
exception.  listobject.c sometimes doesn't check the result 
for NULL, but that should only be when it knows it's 
*shrinking* a memory area, so that realloc can't fail.
msg10805 - (view) Author: Neal Norwitz (nnorwitz) * (Python committer) Date: 2002-05-20 15:43
Logged In: YES 
user_id=33168

Ok, there were other problems, too:
  * Need to divide by the size of the type, 
    not >> 4 which was completely broken.
  * There was a missing PyErr_NoMemory().

I uploaded a new patch.

I'm not sure the size_t fix is correct.
I hope we can at least assume 8-bit machines: :-)

if (_new_size <= ((1 << (sizeof(size_t)*8 - 1)) / sizeof(type)))

msg10806 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2002-05-21 02:51
Logged In: YES 
user_id=31435

Yup, lookin' better.  Python does assume 8-bit bytes in 
several places, and also 2's-complement integers.  Since 
size_t is guaranteed (by C) to be an unsigned type, the 
largest value of type size_t is more easily expressed as

(~(size_t)0)

The C part of the patch looks fine then.  The test is a 
little dubious:  who says the machine can't create a 
billion-integer list?  The idea that 1e9 necessarily 
overflows in this context is a 32-bit address-space 
assumption.  But I'm willing to delay fixing that until a 
machine with a usable larger address space appears <wink>.

So marked Accepted and assigned to you for checkin.  Thanks!
msg10807 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2002-05-21 03:37
Logged In: YES 
user_id=80475

When you load the fix, please commit the regression test 
patch also.  Thx,  Raymond
msg10808 - (view) Author: Neal Norwitz (nnorwitz) * (Python committer) Date: 2002-05-21 12:52
Logged In: YES 
user_id=33168

How about using sys.maxint / 4?  Does that make more sense
than 1e9?  This assumption is a little better, that the data
and address sizes are the same.  I can add a comment to this
effect.
msg10809 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2002-05-21 12:59
Logged In: YES 
user_id=80475

Good plan!  Thx for squashing this bug. 
msg10810 - (view) Author: Neal Norwitz (nnorwitz) * (Python committer) Date: 2002-05-22 23:20
Logged In: YES 
user_id=33168

Checked in as:
  listobject.c 2.106
  test_b1.py 1.46
msg10811 - (view) Author: Steve Holden (holdenweb) * (Python committer) Date: 2002-08-07 17:28
Logged In: YES 
user_id=88157

I hope re-opening this is the right thing to do (I'm new here).

Current CVS fails under Win2000+Cygwin with a 
segmentation fault in the updated test_b1.py. Easily 
reproduced:

$ ./python.exe
Python 2.3a0 (#1, Aug  7 2002, 12:18:38)
[GCC 2.95.3-5 (cygwin special)] on cygwin
Type "help", "copyright", "credits" or "license" for more 
information.
>>> import sys
>>> list(xrange(sys.maxint/4))
Segmentation fault (core dumped)

This does seem to be size-related, as:

>>> s = sys.maxint/8
>>> list(xrange(s))
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
MemoryError

which is as expected in test_b1.py
msg10812 - (view) Author: Neal Norwitz (nnorwitz) * (Python committer) Date: 2002-08-07 18:20
Logged In: YES 
user_id=33168

Hmmm, Tim can you reproduce this?  I luckily don't have a
windows box. :-)
msg10813 - (view) Author: Neal Norwitz (nnorwitz) * (Python committer) Date: 2002-08-07 20:11
Logged In: YES 
user_id=33168

Actually, I think Jason may be more appropriate, since this
is a cygwin problem.  Jason, can you test/replicate this?
msg10814 - (view) Author: Jason Tishler (jlt63) * (Python triager) Date: 2002-08-08 12:11
Logged In: YES 
user_id=86216

> Jason, can you test/replicate this?

Yes, I've already been working on this one. See
the following mailing list threads for the details:

http://cygwin.com/ml/cygwin-developers/2002-07/msg00124.html
http://sources.redhat.com/ml/newlib/2002/msg00369.html

To summarize the above, the problem is actually
in newlib which provides Cygwin's libc (and
libm).  Unfortunately, Chris Falyor (the Cygwin
lead developer) has not been able to convince
the newlib maintainer to fix this problem.
Additionally, my first patch has been rejected.

I will continue my efforts to get this problem
resolved.  Any assistance will be greatly
appreciated.  I never expected to become an
expert in Doug Lea's malloc routines.  Sigh...
msg10815 - (view) Author: Neal Norwitz (nnorwitz) * (Python committer) Date: 2002-08-09 00:56
Logged In: YES 
user_id=33168

I looked at the links.  I don't know what I can do to help.
 It seems like you proposed a reasonable solution and even
if it wasn't perfect, you still demonstrated a problem.  I
suppose I can only commiserate with you.
msg10816 - (view) Author: Jason Tishler (jlt63) * (Python triager) Date: 2002-08-09 18:09
Logged In: YES 
user_id=86216

Thanks for the sympathy.

I've tried, tried again:

http://sources.redhat.com/ml/newlib/2002/msg00390.html
msg10817 - (view) Author: Jason Tishler (jlt63) * (Python triager) Date: 2002-08-10 01:02
Logged In: YES 
user_id=86216

I guess that my perseverance paid off:

http://sources.redhat.com/ml/newlib/2002/msg00391.html

Hence, this bug will be fixed in Cygwin 1.3.13.

Can we close this bug now? Or, should we wait until
Cygwin 1.3.13 is released?
msg10818 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2002-08-10 03:08
Logged In: YES 
user_id=80475

My thought is to close the bug, but add a unittest that 
says in effect:  if os is cygwin and version(cygwin) >= 
1.3.13 and the bug still exists, then fail with a message 
saying that SF 556025 was not successfully resolved.

This way, we can close the bug (since it is not a python 
bug) and still get a regression test to raise the concern if 
the expected solution either doesn't materialize or 
sometime later dematerializes.
msg10819 - (view) Author: Neal Norwitz (nnorwitz) * (Python committer) Date: 2002-08-10 14:27
Logged In: YES 
user_id=33168

I'm not sure if the test should fail, be skipped, or not run
for cygwin < 1.3.13.  But let's at least put a comment in
the test and close this bug report.  Jason, can you do that?

Good persistence Jason!
msg10820 - (view) Author: Jason Tishler (jlt63) * (Python triager) Date: 2002-08-12 11:26
Logged In: YES 
user_id=86216

rhettinger wrote:
> My thought is to close the bug, but add a
> unittest that says in effect:  if os is cygwin
> and version(cygwin) >= 1.3.13 and the bug still
> exists, then fail with a message saying that SF
> 556025 was not successfully resolved.

Do we really want to add cruft that is not only
Cygwin specific but Cygwin version specific?

nnorwitz wrote:
> I'm not sure if the test should fail, be
> skipped, or not run for cygwin < 1.3.13.

Agreed.

> But let's at least put a comment in the test and
> close this bug report.  Jason, can you do that?

Yes, but I only have pre-approved commit access to
the CVS repository. Can you approve the attached
patch?

> Good persistence Jason!

Thanks.
msg10821 - (view) Author: Neal Norwitz (nnorwitz) * (Python committer) Date: 2002-08-12 13:38
Logged In: YES 
user_id=33168

You should probably add that it fails due to a bug in newlib
and not python.  Go ahead and check it in and close the bug
report.  Thanks.
msg10822 - (view) Author: Jason Tishler (jlt63) * (Python triager) Date: 2002-08-13 11:43
Logged In: YES 
user_id=86216

Committed as Lib/test/test_b1.py 1.51.
History
Date User Action Args
2022-04-10 16:05:19adminsetgithub: 36607
2002-05-14 16:41:03rhettingercreate