classification
Title: segfault in for loop with evil iterator
Type: crash Stage:
Components: Interpreter Core Versions: Python 2.6
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: ajaksu2, amaury.forgeotdarc, arigo, benjamin.peterson, donmez, gideon, gregory.p.smith, gvanrossum, haypo, jcea, rhettinger, terry.reedy
Priority: normal Keywords: 26backport

Created on 2008-08-29 04:19 by gideon, last changed 2010-11-12 05:26 by terry.reedy. This issue is now closed.

Files
File name Uploaded Description Edit
baditerator.py gideon, 2008-08-29 04:19 example of iterator crashing in a for loop
baditer.patch amaury.forgeotdarc, 2008-08-29 13:58
baditer-2.6.patch amaury.forgeotdarc, 2008-08-29 13:59
itercrashers.diff ajaksu2, 2008-08-29 23:17 Fix Armin's examples
next-nevernull-2.patch amaury.forgeotdarc, 2008-09-02 16:54
next-nevernull-26.patch amaury.forgeotdarc, 2009-01-13 00:47 binary-compatible patch for 2.6
Messages (29)
msg72119 - (view) Author: Gideon Smeding (gideon) Date: 2008-08-29 04:19
The attached example crashes python. The crash is caused by an evil 
iterator that removes its own next function.
msg72130 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2008-08-29 12:34
Good catch! Here is a patch for 3.0.
msg72133 - (view) Author: Ismail Donmez (donmez) * Date: 2008-08-29 13:33
Maybe do a s/"object is no more an iterator"/"is no longer an iterator"
msg72135 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2008-08-29 13:59
New patches, with the suggested spelling.
For 3.0 and 2.6.
msg72148 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2008-08-29 18:33
It would be a damned shame to slow down the entire language to save 
code that is intentionally shooting itself in the foot.  EVERYONE will 
pay a cost -- NO ONE will benefit.  My two cents.
msg72155 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2008-08-29 20:18
Here are some timings, on winXP, vs2008 release build:

# t.py
def f(l=range(5000)): 
    for x in l: pass

# before the patch
> python -m timeit -s "from t import f" "f()"
10000 loops, best of 3: 159 usec per loop

# after the patch
> python -m timeit -s "from t import f" "f()"
10000 loops, best of 3: 160 usec per loop

and these figures are pretty stable on my machine.
Is it too much to pay? Some may consider that potential crashes are more
expensive.
msg72158 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2008-08-29 20:34
Go for it.  There's really no question about fixing possible 
segfaulters.  Was just voicing my displeasure at this sort of 
exercise.  The code has been around since at least 2.2 and hasn't 
caused the slightest problem.  It bugs me to put cruft in the middle of 
otherwise tight loops.  Fortunately, the check is cheap and branch 
predictable.

BTW, your timings are domained by the function call and the range(5000) 
step.  To cleanly time for-loop overhead, use:

   python -m timeit "pass"
msg72159 - (view) Author: Armin Rigo (arigo) * (Python committer) Date: 2008-08-29 21:23
The same approach can be used to segfault many more places.  See
http://svn.python.org/projects/python/trunk/Lib/test/crashers/iter.py .
msg72171 - (view) Author: Daniel Diniz (ajaksu2) Date: 2008-08-29 23:11
This patch fixes Armin's list of crashers for trunk. Looking for others
like them.
msg72172 - (view) Author: Daniel Diniz (ajaksu2) Date: 2008-08-29 23:17
Hopefully the right patch this time :/
msg72173 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2008-08-29 23:38
Amaury, if you decide to go forward with this, please clean-up the 
patches to eliminate common subexpressions. 

Wonder if there is an alternate solution that keeps the next slot 
filled with something that raises an Attribute error.
msg72175 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2008-08-30 00:03
I share Raymond's annoyance.  The ultimate solution for segfaults is for
bad pointer references to be catchable (trappable) the same as math
errors are are now.  I remember when 1/0 was fatal, not EDOM.  Then the
interpreter could print a traceback and SegfaultException message ;-)

For this problem, I think there are alternatives to the current patch,
which as Armin points out, would need to be extended to every built-in
use of iterators.

Is there any (sensible) way to make .__next__ (or the underlying
tp_iternext slot) undelete-able?  We already cannot change methods of
built-ins.  Or replace with def __next__(self): raise StopIteration.

Or could iter() temporarily lock iterators in some way similar to what
happens to dict during iteration?  (Modifying actually does the action,
and then raises RuntimeError).  Modifying an active iterator strikes me
as even less sane than modifying an underlying dict.

The basic problem is that C code (unnecessarily?) does the same pointer
tracing each iteration.  Why not calculate np = *v->ob_type->tp_iternext
just once?  and just (more quickly) call np(v) each iteration?

One could claim this is a semantic change, but so was the change to dicts.  

The equivalent in Python would be

class BadIterator() :
	def __iter__(self) :
		return self
	def __next__(self) : # should be __next__ for python 3.0
		del BadIterator.__next__
		return 1

itnext = BadIterator().__next__
print(itnext())
print(itnext())

The second itnext call only fails because the del statement fails with
AttributeError: __next__.  I presume it would do the same if called from C.

(My annoyance with Py3 changing .next to .__next__ is that it makes it
harder to do the 'right thing' when iterating explicitly, which to me it
to use a bound method.  I wish next(it) returned it.__next__ instead of
calling it.)
msg72177 - (view) Author: Daniel Diniz (ajaksu2) Date: 2008-08-30 00:09
Raymond, I think a different solution would be great, as the performance
penalty might become nasty in tight loops if we miss some detail. 

Regarding the possible impact, I hope we can get a better estimate since
the other examples of segfaulting that look like Armin's I've found are
in itertools. I imagine you have the right tests to check the effect of
any changes there.
msg72180 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2008-08-30 00:21
It's not just performance -- the patch code is grotesque.  Lots of code 
would need to be changed just before the release candidate (usually a 
bad idea).  And the underlying problem doesn't seem to be one that has 
*ever* affected a real user since 2.2.  I have a hard time caring much 
about this problem.  

The minor performance hit only bugs me because it affects the inner-
loops of just about every piece of real python code -- everyone will 
pay a cost for this change.

Since the "problem" has existed so long with no ill effect, am 
unmarking the "release blocker" priority.  Would rather have a 
thoughtful discussion on alternatives along with a careful, thorough, 
efficient patch for a bug release version.
msg72202 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2008-08-30 22:55
> Amaury, if you decide to go forward with this, please clean-up the 
> patches to eliminate common subexpressions. 
I already considered this, but generated machine code was identical, so
I chose the more readable code.

> Wonder if there is an alternate solution that keeps the next slot 
> filled with something that raises an Attribute error.

That's an interesting idea, and not difficult to implement, see attached
patch. The penalty here is an extra comparison in PyIter_Check(). And no
change in the event loop...
Armin's crashers now correctly raise a TypeError (and the final patch
should convert them as unit tests)
msg72203 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2008-08-31 03:09
It was the ajaksu2 patches that needed clean-up.
msg72204 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2008-08-31 03:20
The next-nevernull patch is much cleaner than I expected.  Nice work.

The assertion in abstract.c can be changed to:
   assert(iter==PyObject_NextNotImplemented || PyIter_Check(iter));

Armin, are you happy with the new approach?  Though "del obj.next" 
isn't doing exactly what you would expect, that seems harmless to me.
msg72205 - (view) Author: Armin Rigo (arigo) * (Python committer) Date: 2008-08-31 09:21
Hacking at typeobject.c should always be done extremely carefully.  I
don't have time to review this patch as thouroughly as I think it needs
to be.  (A quick issue is that it seems to break PyIter_Check() which
will always return true now, but I'd recommend a much more thourough
review.)
msg72301 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2008-09-01 22:01
Did you notice that the definition of PyIter_Check() also changed?

>>> class T(object):
...   def __iter__(self): return self
...
>>> iter(T())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: iter() returned non-iterator of type 'T'

Or did you refer to .so extensions modules that are not recompiled
between 2.5 and 2.6? (I don't know if this still works)
msg72349 - (view) Author: Armin Rigo (arigo) * (Python committer) Date: 2008-09-02 16:07
Maybe the file 'next-nevernull.patch' is not complete?  I don't see any
change in the definition of PyIter_Check().
msg72351 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2008-09-02 16:54
Oops, sorry. I have too many files opened at the same time.
Here is an updated patch.

I removed all the "assert(PyIter_Check(it))", our evil iterator used to
make them fail anyway in debug mode; and in the case of a null function
pointer, the C stack gives the same information.
msg79693 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2009-01-12 18:46
I think PyObject_NextNotImplemented should be renamed to
_PyObject_NextNotImplemented. Aside from that, I think the patch is
ready for committing.
msg79709 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2009-01-13 00:00
Any crash is a potential security problem.

This needs to be fixed in 2.6.x without breaking 2.6.0 extension 
compatibility.  (requiring extensions to be recompiled to fix the 
problem for an extension is fine, but they still need to load and work 
normally even if they have not been)

As trunk r68560 can't be backported?  what will?  Is there an 
appropriate pre-existing hackish substitute for PyObj_NextNotImplemented 
or will it require adding a test anytime the tp_nextiter slot is used?
msg79710 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2009-01-13 00:03
Fixed in r68560 (trunk) and r68561 (py3k).

Difficult to backport: extensions compiled with 2.6.x would not run on 
2.6.0.
msg79711 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2009-01-13 00:47
Yes, a hack: What about setting tp_iternext to PyObject_GetIter? they happen 
to have the same signature.
Yes, calling next() will call iter() instead; but an iterator is often its 
own iterator, and more importantly, PyIter_Check() is also called.

And the error message is surprisingly almost correct:
  "TypeError: iter() returned non-iterator of type 'BadIterator'"
It's not completely correct since the error occurred while calling the 
__next__() method of the iterator.

See attached patch for 2.6.

I can see one cause of incompatibility: if someone designed an extension 
type in C where tp_iternext is already PyObject_GetIter. It's is insane but 
valid, and the patch would break it. It's not worth the trouble I suppose...
msg79717 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2009-01-13 03:49
nice hack! :)

I'm going to guess that existing code in the wild setting tp_iternext = 
&PyObject_GetIter is rare.  I certainly can not rule it entirely out but 
I don't see anything in the open source world using 
http://www.google.com/codesearch

I'd be okay with this hack.  Let the release manager and/or BDFL make 
this call.
msg86204 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2009-04-20 19:51
I'm okay with that hack for 2.6 and 2.5.
msg104864 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2010-05-03 18:05
The fix works for 3.1.2, giving
TypeError: iter() returned non-iterator of type 'BadIterator'

Too late to patch 2.5. Is there any intention of patching 2.6 before its final bug-fix release, in a couple of months?
msg121020 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2010-11-12 05:26
Too late for 2.6.6 ;-)
History
Date User Action Args
2010-11-12 05:26:19terry.reedysetstatus: open -> closed

messages: + msg121020
2010-05-03 18:05:08terry.reedysetkeywords: - patch, needs review

messages: + msg104864
versions: - Python 2.5, Python 3.0
2010-05-03 06:06:20belopolskysetkeywords: + 26backport
2009-04-20 19:51:11gvanrossumsetnosy: + gvanrossum
messages: + msg86204
2009-01-13 03:49:49gregory.p.smithsetmessages: + msg79717
2009-01-13 00:47:52amaury.forgeotdarcsetstatus: pending -> open
files: + next-nevernull-26.patch
messages: + msg79711
keywords: + needs review
2009-01-13 00:03:34amaury.forgeotdarcsetstatus: open -> pending
keywords: - needs review
messages: + msg79710
resolution: fixed
2009-01-13 00:00:25gregory.p.smithsetnosy: + gregory.p.smith
messages: + msg79709
components: + Interpreter Core
2009-01-12 18:46:23benjamin.petersonsetnosy: + benjamin.peterson
messages: + msg79693
2009-01-09 17:38:08benjamin.petersonlinkissue4897 superseder
2008-11-10 11:18:59hayposetnosy: + haypo
2008-09-16 18:30:51jceasetnosy: + jcea
2008-09-02 16:54:45amaury.forgeotdarcsetfiles: - next-nevernull.patch
2008-09-02 16:54:38amaury.forgeotdarcsetfiles: + next-nevernull-2.patch
messages: + msg72351
2008-09-02 16:07:03arigosetmessages: + msg72349
2008-09-01 22:01:40amaury.forgeotdarcsetmessages: + msg72301
2008-08-31 09:21:33arigosetmessages: + msg72205
2008-08-31 03:20:24rhettingersetmessages: + msg72204
2008-08-31 03:09:45rhettingersetmessages: + msg72203
2008-08-30 22:55:39amaury.forgeotdarcsetfiles: + next-nevernull.patch
messages: + msg72202
2008-08-30 00:21:38rhettingersetpriority: release blocker -> normal
messages: + msg72180
2008-08-30 00:09:11ajaksu2setmessages: + msg72177
2008-08-30 00:03:40terry.reedysetnosy: + terry.reedy
messages: + msg72175
2008-08-29 23:38:51rhettingersetmessages: + msg72173
2008-08-29 23:17:47ajaksu2setfiles: + itercrashers.diff
messages: + msg72172
2008-08-29 23:16:20ajaksu2setfiles: - itercrashers.diff
2008-08-29 23:11:18ajaksu2setfiles: + itercrashers.diff
nosy: + ajaksu2
messages: + msg72171
versions: + Python 2.6
2008-08-29 21:23:13arigosetnosy: + arigo
messages: + msg72159
2008-08-29 20:34:42rhettingersetmessages: + msg72158
2008-08-29 20:18:05amaury.forgeotdarcsetmessages: + msg72155
2008-08-29 18:33:20rhettingersetnosy: + rhettinger
messages: + msg72148
2008-08-29 13:59:38amaury.forgeotdarcsetfiles: + baditer-2.6.patch
messages: + msg72135
2008-08-29 13:58:39amaury.forgeotdarcsetfiles: + baditer.patch
2008-08-29 13:58:22amaury.forgeotdarcsetfiles: - baditer.patch
2008-08-29 13:33:07donmezsetnosy: + donmez
messages: + msg72133
2008-08-29 12:34:30amaury.forgeotdarcsetfiles: + baditer.patch
priority: release blocker
messages: + msg72130
keywords: + needs review, patch
nosy: + amaury.forgeotdarc
2008-08-29 04:19:31gideoncreate