msg72119 - (view) |
Author: Gideon Smeding (gideon) |
Date: 2008-08-29 04:19 |
The attached example crashes python. The crash is caused by an evil
iterator that removes its own next function.
|
msg72130 - (view) |
Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * |
Date: 2008-08-29 12:34 |
Good catch! Here is a patch for 3.0.
|
msg72133 - (view) |
Author: Ismail Donmez (donmez) * |
Date: 2008-08-29 13:33 |
Maybe do a s/"object is no more an iterator"/"is no longer an iterator"
|
msg72135 - (view) |
Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * |
Date: 2008-08-29 13:59 |
New patches, with the suggested spelling.
For 3.0 and 2.6.
|
msg72148 - (view) |
Author: Raymond Hettinger (rhettinger) * |
Date: 2008-08-29 18:33 |
It would be a damned shame to slow down the entire language to save
code that is intentionally shooting itself in the foot. EVERYONE will
pay a cost -- NO ONE will benefit. My two cents.
|
msg72155 - (view) |
Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * |
Date: 2008-08-29 20:18 |
Here are some timings, on winXP, vs2008 release build:
# t.py
def f(l=range(5000)):
for x in l: pass
# before the patch
> python -m timeit -s "from t import f" "f()"
10000 loops, best of 3: 159 usec per loop
# after the patch
> python -m timeit -s "from t import f" "f()"
10000 loops, best of 3: 160 usec per loop
and these figures are pretty stable on my machine.
Is it too much to pay? Some may consider that potential crashes are more
expensive.
|
msg72158 - (view) |
Author: Raymond Hettinger (rhettinger) * |
Date: 2008-08-29 20:34 |
Go for it. There's really no question about fixing possible
segfaulters. Was just voicing my displeasure at this sort of
exercise. The code has been around since at least 2.2 and hasn't
caused the slightest problem. It bugs me to put cruft in the middle of
otherwise tight loops. Fortunately, the check is cheap and branch
predictable.
BTW, your timings are domained by the function call and the range(5000)
step. To cleanly time for-loop overhead, use:
python -m timeit "pass"
|
msg72159 - (view) |
Author: Armin Rigo (arigo) * |
Date: 2008-08-29 21:23 |
The same approach can be used to segfault many more places. See
http://svn.python.org/projects/python/trunk/Lib/test/crashers/iter.py .
|
msg72171 - (view) |
Author: Daniel Diniz (ajaksu2) * |
Date: 2008-08-29 23:11 |
This patch fixes Armin's list of crashers for trunk. Looking for others
like them.
|
msg72172 - (view) |
Author: Daniel Diniz (ajaksu2) * |
Date: 2008-08-29 23:17 |
Hopefully the right patch this time :/
|
msg72173 - (view) |
Author: Raymond Hettinger (rhettinger) * |
Date: 2008-08-29 23:38 |
Amaury, if you decide to go forward with this, please clean-up the
patches to eliminate common subexpressions.
Wonder if there is an alternate solution that keeps the next slot
filled with something that raises an Attribute error.
|
msg72175 - (view) |
Author: Terry J. Reedy (terry.reedy) * |
Date: 2008-08-30 00:03 |
I share Raymond's annoyance. The ultimate solution for segfaults is for
bad pointer references to be catchable (trappable) the same as math
errors are are now. I remember when 1/0 was fatal, not EDOM. Then the
interpreter could print a traceback and SegfaultException message ;-)
For this problem, I think there are alternatives to the current patch,
which as Armin points out, would need to be extended to every built-in
use of iterators.
Is there any (sensible) way to make .__next__ (or the underlying
tp_iternext slot) undelete-able? We already cannot change methods of
built-ins. Or replace with def __next__(self): raise StopIteration.
Or could iter() temporarily lock iterators in some way similar to what
happens to dict during iteration? (Modifying actually does the action,
and then raises RuntimeError). Modifying an active iterator strikes me
as even less sane than modifying an underlying dict.
The basic problem is that C code (unnecessarily?) does the same pointer
tracing each iteration. Why not calculate np = *v->ob_type->tp_iternext
just once? and just (more quickly) call np(v) each iteration?
One could claim this is a semantic change, but so was the change to dicts.
The equivalent in Python would be
class BadIterator() :
def __iter__(self) :
return self
def __next__(self) : # should be __next__ for python 3.0
del BadIterator.__next__
return 1
itnext = BadIterator().__next__
print(itnext())
print(itnext())
The second itnext call only fails because the del statement fails with
AttributeError: __next__. I presume it would do the same if called from C.
(My annoyance with Py3 changing .next to .__next__ is that it makes it
harder to do the 'right thing' when iterating explicitly, which to me it
to use a bound method. I wish next(it) returned it.__next__ instead of
calling it.)
|
msg72177 - (view) |
Author: Daniel Diniz (ajaksu2) * |
Date: 2008-08-30 00:09 |
Raymond, I think a different solution would be great, as the performance
penalty might become nasty in tight loops if we miss some detail.
Regarding the possible impact, I hope we can get a better estimate since
the other examples of segfaulting that look like Armin's I've found are
in itertools. I imagine you have the right tests to check the effect of
any changes there.
|
msg72180 - (view) |
Author: Raymond Hettinger (rhettinger) * |
Date: 2008-08-30 00:21 |
It's not just performance -- the patch code is grotesque. Lots of code
would need to be changed just before the release candidate (usually a
bad idea). And the underlying problem doesn't seem to be one that has
*ever* affected a real user since 2.2. I have a hard time caring much
about this problem.
The minor performance hit only bugs me because it affects the inner-
loops of just about every piece of real python code -- everyone will
pay a cost for this change.
Since the "problem" has existed so long with no ill effect, am
unmarking the "release blocker" priority. Would rather have a
thoughtful discussion on alternatives along with a careful, thorough,
efficient patch for a bug release version.
|
msg72202 - (view) |
Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * |
Date: 2008-08-30 22:55 |
> Amaury, if you decide to go forward with this, please clean-up the
> patches to eliminate common subexpressions.
I already considered this, but generated machine code was identical, so
I chose the more readable code.
> Wonder if there is an alternate solution that keeps the next slot
> filled with something that raises an Attribute error.
That's an interesting idea, and not difficult to implement, see attached
patch. The penalty here is an extra comparison in PyIter_Check(). And no
change in the event loop...
Armin's crashers now correctly raise a TypeError (and the final patch
should convert them as unit tests)
|
msg72203 - (view) |
Author: Raymond Hettinger (rhettinger) * |
Date: 2008-08-31 03:09 |
It was the ajaksu2 patches that needed clean-up.
|
msg72204 - (view) |
Author: Raymond Hettinger (rhettinger) * |
Date: 2008-08-31 03:20 |
The next-nevernull patch is much cleaner than I expected. Nice work.
The assertion in abstract.c can be changed to:
assert(iter==PyObject_NextNotImplemented || PyIter_Check(iter));
Armin, are you happy with the new approach? Though "del obj.next"
isn't doing exactly what you would expect, that seems harmless to me.
|
msg72205 - (view) |
Author: Armin Rigo (arigo) * |
Date: 2008-08-31 09:21 |
Hacking at typeobject.c should always be done extremely carefully. I
don't have time to review this patch as thouroughly as I think it needs
to be. (A quick issue is that it seems to break PyIter_Check() which
will always return true now, but I'd recommend a much more thourough
review.)
|
msg72301 - (view) |
Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * |
Date: 2008-09-01 22:01 |
Did you notice that the definition of PyIter_Check() also changed?
>>> class T(object):
... def __iter__(self): return self
...
>>> iter(T())
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: iter() returned non-iterator of type 'T'
Or did you refer to .so extensions modules that are not recompiled
between 2.5 and 2.6? (I don't know if this still works)
|
msg72349 - (view) |
Author: Armin Rigo (arigo) * |
Date: 2008-09-02 16:07 |
Maybe the file 'next-nevernull.patch' is not complete? I don't see any
change in the definition of PyIter_Check().
|
msg72351 - (view) |
Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * |
Date: 2008-09-02 16:54 |
Oops, sorry. I have too many files opened at the same time.
Here is an updated patch.
I removed all the "assert(PyIter_Check(it))", our evil iterator used to
make them fail anyway in debug mode; and in the case of a null function
pointer, the C stack gives the same information.
|
msg79693 - (view) |
Author: Benjamin Peterson (benjamin.peterson) * |
Date: 2009-01-12 18:46 |
I think PyObject_NextNotImplemented should be renamed to
_PyObject_NextNotImplemented. Aside from that, I think the patch is
ready for committing.
|
msg79709 - (view) |
Author: Gregory P. Smith (gregory.p.smith) * |
Date: 2009-01-13 00:00 |
Any crash is a potential security problem.
This needs to be fixed in 2.6.x without breaking 2.6.0 extension
compatibility. (requiring extensions to be recompiled to fix the
problem for an extension is fine, but they still need to load and work
normally even if they have not been)
As trunk r68560 can't be backported? what will? Is there an
appropriate pre-existing hackish substitute for PyObj_NextNotImplemented
or will it require adding a test anytime the tp_nextiter slot is used?
|
msg79710 - (view) |
Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * |
Date: 2009-01-13 00:03 |
Fixed in r68560 (trunk) and r68561 (py3k).
Difficult to backport: extensions compiled with 2.6.x would not run on
2.6.0.
|
msg79711 - (view) |
Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * |
Date: 2009-01-13 00:47 |
Yes, a hack: What about setting tp_iternext to PyObject_GetIter? they happen
to have the same signature.
Yes, calling next() will call iter() instead; but an iterator is often its
own iterator, and more importantly, PyIter_Check() is also called.
And the error message is surprisingly almost correct:
"TypeError: iter() returned non-iterator of type 'BadIterator'"
It's not completely correct since the error occurred while calling the
__next__() method of the iterator.
See attached patch for 2.6.
I can see one cause of incompatibility: if someone designed an extension
type in C where tp_iternext is already PyObject_GetIter. It's is insane but
valid, and the patch would break it. It's not worth the trouble I suppose...
|
msg79717 - (view) |
Author: Gregory P. Smith (gregory.p.smith) * |
Date: 2009-01-13 03:49 |
nice hack! :)
I'm going to guess that existing code in the wild setting tp_iternext =
&PyObject_GetIter is rare. I certainly can not rule it entirely out but
I don't see anything in the open source world using
http://www.google.com/codesearch
I'd be okay with this hack. Let the release manager and/or BDFL make
this call.
|
msg86204 - (view) |
Author: Guido van Rossum (gvanrossum) * |
Date: 2009-04-20 19:51 |
I'm okay with that hack for 2.6 and 2.5.
|
msg104864 - (view) |
Author: Terry J. Reedy (terry.reedy) * |
Date: 2010-05-03 18:05 |
The fix works for 3.1.2, giving
TypeError: iter() returned non-iterator of type 'BadIterator'
Too late to patch 2.5. Is there any intention of patching 2.6 before its final bug-fix release, in a couple of months?
|
msg121020 - (view) |
Author: Terry J. Reedy (terry.reedy) * |
Date: 2010-11-12 05:26 |
Too late for 2.6.6 ;-)
|
|
Date |
User |
Action |
Args |
2022-04-11 14:56:38 | admin | set | github: 47970 |
2010-11-12 05:26:19 | terry.reedy | set | status: open -> closed
messages:
+ msg121020 |
2010-05-03 18:05:08 | terry.reedy | set | keywords:
- patch, needs review
messages:
+ msg104864 versions:
- Python 2.5, Python 3.0 |
2010-05-03 06:06:20 | belopolsky | set | keywords:
+ 26backport |
2009-04-20 19:51:11 | gvanrossum | set | nosy:
+ gvanrossum messages:
+ msg86204
|
2009-01-13 03:49:49 | gregory.p.smith | set | messages:
+ msg79717 |
2009-01-13 00:47:52 | amaury.forgeotdarc | set | status: pending -> open files:
+ next-nevernull-26.patch messages:
+ msg79711 keywords:
+ needs review |
2009-01-13 00:03:34 | amaury.forgeotdarc | set | status: open -> pending keywords:
- needs review messages:
+ msg79710 resolution: fixed |
2009-01-13 00:00:25 | gregory.p.smith | set | nosy:
+ gregory.p.smith messages:
+ msg79709 components:
+ Interpreter Core |
2009-01-12 18:46:23 | benjamin.peterson | set | nosy:
+ benjamin.peterson messages:
+ msg79693 |
2009-01-09 17:38:08 | benjamin.peterson | link | issue4897 superseder |
2008-11-10 11:18:59 | vstinner | set | nosy:
+ vstinner |
2008-09-16 18:30:51 | jcea | set | nosy:
+ jcea |
2008-09-02 16:54:45 | amaury.forgeotdarc | set | files:
- next-nevernull.patch |
2008-09-02 16:54:38 | amaury.forgeotdarc | set | files:
+ next-nevernull-2.patch messages:
+ msg72351 |
2008-09-02 16:07:03 | arigo | set | messages:
+ msg72349 |
2008-09-01 22:01:40 | amaury.forgeotdarc | set | messages:
+ msg72301 |
2008-08-31 09:21:33 | arigo | set | messages:
+ msg72205 |
2008-08-31 03:20:24 | rhettinger | set | messages:
+ msg72204 |
2008-08-31 03:09:45 | rhettinger | set | messages:
+ msg72203 |
2008-08-30 22:55:39 | amaury.forgeotdarc | set | files:
+ next-nevernull.patch messages:
+ msg72202 |
2008-08-30 00:21:38 | rhettinger | set | priority: release blocker -> normal messages:
+ msg72180 |
2008-08-30 00:09:11 | ajaksu2 | set | messages:
+ msg72177 |
2008-08-30 00:03:40 | terry.reedy | set | nosy:
+ terry.reedy messages:
+ msg72175 |
2008-08-29 23:38:51 | rhettinger | set | messages:
+ msg72173 |
2008-08-29 23:17:47 | ajaksu2 | set | files:
+ itercrashers.diff messages:
+ msg72172 |
2008-08-29 23:16:20 | ajaksu2 | set | files:
- itercrashers.diff |
2008-08-29 23:11:18 | ajaksu2 | set | files:
+ itercrashers.diff nosy:
+ ajaksu2 messages:
+ msg72171 versions:
+ Python 2.6 |
2008-08-29 21:23:13 | arigo | set | nosy:
+ arigo messages:
+ msg72159 |
2008-08-29 20:34:42 | rhettinger | set | messages:
+ msg72158 |
2008-08-29 20:18:05 | amaury.forgeotdarc | set | messages:
+ msg72155 |
2008-08-29 18:33:20 | rhettinger | set | nosy:
+ rhettinger messages:
+ msg72148 |
2008-08-29 13:59:38 | amaury.forgeotdarc | set | files:
+ baditer-2.6.patch messages:
+ msg72135 |
2008-08-29 13:58:39 | amaury.forgeotdarc | set | files:
+ baditer.patch |
2008-08-29 13:58:22 | amaury.forgeotdarc | set | files:
- baditer.patch |
2008-08-29 13:33:07 | donmez | set | nosy:
+ donmez messages:
+ msg72133 |
2008-08-29 12:34:30 | amaury.forgeotdarc | set | files:
+ baditer.patch priority: release blocker messages:
+ msg72130 keywords:
+ needs review, patch nosy:
+ amaury.forgeotdarc |
2008-08-29 04:19:31 | gideon | create | |