msg95159
Author: Armin Rigo (arigo)
Date: 2009-11-12 10:50
The __str__ method of some exception classes reads attributes without
typechecking them. Alternatively, the issue could be that the user is
allowed to set the value of these attributes directly, without a
typecheck. The typechecking is done only when the exception is created,
not later. Example:
>>> u=UnicodeTranslateError(u'x', 1, 5, 'bah')
>>> u.reason = 0x345345345345345345
>>> str(u)
"can't translate characters in position 1-4: E\x03"
The 'E\x03' comes from PyString_AS_STRING(reason). By playing enough it
is probably possible to come up with a real crasher.
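The report above can be probed from Python 3 with a small sketch. It assumes an interpreter that already contains the fix recorded later in this thread (on the affected versions the same steps could return garbage or crash), and it assumes the attribute still accepts arbitrary objects. The helper name `describe` is made up for illustration.

```python
def describe(exc):
    """Return str(exc), or the name of any Python-level exception it raised."""
    try:
        return str(exc)
    except Exception as err:  # a regular exception is acceptable; a crash is not
        return "<%s>" % type(err).__name__


u = UnicodeTranslateError("x", 1, 5, "bah")
weird_reason = 0x345345345345345345

# Assumption: 'reason' is a plain object slot, so this assignment is not typechecked.
u.reason = weird_reason

text = describe(u)
print(text)
```

On a fixed interpreter the printed message should contain the decimal form of the integer rather than raw bytes read from the object's memory.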
|
msg95225
Author: Ezio Melotti (ezio.melotti)
Date: 2009-11-14 02:06
I don't know if this is a real problem. If someone who wants to crash
someone else's program is able to do something like 'u.reason =
somethingweird', there are already more serious problems to solve.
I don't see why someone would want to do that in their own program either.
So, even assuming that PyString_AS_STRING() might indeed crash when some
weird arg is passed, the problem should be fixed there rather than by
typechecking all the args before calling it (i.e. even if you fix it
for Exceptions, there are probably several other places where you can
set arbitrary things that will be passed to PyString_AS_STRING() anyway).
That said, I played with it and tried to set u.reason with a number of
things (including big numbers and strings, Unicode chars outside the
BMP, builtin types, functions, modules) and str(u) either returned an
empty string or a random sequence of bytes (like your 'E\x03'), but it
didn't crash.
Unless you can find a way to make it crash, I'd close this as 'invalid'.
|
msg95228
Author: Ezio Melotti (ezio.melotti)
Date: 2009-11-14 06:24
After further investigation I found out that PyString_AS_STRING() is
the macro form of PyString_AsString() but without error checking (so
there's nothing to fix there; at most, that call could be replaced with
PyString_AsString() if it turns out to be a real problem).
I think that what I said in the first paragraph of my previous message
is still true though, and that might be the reason why they preferred
PyString_AS_STRING() in the first place.
|
msg95230
Author: Andreas Stührk (Trundle)
Date: 2009-11-14 10:18
Crashes reliably with a segfault in Python 3.1.1.
Fixing the setter so that one can only set strings and not arbitrary
objects is possibly the best solution.
|
msg95232
Author: Eric V. Smith (eric.smith)
Date: 2009-11-14 10:49
I'm not sure why reason should be restricted to a string. This patch
(against trunk) just converts reason to a string when str() is called.
I'll add tests and fix the other places in exceptions.c where similar
shortcuts are taken without checking, if there's agreement on the approach.
|
msg95238
Author: Ezio Melotti (ezio.melotti)
Date: 2009-11-14 12:29
Note that on Py2.6, when, for example, a string is assigned to u.start
and u.end a TypeError is raised, and the value is then set to -1:
>>> u=UnicodeTranslateError(u'x', 1, 5, 'bah')
>>> u.start = 'foo'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: an integer is required
>>> u.end = 'bar'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: an integer is required
>>> str(u)
"can't translate characters in position -1--2: bah"
>>> u.start, u.end
(-1, -1)
It is possible to change the values by assigning an int (or even a float,
which is then converted to int).
On py3k the behavior is different; as Trundle said, it segfaults easily,
and trying to change the value of u.start and u.end returns a different
error:
>>> u.start = 'foo'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
SystemError: Objects/longobject.c:441: bad argument to internal function
Also note that neither version range-checks these values, so it is
easy to trigger a segfault like this:
>>> u = UnicodeTranslateError(u'x', 1, 5, 'bah')
>>> u.start = 2**30
>>> u.end = 2**30+1
>>> str(u)
(if the range covers only one character, Python will try to read that
character and display it)
|
msg95251
Author: Eric V. Smith (eric.smith)
Date: 2009-11-14 18:45
The patch that is (hopefully) attached is a first, incomplete cut just
for demonstration purposes. I still need to cover all of the cases where
PyString_AS_STRING is called without type checking. Also, as Ezio
points out, start and end are used to index an array without type
checking. I'll fix that as well.
The patch is against trunk.
|
msg95252
Author: Ezio Melotti (ezio.melotti)
Date: 2009-11-14 18:57
The same problem (u.start and u.end) also affects the other UnicodeError
exceptions (namely UnicodeEncodeError and UnicodeDecodeError).
Py2.4 and 2.5 don't seem to segfault with the example I provided.
|
msg95261
Author: Eric V. Smith (eric.smith)
Date: 2009-11-14 22:19
Another patch against trunk which deals with:
UnicodeEncodeError: reason and encoding
UnicodeDecodeError: reason and encoding
UnicodeTranslateError: reason
Still needs tests. Also, the unchecked use of start and end needs to be
addressed. I'm working on that.
|
msg95265
Author: Eric V. Smith (eric.smith)
Date: 2009-11-14 22:55
Tests need to cover issues like:
# assigning a non-string to e.object
e = UnicodeDecodeError("", "", 0, 1, "")
e.object = None
print str(e)
# start and end out of range
e = UnicodeDecodeError("", "", 0, 1, "")
e.start = 1000
e.end = 1001
print str(e)
For all cases of UnicodeXXXError with start and end, the code has a
special case for end = start+1. Invalid start/end tests need to have
end==start+1, end>start+1, end<start+1.
I'm not sure what the functions should do when start and end are out of
range.
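The three start/end relationships listed above can be laid out as a small test matrix. This is only a sketch for a Python 3 interpreter that includes the fix (the snippets in this thread are Python 2, so the constructors here take `str`/`bytes` arguments accordingly); it is not the test that was committed. The names `render`, `make_errors` and `START` are made up for illustration, and only large positive offsets are used.

```python
def render(exc):
    """Return str(exc), or the name of any Python-level exception it raised."""
    try:
        return str(exc)
    except Exception as err:  # an ordinary exception is fine; a segfault is not
        return "<%s>" % type(err).__name__


def make_errors():
    """Fresh instances of the three UnicodeError subclasses with start/end."""
    return [
        UnicodeEncodeError("ascii", "x", 0, 1, "bah"),
        UnicodeDecodeError("ascii", b"x", 0, 1, "bah"),
        UnicodeTranslateError("x", 0, 1, "bah"),
    ]


START = 1000  # far beyond the one-character object
results = []
# end == start+1, end > start+1, end < start+1
for end in (START + 1, START + 5, START):
    for exc in make_errors():
        exc.start = START
        exc.end = end
        results.append(render(exc))

for line in results:
    print(line)
```

The sketch only checks that formatting completes at the Python level; what the message should actually say for out-of-range values is the open question raised above.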
|
msg95277
Author: Ezio Melotti (ezio.melotti)
Date: 2009-11-15 08:49
> I'm not sure what the functions should do when start and end are
> out of range.
I think the best approach would be to prevent these values from being
out of range in the first place.
All the args should be checked when the instance is created (to avoid
things like UnicodeTranslateError(None, 2**30, 2**30+1, 'bah')) and
then, if possible, the attributes should be set as read-only.
I don't see any valid reason to change them anyway.
|
msg95309
Author: Eric V. Smith (eric.smith)
Date: 2009-11-15 21:28
I agree there's not much value in making the attributes read/write, but
it looks like all of the exceptions allow it, so I don't really want to
make these exceptions the only ones that are different.
|
msg95335
Author: Walter Dörwald (doerwalter)
Date: 2009-11-16 09:49
>> I'm not sure what the functions should do when start and end are
>> out of range.
>
> I think the best approach would be to prevent these values from being
> out of range in the first place.
The start and end values should be clipped, just like normal slices in
Python do:
>>> ""[2**30:2**30+1]
''
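The slice-style clipping shown here can be sketched in Python with `slice.indices()`, which applies the same clamping a slice does. The helper name `clip_range` is hypothetical, and this illustrates the idea only, not the C code in exceptions.c.

```python
def clip_range(start, end, length):
    """Clamp (start, end) into [0, length] using ordinary slice semantics."""
    clipped_start, clipped_end, _step = slice(start, end).indices(length)
    return clipped_start, clipped_end


# The out-of-range values used earlier in this thread, against an empty object:
print(clip_range(2**30, 2**30 + 1, 0))

# A range that runs past the end of a one-character object:
print(clip_range(1, 5, 1))

# Slicing with the same values never raises:
print(repr(""[2**30:2**30 + 1]))
```

Note that slice semantics also reinterpret negative values as offsets from the end, which may or may not be what an error message should report.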
> I agree there's not much value in making the attributes read/write,
> but it looks like all of the exceptions allow it, so I don't really
> want to make these exceptions the only ones that are different.
Exception attributes *must* be read/write, because the codecs create an
exception object once, and then use this exception object to
communicate multiple errors to the callback. PEP 293 states: "Should
further encoding errors occur, the encoder is allowed to reuse the
exception object for the next call to the callback."
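The callback mechanism referred to above can be observed from Python 3 with `codecs.register_error()`. The sketch below registers a handler under the made-up name "demo.record" and logs the start/end values it sees on each call; whether the very same exception object is reused between calls is an implementation detail that the sketch does not check.

```python
import codecs

seen = []  # (start, end) pairs observed by the handler, in call order


def record_and_replace(exc):
    """Log the failing range, then substitute '?' and resume after it."""
    if not isinstance(exc, UnicodeEncodeError):
        raise exc
    seen.append((exc.start, exc.end))
    return ("?", exc.end)


# "demo.record" is an arbitrary handler name chosen for this example.
codecs.register_error("demo.record", record_and_replace)

encoded = "a\xe4b\xf6".encode("ascii", "demo.record")
print(encoded)
print(seen)
```

Each unencodable character triggers one call, and the handler reads the position of the current error from the exception's attributes.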
|
msg95336
Author: Eric V. Smith (eric.smith)
Date: 2009-11-16 09:52
Thanks, Walter. I'll finish my patch, then.
|
msg100034
Author: Eric V. Smith (eric.smith)
Date: 2010-02-24 14:28
Fixed:
trunk: r78418
release26-maint: r78419
Still working on porting to py3k and release31-maint.
|
msg100037
Author: Walter Dörwald (doerwalter)
Date: 2010-02-24 15:22
On 24.02.10 15:28, Eric Smith wrote:
> Eric Smith <eric@trueblade.com> added the comment:
>
> Fixed:
>
> trunk: r78418
> release26-maint: r78419
>
> Still working on porting to py3k and release31-maint.
A much better solution would IMHO be to forbid setting the encoding,
object and reason attributes to objects of the wrong type in the first
place. Unfortunately this would require an extension to PyMemberDef for
the T_OBJECT and T_OBJECT_EX types.
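The idea of rejecting wrongly typed values at assignment time can be illustrated with a pure-Python analogue. This is only a sketch: the class `CheckedError` is hypothetical, and it uses a property rather than the PyMemberDef extension that the C implementation would need.

```python
class CheckedError(Exception):
    """Hypothetical exception whose 'reason' attribute must always be a str."""

    def __init__(self, reason):
        super().__init__(reason)
        self.reason = reason  # routed through the typechecking setter below

    @property
    def reason(self):
        return self._reason

    @reason.setter
    def reason(self, value):
        # Reject bad types on assignment so __str__ never sees them.
        if not isinstance(value, str):
            raise TypeError("reason must be str, not %s" % type(value).__name__)
        self._reason = value

    def __str__(self):
        return "problem: " + self._reason


err = CheckedError("bah")
try:
    err.reason = 0x345345345345345345
except TypeError as exc:
    print("rejected:", exc)
print(str(err))
```

With the check in the setter, `__str__` can rely on the attribute's type and the previous valid value is preserved after a rejected assignment.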
|
msg100038
Author: Eric V. Smith (eric.smith)
Date: 2010-02-24 15:24
> A much better solution would IMHO be to forbid setting the encoding,
> object and reason attributes to objects of the wrong type in the first
> place. Unfortunately this would require an extension to PyMemberDef for
> the T_OBJECT and T_OBJECT_EX types.
Agreed that's a better approach. But I wanted to get the fix in for 2.6
and 3.1.
You can open another issue if you'd like. I'm going to close this one as
soon as I get the crash fixed.
|
msg100040
Author: Eric V. Smith (eric.smith)
Date: 2010-02-24 15:54
Fixed:
py3k: r78420
release31-maint: r78421
|
|
Date | User | Action | Args
2022-04-11 14:56:54 | admin | set | github: 51558
2010-02-24 15:54:26 | eric.smith | set | status: open -> closed; resolution: fixed; messages: + msg100040; stage: patch review -> resolved
2010-02-24 15:24:38 | eric.smith | set | messages: + msg100038
2010-02-24 15:22:03 | doerwalter | set | messages: + msg100037
2010-02-24 14:28:04 | eric.smith | set | messages: + msg100034
2009-11-16 09:52:50 | eric.smith | set | messages: + msg95336
2009-11-16 09:49:56 | doerwalter | set | nosy: + doerwalter; messages: + msg95335
2009-11-15 21:28:24 | eric.smith | set | messages: + msg95309
2009-11-15 08:49:40 | ezio.melotti | set | messages: + msg95277
2009-11-14 22:55:14 | eric.smith | set | messages: + msg95265
2009-11-14 22:25:47 | eric.smith | set | files: - issue7309.patch
2009-11-14 22:25:41 | eric.smith | set | files: + issue7309-1.patch
2009-11-14 22:20:09 | eric.smith | set | files: - issue7309.patch
2009-11-14 22:19:55 | eric.smith | set | files: + issue7309.patch; messages: + msg95261
2009-11-14 18:57:53 | ezio.melotti | set | messages: + msg95252
2009-11-14 18:45:41 | eric.smith | set | messages: - msg95235
2009-11-14 18:45:32 | eric.smith | set | files: + issue7309.patch; keywords: + patch; messages: + msg95251
2009-11-14 12:29:06 | ezio.melotti | set | messages: + msg95238
2009-11-14 11:04:48 | eric.smith | set | messages: + msg95235
2009-11-14 10:59:35 | eric.smith | set | messages: - msg95234
2009-11-14 10:59:29 | eric.smith | set | messages: - msg95233
2009-11-14 10:56:40 | eric.smith | set | messages: + msg95234
2009-11-14 10:54:37 | eric.smith | set | messages: + msg95233
2009-11-14 10:49:20 | eric.smith | set | priority: low -> high; type: crash; assignee: eric.smith; versions: + Python 2.6, Python 3.2; nosy: + eric.smith; messages: + msg95232; stage: test needed -> patch review
2009-11-14 10:18:01 | Trundle | set | nosy: + Trundle; messages: + msg95230; versions: + Python 3.1
2009-11-14 06:24:32 | ezio.melotti | set | messages: + msg95228
2009-11-14 02:06:44 | ezio.melotti | set | priority: low; nosy: + ezio.melotti; messages: + msg95225; stage: test needed
2009-11-12 10:50:48 | arigo | create |