Message91728
Amaury Forgeot d'Arc wrote:
>
> Amaury Forgeot d'Arc <amauryfa@gmail.com> added the comment:
>
> The problem is actually wider::
> >>> getattr(None, "\udc80")
> Segmentation fault
> An idea would be to change _PyUnicode_AsDefaultEncodedString and allow
> unpaired surrogates (utf8+surrogateescape, as explained in PEP383), but
> I fear the consequences...
>
> The code that fails seems pretty common:
> PyErr_Format(PyExc_AttributeError,
> "'%.50s' object has no attribute '%.400s'",
> tp->tp_name, _PyUnicode_AsString(name));
> It would be unfortunate to replace all usages of _PyUnicode_AsString to
> check the return value.
The use of _PyUnicode_AsString() is wrong here: it returns NULL on
failure, and there are several cases where it can fail, e.g.
MemoryErrors, embedded NULs, and encoding errors. Passing that NULL
on to PyErr_Format() unchecked is presumably what crashes.
The same is true for _PyUnicode_AsStringAndSize(), which is why
I turned them into Python interpreter private APIs before 3.0
shipped.
If you want a fail-safe stringified version of a Unicode object,
your only choice is to create a new API that does error checking,
properly clears the error and then returns a reference to a constant
string, e.g. "<repr-error>".
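The fail-safe pattern described above can be sketched at the Python level. This is only an illustration of the idea, not the C change itself: the names safe_utf8_text and format_attribute_error are hypothetical, and a real fix would live in C next to _PyUnicode_AsString(). The sketch assumes the only failure of interest is a strict UTF-8 encoding error, such as the lone surrogate "\udc80" from the report.

```python
# Hypothetical helper names; this is an illustration, not a CPython API.
FALLBACK = "<repr-error>"  # constant placeholder suggested in the message


def safe_utf8_text(name):
    """Return name if it encodes as strict UTF-8, else a constant placeholder.

    The encoding error is caught here, so no error state leaks to the caller.
    """
    try:
        name.encode("utf-8")  # strict mode rejects lone surrogates like "\udc80"
    except UnicodeEncodeError:
        return FALLBACK
    return name


def format_attribute_error(type_name, attr_name):
    """Build the AttributeError text without failing on a bad attribute name."""
    return "'%.50s' object has no attribute '%.400s'" % (
        type_name,
        safe_utf8_text(attr_name),
    )


if __name__ == "__main__":
    # prints: 'NoneType' object has no attribute '<repr-error>'
    print(format_attribute_error("NoneType", "\udc80"))
```

The alternative raised in the quoted comment, utf8+surrogateescape from PEP 383, would instead encode "\udc80" as the single byte 0x80 rather than substituting a placeholder.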
Date | User | Action | Args
2009-08-19 12:50:08 | lemburg | set | recipients: + lemburg, loewis, amaury.forgeotdarc, vstinner, Arfrever
2009-08-19 12:49:56 | lemburg | link | issue6697 messages
2009-08-19 12:49:56 | lemburg | create |