classification
Title: Check that _PyUnicode_AsString() result is not NULL
Type: crash
Stage: commit review
Components: Interpreter Core, macOS, Unicode
Versions: Python 3.2, Python 3.3
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: belopolsky
Nosy List: Arfrever, amaury.forgeotdarc, belopolsky, ezio.melotti, jafo, lemburg, loewis, python-dev, ronaldoussoren, vstinner
Priority: high
Keywords: needs review, patch

Created on 2009-08-13 20:07 by Arfrever, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name | Uploaded | Description
test.py | Arfrever, 2009-08-13 20:07 | test.py
invalid_utf8_characters_from_command_line.py | Arfrever, 2009-08-13 20:53 | crashers/invalid_utf8_characters_from_command_line.py
issue6697.diff | belopolsky, 2010-12-06 18:15 |
issue6697a.diff | belopolsky, 2010-12-07 17:19 |
issue6697b.diff | belopolsky, 2010-12-07 22:04 |
issue6697-lsprof.diff | belopolsky, 2011-01-13 23:05 |
Messages (48)
msg91533 - (view) Author: Arfrever Frehtes Taifersar Arahesis (Arfrever) * (Python triager) Date: 2009-08-13 20:07
Python 3.1 segfaults when invalid UTF-8 characters are passed from
command line.

In BASH shell you can run:
$ python3.1 -c $'print("\x80")'
Segmentation fault

In other POSIX-compatible shells you can save the attached test.py
file in the current directory and run:
$ python3.1 -c "$(<test.py)"
Segmentation fault
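
The byte 0x80 is a UTF-8 continuation byte and is not valid on its own, which is why the command-line argument cannot be decoded strictly. A minimal sketch for a current Python 3 (not part of the original report; the variable names are illustrative) shows the strict failure and the PEP 383 `surrogateescape` round trip:

```python
raw = b"\x80"  # lone continuation byte, invalid as UTF-8

# Strict decoding rejects the byte.
try:
    raw.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# PEP 383: each undecodable byte is mapped to a lone surrogate
# in the range U+DC80..U+DCFF, and mapped back when encoding.
escaped = raw.decode("utf-8", "surrogateescape")
roundtrip = escaped.encode("utf-8", "surrogateescape")

print(strict_ok, ascii(escaped), roundtrip)
```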
msg91534 - (view) Author: Arfrever Frehtes Taifersar Arahesis (Arfrever) * (Python triager) Date: 2009-08-13 20:49
I'm attaching crashers/invalid_utf8_characters_from_command_line.py.
msg91576 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2009-08-14 22:39
The error occurs in Py_Main(), on _PyUnicode_AsString(commandObj). The
problem is that _PyUnicode_AsString() is not checked for error. Here is
a patch fixing two errors:
 - display an error message instead of crashing on
_PyUnicode_AsString(commandObj) failure
 - don't call Py_DECREF(commandObj) if commandObj is NULL
(PyUnicode_FromWideChar error, a different error)

My patch also includes a test. The test should be moved somewhere else,
but I don't know where.
msg91727 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2009-08-19 12:33
The problem is actually wider::
    >>> getattr(None, "\udc80")
    Segmentation fault
An idea would be to change _PyUnicode_AsDefaultEncodedString and allow
unpaired surrogates (utf8+surrogateescape, as explained in PEP383), but
I fear the consequences...

The code that fails seems pretty common:
	PyErr_Format(PyExc_AttributeError,
		     "'%.50s' object has no attribute '%.400s'",
		     tp->tp_name, _PyUnicode_AsString(name));
It would be unfortunate to replace all usages of _PyUnicode_AsString to
check the return value.

Martin, what do you think?
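
As a Python-level illustration of why the C call fails (a sketch only, assuming an interpreter where this issue is fixed): a lone surrogate cannot be encoded to strict UTF-8, which is the situation in which _PyUnicode_AsString() returns NULL with an exception set. The variable names below are illustrative.

```python
name = "\udc80"  # lone low surrogate, as in the getattr() example above

# Strict UTF-8 encoding of a lone surrogate fails.
try:
    name.encode("utf-8")
    encodable = True
except UnicodeEncodeError:
    encodable = False

# On a fixed interpreter the attribute lookup reports a normal
# AttributeError instead of crashing.
try:
    getattr(None, name)
    outcome = "found"
except AttributeError:
    outcome = "AttributeError"

print(encodable, outcome)
```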
msg91728 - (view) Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2009-08-19 12:49
Amaury Forgeot d'Arc wrote:
> 
> Amaury Forgeot d'Arc <amauryfa@gmail.com> added the comment:
> 
> The problem is actually wider::
>     >>> getattr(None, "\udc80")
>     Segmentation fault
> An idea would be to change _PyUnicode_AsDefaultEncodedString and allow
> unpaired surrogates (utf8+surrogateescape, as explained in PEP383), but
> I fear the consequences...
>
> The code that fails seems pretty common:
> 	PyErr_Format(PyExc_AttributeError,
> 		     "'%.50s' object has no attribute '%.400s'",
> 		     tp->tp_name, _PyUnicode_AsString(name));
> It would be unfortunate to replace all usages of _PyUnicode_AsString to
> check the return value.

The use of _PyUnicode_AsString() is wrong here. There are several
cases where it can fail, e.g. MemoryErrors, embedded NULs, encoding
errors.

The same is true for _PyUnicode_AsStringAndSize(), which is why
I turned them into Python interpreter private APIs before 3.0
shipped.

If you want a fail-safe stringified version of a Unicode object,
your only choice is to create a new API that does error checking,
properly clears the error and then returns a reference to a constant
string, e.g. "<repr-error>".
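
The fail-safe strategy described above can be sketched in Python as follows. This is an analogy rather than the proposed C API: `utf8_or_default` is a hypothetical name, and swallowing the exception stands in for "properly clears the error".

```python
def utf8_or_default(text, default=b"<repr-error>"):
    """Return text encoded as UTF-8, or a constant fallback on failure."""
    try:
        return text.encode("utf-8")
    except UnicodeEncodeError:
        # The error is deliberately discarded so that the caller,
        # typically an error-reporting path, never sees a second failure.
        return default


print(utf8_or_default("spam"))
print(utf8_or_default("\udc80"))
```

The trade-off is that the caller can no longer tell an unencodable string from one that really is "<repr-error>", which is acceptable only in places such as error messages.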
msg91730 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2009-08-19 13:31
Do you suggest to remove all usages of _PyUnicode_AsString() and
_PyUnicode_AsStringAndSize()?
msg91731 - (view) Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2009-08-19 14:02
Amaury Forgeot d'Arc wrote:
> 
> Amaury Forgeot d'Arc <amauryfa@gmail.com> added the comment:
> 
> Do you suggest to remove all usages of _PyUnicode_AsString() and
> _PyUnicode_AsStringAndSize()?

In the short-term, I suggest that all uses that do not check the
return value get replaced with a new API which implements a failsafe
return value strategy.

In the mid- to long-term, the APIs should probably be removed
altogether.

They look a lot like the PyString APIs using the same names, but unlike
those APIs, they can fail, so the implied straight-forward conversion
of the PyString APIs to the above APIs gives a wrong impression to the
developers.

For error messages, I'd use the repr() of the objects - lone UTF-8
surrogates will not work since they cause issues further down the line
with debugging tools or even stderr terminal displays.
msg91732 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2009-08-19 14:45
The %U format seems adequate for this purpose - actually
PyObject_GenericSetAttr uses it already.

Yes, the exception message will contain the same lone UTF-8
surrogates; this is not a problem because sys.stderr uses the
"backslashreplace" error handler.
msg91742 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2009-08-19 20:20
> It would be unfortunate to replace all usages of _PyUnicode_AsString to
> check the return value.

I agree with MAL: we do need to check for errors returned from
_PyUnicode_AsString, and it would be best if we created a fail-safe
version of it.

In the specific case (getattr), it might also be useful to create a
result that is unicode-escaped, i.e. with \u escapes for all non-ASCII
non-printable characters.

For _PyUnicode_AsString, I'm uncertain whether supporting half
surrogates is a good idea. Unless there is a compelling reason to
support them, I think we leave that as-is. Your example is not
compelling: I think the unicode string should be escaped, anyway.

The OP's case is also not compelling, we should print an error
message that the source code is incorrectly encoded.
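
The unicode-escaped result suggested for the getattr case can be approximated in Python with the built-in ascii(), which writes non-ASCII characters (including lone surrogates) as backslash escapes. This is a sketch of the idea only; `missing_attribute_message` is a hypothetical helper, not an interpreter function.

```python
def missing_attribute_message(type_name, attr_name):
    """Build an error message that contains only ASCII characters."""
    # ascii() quotes the name and escapes every non-ASCII character.
    return "'%s' object has no attribute %s" % (type_name, ascii(attr_name))


message = missing_attribute_message("NoneType", "\udc80")
print(message)
```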
msg100445 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-03-05 01:02
Here is a fix for object.c (object_pyunicode_asstring-py3k.patch):

- PyObject_GenericGetAttr(): Replace PyErr_Format("... %.400s", ..., _PyUnicode_AsString(name)) with PyErr_Format("... %U", ..., name), as done in PyObject_GenericSetAttr(). Note that the string will no longer be truncated to 400 bytes
- PyObject_GetAttr(), PyObject_SetAttr(): Catch _PyUnicode_AsString() error

It fixes the crash getattr(1, "\uDAD1\uD51E") (used as a test in the patch).
msg100446 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-03-05 01:11
Fix for the _ssl module: replace _PyUnicode_AsString() with PyArg_ParseTuple() using PyUnicode_FSConverter. This change also fixes ssl for file system encodings other than UTF-8. I added a test on surrogates.

The test fails if surrogates can be encoded to the file system encoding (maybe on Windows?).
msg100468 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-03-05 12:13
MaL> If you want a fail-safe stringified version of a Unicode object,
MaL> your only choice is to create a new API that does error checking,
MaL> properly clears the error and then returns a reference to a constant
MaL> string, e.g. "<repr-error>".

I wrote a function _PyUnicode_AsStringOrDefault(unicode, default_str) which calls _PyUnicode_AsStringAndSize() and returns default_str on error. It can be used in error handlers (places where you don't really want to raise a new error) or if you don't care about (unicode) errors.
msg100474 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-03-05 12:56
Patch for pythonrun.c:
 - catch _PyUnicode_AsString() error in get_codeset(): very unlikely, codeset is the result of nl_langinfo() and is ASCII only
 - catch _PyUnicode_AsString(sys.stdin.encoding) error in PyRun_InteractiveOneFlags()
 - use _PyUnicode_AsStringOrDefault() for ps1 and ps2: use ps1="" and/or ps2="" on unicode error. I don't know if it's the best option. Displaying the error may be a better idea.

It's possible to test the error on sys.stdin.encoding by adding the following lines to site.py:

class Stdin: pass
sys.stdin = Stdin()
sys.stdin.encoding = "\udc80"

See also #8070: PyRun_InteractiveLoopFlags() doesn't handle errors :-/
msg100476 - (view) Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2010-03-05 13:00
About the _PyUnicode_AsStringOrDefault() patch:

Since the _PyUnicode_AsString*() APIs are scheduled to be removed, it would be better to not introduce yet another way to use them.

If that's not easily possible now, then please fix the indentation (Tabs vs. spaces) before adding the API.

Thanks.
msg100680 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-03-08 23:23
See also issue #8092.
msg100683 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-03-08 23:36
unicode_fromformat_U.patch: replace PyUnicode_FromFormat("...%s...", ..., _PyUnicode_AsString(obj)) with PyUnicode_FromFormat("...%U...", ..., obj). It also replaces "%.200s" with "%U", so the output is no longer truncated.
msg100946 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-03-12 17:01
I committed unicode_fromformat_U.patch as r78875.
msg100947 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-03-12 17:10
Oh, my ssl_rand_egd_unicode-py3k.patch is completely broken! It writes a pointer to an object into the "char* path" variable :-/
msg100948 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-03-12 17:18
object_pyunicode_asstring-py3k.patch committed as r78876.
msg101458 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-03-21 21:09
> I committed unicode_fromformat_U.patch as r78875.
> object_pyunicode_asstring-py3k.patch committed as r78876.

Backported as r79240 and r79241 to 3.1.
msg105892 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-05-17 00:47
I fixed ssl.RAND_egd() in r81239 (issue #8477).

I removed the other committed patches so that it is easy to see which patches remain.
msg105893 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-05-17 01:15
pymain.patch committed as r81250. I will wait for the buildbots before backporting it to 3.1.
msg106016 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-05-19 00:04
r81314 fixes 2 calls in _PyModule_Clear().
msg106019 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-05-19 00:55
r81320 fixes a call in vgetargskeywords() (PyArg_ParseTupleAndKeywords).
msg106020 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-05-19 01:08
r81321 fixes 2 calls in builtin_input() (if the sys.stdin or sys.stdout encoding contains a surrogate: this is *very* unlikely :-)).
msg106021 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-05-19 01:17
r81322 fixes 2 calls in textio.c.
msg106022 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-05-19 01:27
r81323 fixes 4 calls in _sqlite.
msg106023 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-05-19 01:43
r81324 fixes 2 calls in typeobject.c
msg106024 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-05-19 02:00
Remove pyunicode_asstringordefault.patch and pythonrun-py3k.patch because the new _PyUnicode_AsStringOrDefault() function was rejected (and it's easy to avoid it).
msg106058 - (view) Author: Arfrever Frehtes Taifersar Arahesis (Arfrever) * (Python triager) Date: 2010-05-19 13:04
r81319 fixed 4 calls in pythonrun.c.
msg123473 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2010-12-06 15:58
What is the status of this issue?  A grep for _PyUnicode_AsString quickly revealed a crash:

>>> from xml.etree.cElementTree import *
>>> e = Element('a')
>>> getattr(e, '\uD800')
Segmentation fault

I don't think this is the only one.
msg123474 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2010-12-06 16:09
Another crash:

>>> from datetime import *
>>> datetime.now(timezone(timedelta(0), '\uD800')).strftime('%Z')
Segmentation fault
msg123479 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2010-12-06 16:52
One of the problematic uses of PyUnicode_GetSize() is in the Macintosh Gestalt interface:

/* Convert a 4-char string object argument to an OSType value */
static int
convert_to_OSType(PyObject *v, OSType *pr)
{
    uint32_t tmp;
    if (!PyUnicode_Check(v) || PyUnicode_GetSize(v) != 4) {
        PyErr_SetString(PyExc_TypeError,
                        "OSType arg must be string of 4 chars");
        return 0;
    }
    memcpy((char *)&tmp, _PyUnicode_AsString(v), 4);
    *pr = (OSType)ntohl(tmp);
    return 1;
}

(Modules/_gestalt.c:41)

This function seems to require a bytes, not str argument as interpreting 4 UTF-8 bytes as an int makes little sense.
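
To illustrate the mismatch: a string of 4 characters is not necessarily 4 bytes in UTF-8, so the length check on characters does not protect the 4-byte memcpy. The sketch below is a Python analogy under that reading, not the eventual fix; `to_ostype` is a hypothetical helper that accepts bytes and interprets them in network (big-endian) byte order, as ntohl() does.

```python
import struct


def to_ostype(code):
    """Interpret exactly four bytes as a big-endian 32-bit integer."""
    if not isinstance(code, bytes) or len(code) != 4:
        raise TypeError("OSType arg must be bytes of length 4")
    (value,) = struct.unpack(">I", code)
    return value


selector = "ab\u20acd"               # 4 characters ...
encoded = selector.encode("utf-8")   # ... but 6 bytes once encoded
print(len(selector), len(encoded))
print(hex(to_ostype(b"sysv")))
```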
msg123482 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2010-12-06 18:15
I am attaching a patch that fixes several instances of unchecked _PyUnicode_AsString() result.  Not all fixes are completely trivial, so I would appreciate a review.

I did not attempt to fix Modules/_gestalt.c because I would like to hear from Ronald first.  (See my previous comment.)

The patch does not have unit tests yet, but I reported some test cases above and these should be easy to convert to unit tests.
msg123566 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2010-12-07 17:19
I am attaching a revised version of the patch which also includes some tests.  Interestingly, the issue in syslog module is a regression from 3.1 introduced in r80401.  Although it is not a crasher, I don't think it was intentional because although openlog() is happy to accept NULL for ident, the error from _PyUnicode_AsString() would have to be cleared if the intent was to ignore undecodable ident.
msg123568 - (view) Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2010-12-07 17:44
Alexander Belopolsky wrote:
> 
> Alexander Belopolsky <belopolsky@users.sourceforge.net> added the comment:
> 
> I am attaching a revised version of the patch which also includes some tests.  Interestingly, the issue in syslog module is a regression from 3.1 introduced in r80401.  Although it is not a crasher, I don't think it was intentional because although openlog() is happy to accept NULL for ident, the error from _PyUnicode_AsString() would have to be cleared if the intent was to ignore undecodable ident.

Some notes:

* Rather than just patching in error handling code, please consider
removing use of those APIs and replace their calls with something
more appropriate, e.g. using a parser API.

* When ignoring errors from the API, you have to clear the exception.
This is missing in a couple of places in the patch, e.g. in pyexpat.c

* Please also remove hacks like these:

+#define CMP PyUnicode_CompareWithASCIIString
+        if (CMP(nameobj, "entity") == 0)
+            res = self->entity;
+        else if (CMP(nameobj, "target") == 0)
+            res = self->target;
+        else if (CMP(nameobj, "version") == 0) {
+            return PyUnicode_FromFormat(
+                "Expat %d.%d.%d", XML_MAJOR_VERSION,
                 XML_MINOR_VERSION, XML_MICRO_VERSION);
msg123573 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2010-12-07 18:06
On Tue, Dec 7, 2010 at 12:44 PM, Marc-Andre Lemburg
<report@bugs.python.org> wrote:
..
> * Rather than just patching in error handling code, please consider
> removing use of those APIs and replace their calls with something
> more appropriate, e.g. using a parser API.
>
Yes, that's what I started doing in the "a" patch.  I am not sure what
you mean by "a parser API."  There are several places where conversion
is either unnecessary or an encoded string is already available.  See
_elementtree.c.

> * When ignoring errors from the API, you have to clear the exception.
> This is missing in a couple of places in the patch, e.g. in pyexpat.c
>

Right.  On the other hand, this is very similar to xmlparser_getattro
in _elementtree.c and I think should be handled the same way.

> * Please also remove hacks like these:
>
> +#define CMP PyUnicode_CompareWithASCIIString
> +        if (CMP(nameobj, "entity") == 0)

What do you consider a hack?  The use of
PyUnicode_CompareWithASCIIString() or the shortening macro?
msg123574 - (view) Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2010-12-07 18:11
Alexander Belopolsky wrote:
> 
> Alexander Belopolsky <belopolsky@users.sourceforge.net> added the comment:
> 
> On Tue, Dec 7, 2010 at 12:44 PM, Marc-Andre Lemburg
> <report@bugs.python.org> wrote:
> ..
>> * Rather than just patching in error handling code, please consider
>> removing use of those APIs and replace their calls with something
>> more appropriate, e.g. using a parser API.
>>
> Yes, that's what I started doing in the "a" patch.  I am not sure what
> you mean by "a parser API." 

PyArg_Parse() et al. See the discussion earlier on this ticket.

> There are several places where conversion
> is either unnecessary or an encoded string is already available.  See
> _elementtree.c.

If the API is not needed at all, even better.

>> * When ignoring errors from the API, you have to clear the exception.
>> This is missing in a couple of places in the patch, e.g. in pyexpat.c
>>
> 
> Right.  On the other hand, this is very similar to xmlparser_getattro
> in _elementtree.c and I think should be handled the same way.

Not sure what you mean here. If you ignore errors and don't clear
the exception, it will pop up at some later point in processing
and that's generally very confusing.

>> * Please also remove hacks like these:
>>
>> +#define CMP PyUnicode_CompareWithASCIIString
>> +        if (CMP(nameobj, "entity") == 0)
> 
> What do you consider a hack?  The use of
> PyUnicode_CompareWithASCIIString() or the shortening macro?

The shortening macro.
msg123575 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2010-12-07 18:21
On Tue, Dec 7, 2010 at 1:11 PM, Marc-Andre Lemburg
<report@bugs.python.org> wrote:
>>  I am not sure what
>> you mean by "a parser API."
>
> PyArg_Parse() et al. See the discussion earlier on this ticket.
>

I've just realized that.  It is the "u#" code.  Yes, I'll see if I can
use it instead of "U", but I think in the affected code the PyUnicode
object is needed as well.

>> ..  this is very similar to xmlparser_getattro
>> in _elementtree.c and I think should be handled the same way.
>
> Not sure what you mean here. If you ignore errors and don't clear
> the exception, it will pop up at some later point in processing
> and that's generally very confusing.
>

I mean not converting to char* at all and use
PyUnicode_CompareWithASCIIString() instead of strcmp().  I wish that
function had a shorter name, though, but it is not a big deal to spell
it out.  BTW, I don't think there is a way to use wchar_t* literals in
Python code, right?  As in Py_UNICODE_strcmp(name, L"version").
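
The difference between the two approaches discussed here can be sketched in Python (an analogy only; both function names are hypothetical). Converting first and then comparing can fail on a lone surrogate, while comparing the string object directly with an ASCII literal cannot, which is the property PyUnicode_CompareWithASCIIString() provides at the C level.

```python
def is_version_encoded(name):
    # Convert first, then compare bytes: the conversion itself can fail.
    return name.encode("utf-8") == b"version"


def is_version_direct(name):
    # Compare without converting at all: no failure mode to handle.
    return name == "version"


print(is_version_direct("version"), is_version_direct("\ud800"))
```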
msg123577 - (view) Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2010-12-07 19:02
Alexander Belopolsky wrote:
> 
> Alexander Belopolsky <belopolsky@users.sourceforge.net> added the comment:
> 
> On Tue, Dec 7, 2010 at 1:11 PM, Marc-Andre Lemburg
> <report@bugs.python.org> wrote:
>>>  I am not sure what
>>> you mean by "a parser API."
>>
>> PyArg_Parse() et al. See the discussion earlier on this ticket.
>>
> 
> I've just realized that.  It is the "u#" code.  Yes, I'll see if I can
> use it instead of "U", but I think in the affected code the PyUnicode
> object is needed as well.
> 
>>> ..  this is very similar to xmlparser_getattro
>>> in _elementtree.c and I think should be handled the same way.
>>
>> Not sure what you mean here. If you ignore errors and don't clear
>> the exception, it will pop up at some later point in processing
>> and that's generally very confusing.
>>
> 
> I mean not converting to char* at all and use
> PyUnicode_CompareWithASCIIString() instead of strcmp().  I wish that
> function had a shorter name, though, but it is not a big deal to spell
> it out. 

Agreed; not my invention ;-) I would have used
PyUnicode_CompareToUTF8() or something along those lines.

> BTW, I don't think there is a way to use wchar_t* literals in
> Python code, right?  As in Py_UNICODE_strcmp(name, L"version").

No, since wchar_t may be something completely different than
Py_UNICODE.

Not sure about today's situation, but at least a couple of years
ago wchar_t was not defined on all supported platforms, e.g.
Crays didn't have it.
msg123582 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2010-12-07 22:04
issue6697b.diff addresses Marc's comments.  Thanks for the review.
msg123662 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2010-12-08 23:33
Committed revision 87137.  Needs backporting.  Also as Victor suggested, _lsprof.c code can be refactored to avoid roundtrips of unicode through utf8 char*.
msg123664 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2010-12-09 00:13
I am attaching an untested rewrite of normalizeUserObj() in _lsprof.c for comments on whether it is worth the effort.  There might be other places where PyModule_GetName() can be profitably replaced with PyModule_GetNameObject().
msg123767 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-12-11 02:35
issue6697-lsprof.diff:
 - Oh, I recently made a similar change to PyModule: I created PyModule_GetFilenameObject()
 - "PyObject * mod" => "PyObject *mod"
 - modname is not initialized if fn->m_module (mod) is NULL => initialize modname to NULL
 - there is a reference leak (modname)
msg126216 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2011-01-13 23:05
I am replacing issue6697-lsprof.diff with a (hopefully) more carefully written version that addresses the issues that Victor noted.

Victor,

I take your comment as +1 for adding PyModule_GetNameObject().


I started looking into adding unit tests that would exercise this code, but it does not seem possible to trigger the fn->m_self == NULL condition.  According to the comment in the code, this is supposed to be the case when fn is a builtin function, but I observe the following in the debugger when running test_cprofile:

(gdb) pyo fn
object  : <built-in function exec>
type    : builtin_function_or_method
refcount: 4
address : 0x10038c678
$5 = void
(gdb) pyo fn->m_self 
object  : <module 'builtins' (built-in)>
type    : module
refcount: 51
address : 0x100388ee8
$6 = void
msg126221 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-01-14 00:36
Le jeudi 13 janvier 2011 à 23:05 +0000, Alexander Belopolsky a écrit :
> I take your comment as +1 for adding PyModule_GetNameObject().

I wrote a similar patch to add PyModule_GetNameObject() (I am working on
another huge patch, to fix #3080). You have to document the new function
in Doc/c-api/module.rst.

Yes, it's better to work on unicode than to encode unicode to bytes
(PyModule_GetName() with UTF-8) and then decode the bytes back to unicode
(PyUnicode_FromFormat with %s).

I am too tired to review the patch.
msg138705 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2011-06-20 13:01
New changeset b87eac0369b5 by Victor Stinner in branch 'default':
Issue #6697: _lsprof: normalizeUserObj() doesn't encode/decode (UTF-8) the
http://hg.python.org/cpython/rev/b87eac0369b5
msg138706 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-06-20 13:07
> I wrote a similar patch to add PyModule_GetNameObject() 
> (I am working on another huge patch, to fix #3080)

Issue #3080 added the PyModule_GetNameObject() function, which simplifies your patch.

I committed your issue6697-lsprof.diff patch; I just fixed a refleak (if modname is "builtins").

I want to close this generic issue. I think that we fixed enough code. If you still see code not checking that _PyUnicode_AsString() result is not NULL, please open a new specific issue.
History
Date User Action Args
2022-04-11 14:56:51 admin set github: 50946
2011-06-20 13:08:04 vstinner set status: open -> closed
    resolution: fixed
2011-06-20 13:07:55 vstinner set messages: + msg138706
2011-06-20 13:01:06 python-dev set nosy: + python-dev
    messages: + msg138705
2011-06-12 21:39:36 terry.reedy set versions: + Python 3.3, - Python 3.1
2011-01-14 00:36:29 vstinner set nosy: lemburg, loewis, jafo, ronaldoussoren, amaury.forgeotdarc, belopolsky, vstinner, ezio.melotti, Arfrever
    messages: + msg126221
2011-01-13 23:05:54 belopolsky set files: - issue6697-lsprof.diff
    nosy: lemburg, loewis, jafo, ronaldoussoren, amaury.forgeotdarc, belopolsky, vstinner, ezio.melotti, Arfrever
2011-01-13 23:05:28 belopolsky set files: + issue6697-lsprof.diff
    nosy: lemburg, loewis, jafo, ronaldoussoren, amaury.forgeotdarc, belopolsky, vstinner, ezio.melotti, Arfrever
    messages: + msg126216
2010-12-11 02:35:49 vstinner set messages: + msg123767
2010-12-09 00:13:31 belopolsky set files: + issue6697-lsprof.diff
    messages: + msg123664
2010-12-08 23:33:43 belopolsky set messages: + msg123662
2010-12-07 22:04:14 belopolsky set files: + issue6697b.diff
    messages: + msg123582
    stage: patch review -> commit review
2010-12-07 19:02:25 lemburg set messages: + msg123577
2010-12-07 18:21:11 belopolsky set messages: + msg123575
2010-12-07 18:11:06 lemburg set messages: + msg123574
2010-12-07 18:06:07 belopolsky set messages: + msg123573
2010-12-07 17:44:20 lemburg set messages: + msg123568
2010-12-07 17:19:42 belopolsky set files: + issue6697a.diff
2010-12-07 17:19:22 belopolsky set nosy: + jafo
    messages: + msg123566
2010-12-06 18:16:00 belopolsky set files: + issue6697.diff
    messages: + msg123482
    assignee: ronaldoussoren -> belopolsky
    keywords: + needs review
    stage: needs patch -> patch review
2010-12-06 16:52:28 belopolsky set nosy: + ronaldoussoren
    messages: + msg123479
    assignee: ronaldoussoren
    components: + macOS
2010-12-06 16:09:56 belopolsky set messages: + msg123474
2010-12-06 15:58:33 belopolsky set nosy: + belopolsky
    messages: + msg123473
2010-07-29 22:27:45 vstinner unlink issue8242 dependencies
2010-05-19 13:04:42 Arfrever set messages: + msg106058
2010-05-19 02:00:19 vstinner set files: - pyunicode_asstringordefault.patch
2010-05-19 02:00:11 vstinner set messages: + msg106024
2010-05-19 01:58:32 vstinner set files: - pythonrun-py3k.patch
2010-05-19 01:43:05 vstinner set messages: + msg106023
2010-05-19 01:27:53 vstinner set messages: + msg106022
2010-05-19 01:17:21 vstinner set messages: + msg106021
2010-05-19 01:08:14 vstinner set messages: + msg106020
2010-05-19 00:55:02 vstinner set messages: + msg106019
2010-05-19 00:04:45 vstinner set messages: + msg106016
2010-05-17 01:15:41 vstinner set files: - pymain.patch
2010-05-17 01:15:01 vstinner set messages: + msg105893
2010-05-17 00:47:04 vstinner set messages: + msg105892
2010-05-17 00:45:31 vstinner set files: - ssl_rand_egd_unicode-py3k.patch
2010-05-17 00:44:49 vstinner set files: - unicode_fromformat_U.patch
2010-05-17 00:44:30 vstinner set files: - object_pyunicode_asstring-py3k.patch
2010-04-23 21:00:33 vstinner link issue8242 dependencies
2010-03-21 21:09:22 vstinner set messages: + msg101458
2010-03-12 17:18:22 vstinner set messages: + msg100948
2010-03-12 17:10:39 vstinner set messages: + msg100947
2010-03-12 17:01:02 vstinner set messages: + msg100946
2010-03-08 23:36:45 vstinner set files: + unicode_fromformat_U.patch
    messages: + msg100683
2010-03-08 23:23:12 vstinner set messages: + msg100680
2010-03-05 13:00:33 lemburg set messages: + msg100476
2010-03-05 12:56:16 vstinner set files: + pythonrun-py3k.patch
    messages: + msg100474
2010-03-05 12:13:51 vstinner set files: + pyunicode_asstringordefault.patch
    messages: + msg100468
2010-03-05 01:11:57 vstinner set files: + ssl_rand_egd_unicode-py3k.patch
    messages: + msg100446
2010-03-05 01:02:02 vstinner set files: + object_pyunicode_asstring-py3k.patch
    messages: + msg100445
    title: Python 3.1 segfaults when invalid UTF-8 characters are passed from command line -> Check that _PyUnicode_AsString() result is not NULL
2009-10-03 22:17:20 ezio.melotti set nosy: + ezio.melotti
2009-08-19 20:20:04 loewis set messages: + msg91742
2009-08-19 14:45:55 amaury.forgeotdarc set messages: + msg91732
2009-08-19 14:02:12 lemburg set messages: + msg91731
2009-08-19 13:31:13 amaury.forgeotdarc set messages: + msg91730
2009-08-19 12:49:56 lemburg set nosy: + lemburg
    title: Python 3.1 segfaults when invalid UTF-8 characters are passed from command line -> Python 3.1 segfaults when invalid UTF-8 characters are passed from command line
    messages: + msg91728
2009-08-19 12:33:58 amaury.forgeotdarc set nosy: + amaury.forgeotdarc, loewis
    messages: + msg91727
2009-08-14 22:39:32 vstinner set files: + pymain.patch
    nosy: + vstinner
    messages: + msg91576
    keywords: + patch
2009-08-13 21:02:10 r.david.murray set stage: test needed -> needs patch
2009-08-13 20:53:21 Arfrever set files: + invalid_utf8_characters_from_command_line.py
2009-08-13 20:52:43 Arfrever set files: - invalid_utf8_characters_from_command_line.py
2009-08-13 20:49:24 Arfrever set files: + invalid_utf8_characters_from_command_line.py
    messages: + msg91534
2009-08-13 20:12:09 r.david.murray set components: + Interpreter Core
2009-08-13 20:10:49 r.david.murray set priority: high
    type: crash
    stage: test needed
2009-08-13 20:07:31 Arfrever create