
Author lemburg
Recipients alexandre.vassalotti, bhy, lemburg, loewis
Date 2008-06-05.21:06:57
Message-id <4848556E.5010207@egenix.com>
In-reply-to <4848517E.4060701@v.loewis.de>
Content
On 2008-06-05 22:50, Martin v. Löwis wrote:
>> Note that the function *must* check the UTF-8 buffer for embedded
>> NUL bytes and then raise an exception if it finds one. Otherwise,
>> the API would silently cause truncations.
> 
> PyString_AsString doesn't check for null bytes, either, and will also
> silently truncate. This has never been a problem, so I fail to see why
> it is a problem for Unicode strings.

The fact that a bug hasn't surfaced yet doesn't make it a non-issue.

The problem is also somewhat different for Unicode:

Unlike PyString_AsString(), a Unicode API such as PyUnicode_UTF8() would
not provide easy access to the length of the returned char*.

Nor is there a counterpart to PyString_GET_SIZE() for the UTF-8 buffer
that you could use to quickly verify that there are no embedded NULs.

That is why PyUnicode_AsStringAndSize() is the better and safer
solution overall.
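
To illustrate the point, here is a minimal sketch in plain C of the check
a caller can only do when the API also hands back the buffer size. Both
helper names are hypothetical and not part of the Python C API; the
sketch assumes the buffer is NUL-terminated after its last byte, as
Python string buffers are.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a Python C API function): returns 1 if the
   first `size` bytes of `buf` contain a NUL byte, 0 otherwise.
   This check is only possible when the caller knows `size`. */
static int has_embedded_nul(const char *buf, size_t size)
{
    return memchr(buf, '\0', size) != NULL;
}

/* Hypothetical helper: the length a consumer sees when it only gets a
   char* and treats it as a C string. Assumes `buf` is NUL-terminated.
   Everything after the first NUL is silently dropped. */
static size_t visible_length(const char *buf)
{
    return strlen(buf);
}
```

For a 5-byte buffer holding "ab", a NUL, then "cd", has_embedded_nul()
reports the problem, while visible_length() returns 2 and the caller
never learns that three bytes were lost.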
History
Date                 User     Action  Args
2008-06-05 21:07:00  lemburg  set     spambayes_score: 0.00953321 -> 0.009533206; recipients: + lemburg, loewis, alexandre.vassalotti, bhy
2008-06-05 21:06:58  lemburg  link    issue2799 messages
2008-06-05 21:06:57  lemburg  create