Author lemburg
Recipients ezio.melotti, georg.brandl, lemburg, mgiuca, pitrou
Date 2010-08-02.12:03:56
Message-id <4C56B428.7020609@egenix.com>
In-reply-to <1280742462.3384.3.camel@localhost.localdomain>
Content
Antoine Pitrou wrote:
> 
> Antoine Pitrou <pitrou@free.fr> added the comment:
> 
>>  * Unicode objects are NUL-terminated, but only very external APIs
>>    rely on this (e.g. code using the Windows Unicode API). Please
>>    don't make the code in unicodeobject.c itself rely on this
>>    subtle detail.
> 
> That's wishful thinking, don't you think? *I* am not making code in
> unicodeobject.c rely on this. It has been so for years, long before I
> was here. You should check who made that design decision in the first
> place instead of putting the blame on me.

I'm not blaming you for this. However, I don't want more code
to rely on this behavior.

The NUL-termination has never been documented and my decision
to use NUL-termination on the PyUnicodeObject buffers was merely
a safety measure.

> Besides, the fact that external APIs rely on it make it much more
> unchangeable than if it were an implementation detail.

It's an undocumented implementation detail. We can certainly
deprecate its use using the standard approach we have for this.

But all that is off-topic for this ticket, since codecs
operate on Py_UNICODE* buffers together with a size parameter
and relying on those buffers being NUL-terminated is bound to
cause problems.
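
To illustrate the point, here is a minimal sketch in C. It is not code from
unicodeobject.c: demo_unicode is a hypothetical stand-in for Py_UNICODE, and
both function names are made up for this example. The first function trusts
only the explicit size; the second relies on a NUL terminator and therefore
stops early on an embedded NUL, and would read past the end of any buffer
slice that lacks a terminator.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical stand-in for Py_UNICODE, so the sketch builds without Python headers. */
typedef wchar_t demo_unicode;

/* Codec-style scan: the explicit size is the only thing that bounds the loop. */
static size_t count_sized(const demo_unicode *buf, size_t size, demo_unicode target)
{
    size_t n = 0;
    assert(buf != NULL || size == 0);
    for (size_t i = 0; i < size; i++) {
        if (buf[i] == target)
            n++;
    }
    return n;
}

/* Fragile variant: assumes a NUL terminator exists and that no NUL occurs
   inside the data. Shown only for contrast. */
static size_t count_until_nul(const demo_unicode *buf, demo_unicode target)
{
    size_t n = 0;
    for (; *buf != 0; buf++) {
        if (*buf == target)
            n++;
    }
    return n;
}
```

For a five-element buffer holding 'a', NUL, 'a', 'b', 'a', the sized scan
sees all three 'a' characters, while the NUL-based scan stops after the
first one.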
History
Date User Action Args
2010-08-02 12:03:58 lemburg set recipients: + lemburg, georg.brandl, pitrou, ezio.melotti, mgiuca
2010-08-02 12:03:57 lemburg link issue8821 messages
2010-08-02 12:03:56 lemburg create