Message 112447 - Python tracker


Author	lemburg
Recipients	ezio.melotti, georg.brandl, lemburg, mgiuca, pitrou
Date	2010-08-02.12:03:56
Message-id	<4C56B428.7020609@egenix.com>
In-reply-to	<1280742462.3384.3.camel@localhost.localdomain>

Content
Antoine Pitrou wrote:
> 
> Antoine Pitrou <pitrou@free.fr> added the comment:
> 
>>  * Unicode objects are NUL-terminated, but only very external APIs
>>    rely on this (e.g. code using the Windows Unicode API). Please
>>    don't make the code in unicodeobject.c itself rely on this
>>    subtle detail.
> 
> That's wishful thinking, don't you think? *I* am not making code in
> unicodeobject.c rely on this. It has been so for years, long before I
> was here. You should check who made that design decision in the first
> place instead of putting the blame on me.

I'm not blaming you for this. However, I don't want more code
to rely on this behavior.

The NUL-termination has never been documented and my decision
to use NUL-termination on the PyUnicodeObject buffers was merely
a safety measure.

> Besides, the fact that external APIs rely on it make it much more
> unchangeable than if it were an implementation detail.

It's an undocumented implementation detail. We can certainly
deprecate its use using the standard approach we have for this.

But all that is off-topic for this ticket, since codecs
operate on Py_UNICODE* buffers together with a size parameter,
and relying on those buffers being NUL-terminated is bound to
cause problems.
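
To illustrate the point about size parameters, here is a minimal sketch. It is
not code from unicodeobject.c: the type code_unit is a stand-in for Py_UNICODE
so that the example compiles without the Python headers, and both function
names are hypothetical. The first function honours the explicit size; the
second assumes NUL-termination and silently stops at an embedded NUL.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for Py_UNICODE (assumption, so no Python headers are needed). */
typedef uint32_t code_unit;

/* Robust variant: walks exactly `size` code units, so embedded NULs
   and buffers without a trailing NUL are both handled. */
static size_t count_non_ascii_sized(const code_unit *buf, size_t size)
{
    size_t n = 0;
    for (size_t i = 0; i < size; i++) {
        if (buf[i] > 0x7F)
            n++;
    }
    return n;
}

/* Fragile variant: relies on a terminating NUL. It stops at the first
   zero code unit, and reads out of bounds if the buffer has none. */
static size_t count_non_ascii_nul(const code_unit *buf)
{
    size_t n = 0;
    for (; *buf != 0; buf++) {
        if (*buf > 0x7F)
            n++;
    }
    return n;
}
```

For a buffer holding 'a', U+00E9, NUL, U+20AC (size 4), the sized variant sees
both non-ASCII code units, while the NUL-based variant only sees the one before
the embedded NUL.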

History
Date	User	Action	Args
2010-08-02 12:03:58	lemburg	set	recipients: + lemburg, georg.brandl, pitrou, ezio.melotti, mgiuca
2010-08-02 12:03:57	lemburg	link	issue8821 messages
2010-08-02 12:03:56	lemburg	create