Author lemburg
Recipients gvanrossum, lemburg, loewis, r.david.murray, scoder, stutzbach, vstinner, zooko
Date 2010-05-08.16:35:34
Message-id <4BE592D4.9030807@egenix.com>
In-reply-to <m2reae285401005080848r8c64cb57m2b74b3001cbf1f06@mail.gmail.com>
Content
Daniel Stutzbach wrote:
> 
> Daniel Stutzbach <daniel@stutzbachenterprises.com> added the comment:
> 
> On Sat, May 8, 2010 at 10:16 AM, Marc-Andre Lemburg
> <report@bugs.python.org> wrote:
>> Are you sure this doesn't get optimized away in practice ?
> 
> I'm sure it doesn't get optimized away by gcc 4.3, where I tested it. :)
> 
>> Sure, though, I don't see how this relates to C code relying
>> on these details, e.g. a C extension will probably use different
>> conversion code depending on whether UCS2 or UCS4 is compatible
>> with some external library, etc.
> 
> Can you give an example?
> 
> All of the examples I can think of either:
> - poke into PyUnicodeObject's internals,
> - call a Python function that exposes Py_UNICODE or PyUnicodeObject
> 
> I'm explicitly trying to protect those two cases.  It's quite possible
> that I'm missing something, but I can't think of any other unsafe way
> for a C extension to convert a Python Unicode object to a byte string.

One of the more important cases you are missing is the
argument parser in Python:

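Py_UNICODE *x;   /* raw buffer; element width depends on the UCS2/UCS4 build */
Py_ssize_t y;    /* length in Py_UNICODE units (needs PY_SSIZE_T_CLEAN) */
if (!PyArg_ParseTuple(args, "u#", &x, &y))
    return NULL;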

This uses the native Py_UNICODE type, but doesn't rely on any
Unicode APIs.

Same for the tuple builder:

args = Py_BuildValue("(u#)", x, y);
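
To illustrate why these two cases matter, here is a minimal, self-contained sketch in plain C. It does not use the Python C API at all: narrow_unit and wide_unit are hypothetical stand-ins for the two possible widths of Py_UNICODE, and the helper names are made up for this example. It only shows that the same raw buffer and the same length value mean different things depending on which width the reading code was compiled for, which is the situation an extension is in when it receives a pointer via "u#".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the two possible Py_UNICODE widths. */
typedef uint16_t narrow_unit;   /* what a UCS2 build would use */
typedef uint32_t wide_unit;     /* what a UCS4 build would use */

/* Read the first code unit the way code compiled for a narrow build would. */
static uint32_t first_unit_as_narrow(const void *buf)
{
    narrow_unit u;
    memcpy(&u, buf, sizeof u);
    return u;
}

/* Read the first code unit the way code compiled for a wide build would. */
static uint32_t first_unit_as_wide(const void *buf)
{
    wide_unit u;
    memcpy(&u, buf, sizeof u);
    return u;
}

/* Number of bytes covered by `len` code units under each assumption. */
static size_t narrow_bytes(size_t len) { return len * sizeof(narrow_unit); }
static size_t wide_bytes(size_t len)   { return len * sizeof(wide_unit); }

/* Small self-check that exercises the helpers above. */
static void demo(void)
{
    /* A buffer laid out as a wide build would hand it out. */
    wide_unit buf[2] = { 0x10041u, 0x42u };

    /* Reading with the matching width recovers the first unit. */
    assert(first_unit_as_wide(buf) == 0x10041u);

    /* Reading the same bytes with the narrow width gives a different
       value (0x0041 on little-endian, 0x0001 on big-endian hosts). */
    assert(first_unit_as_narrow(buf) != first_unit_as_wide(buf));

    /* The same length covers a different number of bytes. */
    assert(narrow_bytes(2) == 4);
    assert(wide_bytes(2) == 8);
}
```

Calling demo() from a test program runs the checks. Nothing here references a Unicode API function, yet the result still depends on the unit width chosen at compile time.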
History
Date User Action Args
2010-05-08 16:35:36 lemburg set recipients: + lemburg, gvanrossum, loewis, zooko, scoder, vstinner, stutzbach, r.david.murray
2010-05-08 16:35:34 lemburg link issue8654 messages
2010-05-08 16:35:34 lemburg create