Message123290
Alexander Belopolsky wrote:
>
> Alexander Belopolsky <belopolsky@users.sourceforge.net> added the comment:
>
> On Sat, Nov 27, 2010 at 6:38 PM, Raymond Hettinger
> <report@bugs.python.org> wrote:
> ..
>> I suggest Py_UNICODE_ADVANCE() to avoid false suggestion that the iterator protocol is being used.
>>
>
> As a data point, ICU defines U16_NEXT() for similar purpose. I also
> like ICU terminology for surrogates ("lead" and "trail") better than
> the backward "high" and "low".

"High" and "low" are Unicode standard terms, so we should use
those.

Regarding Py_UCS4_READ_CODE_POINT: you're right that surrogates
are code points, so how about Py_UCS4_READ_NEXT()?

Regarding Py_UCS4_READ_NEXT() vs. Py_UNICODE_READ_NEXT(): the return
value of the macro is a Py_UCS4 value, not a Py_UNICODE value. The
first argument of the macro can be any array, not just Py_UNICODE*,
but also Py_UCS4* or even int*.

Py_UCS2_READ_NEXT() would be plain wrong :-) Also note that Python
does have a Py_UCS4 type; it doesn't have a Py_UCS2 type.

That's why we should use *Py_UCS4*_READ_NEXT().
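
To make the naming argument concrete, here is a minimal sketch of a macro of this kind. It is not the patch under discussion: SKETCH_UCS4_READ_NEXT, sketch_ucs4 and count_code_points_u16 are hypothetical names invented for illustration, and sketch_ucs4 merely stands in for Py_UCS4. The point it shows is that the result is always a full code point value, while the pointer argument may be an array of any integer type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for Py_UCS4: wide enough for any code point. */
typedef uint32_t sketch_ucs4;

#define IS_HIGH_SURROGATE(ch) (0xD800 <= (ch) && (ch) <= 0xDBFF)
#define IS_LOW_SURROGATE(ch)  (0xDC00 <= (ch) && (ch) <= 0xDFFF)

/* Read the next code point from [ptr, end) and advance ptr past it.
 * ptr must be a modifiable pointer with ptr < end. A high surrogate
 * followed by a low surrogate is combined into one code point; anything
 * else (including a lone surrogate) is returned unchanged.
 * Note: the arguments are evaluated more than once. */
#define SKETCH_UCS4_READ_NEXT(ptr, end)                               \
    ((IS_HIGH_SURROGATE((ptr)[0]) && (ptr) + 1 < (end) &&             \
      IS_LOW_SURROGATE((ptr)[1]))                                     \
         ? ((ptr) += 2,                                               \
            (sketch_ucs4)0x10000 +                                    \
                ((((sketch_ucs4)(ptr)[-2] & 0x03FF) << 10) |          \
                 ((sketch_ucs4)(ptr)[-1] & 0x03FF)))                  \
         : (sketch_ucs4)(*(ptr)++))

/* Count code points (not code units) in a UTF-16 buffer. */
static size_t count_code_points_u16(const uint16_t *s, size_t len)
{
    assert(s != NULL || len == 0);
    const uint16_t *p = s, *end = s + len;
    size_t n = 0;
    while (p < end) {
        (void)SKETCH_UCS4_READ_NEXT(p, end);
        n++;
    }
    return n;
}
```

Because the macro only indexes and increments its first argument, the same text works for a uint16_t*, a uint32_t* or an int* buffer, which is why naming it after the return type rather than the input type seems the less misleading choice.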

Date                | User    | Action | Args
2010-12-03 20:12:53 | lemburg | set    | recipients: + lemburg, loewis, rhettinger, amaury.forgeotdarc, belopolsky, Rhamphoryncus, pitrou, vstinner, eric.smith, ezio.melotti
2010-12-03 20:12:37 | lemburg | link   | issue10542 messages
2010-12-03 20:12:37 | lemburg | create |