Author ezio.melotti
Recipients Rhamphoryncus, amaury.forgeotdarc, belopolsky, doerwalter, eric.smith, ezio.melotti, georg.brandl, lemburg, loewis, pitrou, rhettinger, stutzbach, tchrist, vstinner
Date 2011-08-17.10:08:08
> For Python 2.7 and 3.2, I would prefer to not touch a public header, 
> and so add the macros in unicodeobject.c.

Is there some reason for this?  I think it's better if we have them in the same place rather than renaming and moving them to another file between 3.2 and 3.3.

> If you want to make my HIGH_SURROGATE and LOW_SURROGATE macros 
> public, they will have to subtract 0x10000 themselves (whereas my 
> macros require the ordinal to be preprocessed).

If they turn out to be useful and we find a clearer name we can even make them public in 3.3, but we'll have to see about that.
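
For reference, a minimal sketch of what such macros could look like if they did the 0x10000 subtraction themselves. The EXAMPLE_* names are hypothetical and are not the macros from the patch; the sketch assumes the argument is a code point in U+10000..U+10FFFF.

```c
#include <stdint.h>

/* Hypothetical names, for illustration only. Both macros expect a code
   point in U+10000..U+10FFFF and subtract 0x10000 themselves. */

/* High (lead) surrogate: 0xD800 plus the top 10 bits of the offset. */
#define EXAMPLE_HIGH_SURROGATE(ch) \
    ((uint16_t)(0xD800 + ((((uint32_t)(ch)) - 0x10000) >> 10)))

/* Low (trail) surrogate: 0xDC00 plus the bottom 10 bits of the offset. */
#define EXAMPLE_LOW_SURROGATE(ch) \
    ((uint16_t)(0xDC00 + ((((uint32_t)(ch)) - 0x10000) & 0x3FF)))
```

With these definitions, U+1F600 maps to the pair 0xD83D, 0xDE00.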

> Note: I don't think that _Py_UNICODE*NEXT should go into
> Python 2.7 or 3.2.

If they don't, it won't be possible to fix #9200 in those branches (unless we decide that the bug shouldn't be fixed there, but I would rather fix it).
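
To illustrate why a "next" helper matters on narrow builds, here is a rough sketch of reading one code point at a time from a UTF-16 buffer. The function name and signature are assumptions made for this example and do not reproduce the macros in the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: return the code point starting at *pp and advance
   *pp past it. The caller must ensure *pp < end. A high surrogate that is
   not followed by a low surrogate is returned unchanged. */
static uint32_t
example_next_code_point(const uint16_t **pp, const uint16_t *end)
{
    const uint16_t *p = *pp;
    uint32_t ch = *p++;

    if (ch >= 0xD800 && ch <= 0xDBFF && p < end
        && *p >= 0xDC00 && *p <= 0xDFFF) {
        /* Valid surrogate pair: combine both halves into one code point. */
        ch = 0x10000 + (((ch & 0x3FF) << 10) | ((uint32_t)*p & 0x3FF));
        p++;
    }
    *pp = p;
    return ch;
}
```

A caller can then loop while the pointer is below `end` and apply a per-character test to each returned code point instead of to each 16-bit unit.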

> If you want to make it public, it's better to call it 
> PyUNICODE_IS_BMP() (check if the argument is in U+0000-U+FFFF).

Yes, public APIs will follow the naming conventions.  Not sure if it's better to check if it's a BMP char, or if it's not.
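
Either polarity is a one-line range check. A small sketch, using a hypothetical EXAMPLE_IS_BMP name and a hypothetical helper that shows how an encoder might use it:

```c
#include <stdint.h>

/* Hypothetical name, for illustration only.
   True if ch is in the Basic Multilingual Plane (U+0000..U+FFFF). */
#define EXAMPLE_IS_BMP(ch) (((uint32_t)(ch)) <= 0xFFFF)

/* Number of UTF-16 code units needed to store the code point ch:
   one for a BMP character, two (a surrogate pair) otherwise. */
static int
example_utf16_units(uint32_t ch)
{
    return EXAMPLE_IS_BMP(ch) ? 1 : 2;
}
```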

> They are still useful for UTF-16 encoders (to UTF-16-LE/BE and 16-bit 
> wchar_t*). We can keep HIGH_SURROGATE and LOW_SURROGATE private in 
> unicodeobject.c.

What is the naming convention for private macros in the same .c file where they are used?  Shouldn't they get at least a trailing _?

> Unless someone disagrees I'll prepare a patch with PyUNICODE_IS_{HIGH_|LOW_|}SURROGATE and Py_UNICODE_JOIN_SURROGATES for unicodeobject.h, use them where necessary, base it on Victor's implementation, and commit it (after a review).

> Cool. I suppose that you mean PyUNICODE_JOIN_SURROGATES (not 

All the other macros use PyUNICODE_*.

> I used the verb "combine", taken from a comment in unicodeobject.c. 
> "combine" is maybe better than "join"?

I like join, it's clear enough and shorter.
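
A sketch of the surrogate predicates and the "join" operation discussed above. The EXAMPLE_* names are placeholders and the bodies are an assumption about the general shape, not the contents of the patch.

```c
#include <stdint.h>

/* Hypothetical names, for illustration only. */

/* Any surrogate code unit: U+D800..U+DFFF. */
#define EXAMPLE_IS_SURROGATE(ch) \
    (0xD800 <= (ch) && (ch) <= 0xDFFF)
/* High (lead) surrogate: U+D800..U+DBFF. */
#define EXAMPLE_IS_HIGH_SURROGATE(ch) \
    (0xD800 <= (ch) && (ch) <= 0xDBFF)
/* Low (trail) surrogate: U+DC00..U+DFFF. */
#define EXAMPLE_IS_LOW_SURROGATE(ch) \
    (0xDC00 <= (ch) && (ch) <= 0xDFFF)

/* Join a high and a low surrogate into one code point. The caller is
   expected to have checked both arguments with the predicates above. */
#define EXAMPLE_JOIN_SURROGATES(high, low) \
    ((uint32_t)0x10000 + \
     (((((uint32_t)(high)) & 0x3FF) << 10) | (((uint32_t)(low)) & 0x3FF)))
```

For example, joining 0xD83D and 0xDE00 gives U+1F600.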

Date                 User          Action  Args
2011-08-17 10:08:09  ezio.melotti  set     recipients: + ezio.melotti, lemburg, loewis, doerwalter, georg.brandl, rhettinger, amaury.forgeotdarc, belopolsky, Rhamphoryncus, pitrou, vstinner, eric.smith, stutzbach, tchrist
2011-08-17 10:08:09  ezio.melotti  set     messageid: <>
2011-08-17 10:08:09  ezio.melotti  link    issue10542 messages
2011-08-17 10:08:08  ezio.melotti  create