
Author ezio.melotti
Recipients Rhamphoryncus, amaury.forgeotdarc, belopolsky, doerwalter, eric.smith, ezio.melotti, georg.brandl, lemburg, loewis, pitrou, rhettinger, stutzbach, tchrist, vstinner
Date 2011-08-17.10:08:08
Message-id <1313575689.69.0.748445535815.issue10542@psf.upfronthosting.co.za>
Content
> For Python 2.7 and 3.2, I would prefer to not touch a public header, 
> and so add the macros in unicodeobject.c.

Is there some reason for this?  I think it's better to have them in the same place in every branch, rather than renaming them and moving them to another file between 3.2 and 3.3.

> If you want to make my HIGH_SURROGATE and LOW_SURROGATE macros 
> public, they will have to subtract 0x10000 themselves (whereas my 
> macros require the ordinal to be preprocessed).

If they turn out to be useful and we find a clearer name we can even make them public in 3.3, but we'll have to see about that.
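
For illustration, here is a minimal sketch of what self-contained versions of such macros could look like, with the 0x10000 subtraction done inside the macro so that callers pass the raw ordinal. The `SKETCH_` names are hypothetical and are not the macros from the patch; the arithmetic is the standard UTF-16 split of a code point in U+10000..U+10FFFF.

```c
#include <stdint.h>

/* Hypothetical names, for illustration only. Both macros assume that
   ch is a code point in the range U+10000..U+10FFFF. */

/* High (lead) surrogate: top 10 bits of (ch - 0x10000), offset by 0xD800. */
#define SKETCH_HIGH_SURROGATE(ch) \
    ((uint16_t)(0xD800u + ((((uint32_t)(ch)) - 0x10000u) >> 10)))

/* Low (trail) surrogate: bottom 10 bits of (ch - 0x10000), offset by 0xDC00. */
#define SKETCH_LOW_SURROGATE(ch) \
    ((uint16_t)(0xDC00u + ((((uint32_t)(ch)) - 0x10000u) & 0x3FFu)))
```

With these definitions, U+10400 splits into 0xD801 and 0xDC00. A variant that expects the caller to subtract 0x10000 first would drop the subtraction from both macros.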

> Note: I don't think that _Py_UNICODE*NEXT should go into
> Python 2.7 or 3.2.

If they don't, it won't be possible to fix #9200 in those branches (unless we decide that the bug shouldn't be fixed there, but I would rather fix it).
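
As a rough sketch of the kind of "next" helper being discussed, assuming its job is to read one code point from a buffer of 16-bit units and to join a surrogate pair when one is found: the function below is hypothetical (name, signature and behaviour on lone surrogates are all assumptions) and is not the macro from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, for illustration only.
   Reads one code point from buf starting at *pos and advances *pos.
   The caller must guarantee *pos < end. A lone surrogate is returned
   unchanged rather than treated as an error. */
static uint32_t
sketch_next_code_point(const uint16_t *buf, size_t *pos, size_t end)
{
    uint32_t ch = buf[(*pos)++];

    /* A high surrogate followed by a low surrogate forms one code point. */
    if (ch >= 0xD800u && ch <= 0xDBFFu && *pos < end) {
        uint32_t low = buf[*pos];
        if (low >= 0xDC00u && low <= 0xDFFFu) {
            (*pos)++;
            ch = 0x10000u + ((ch - 0xD800u) << 10) + (low - 0xDC00u);
        }
    }
    return ch;
}
```

A loop that calls this until `*pos` reaches `end` visits code points instead of 16-bit units, which is what per-character checks need when the text contains non-BMP characters.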

> If you want to make it public, it's better to call it 
> PyUNICODE_IS_BMP() (check if the argument is in U+0000-U+FFFF).

Yes, public APIs will follow the naming conventions.  Not sure if it's better to check if it's a BMP char, or if it's not.
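
To make the two options concrete, here is a small sketch with one macro per polarity and a helper that uses the check the way a UTF-16 encoder might. All names are hypothetical; only the U+0000-U+FFFF range comes from the quoted message.

```c
#include <stdint.h>

/* Hypothetical names, for illustration only. */

/* True if ch is in the Basic Multilingual Plane (U+0000-U+FFFF). */
#define SKETCH_IS_BMP(ch)     (((uint32_t)(ch)) <= 0xFFFFu)

/* The opposite polarity: true if ch needs a surrogate pair in UTF-16. */
#define SKETCH_IS_NON_BMP(ch) (((uint32_t)(ch)) > 0xFFFFu)

/* Number of 16-bit units needed to encode ch in UTF-16. */
static int
sketch_utf16_length(uint32_t ch)
{
    return SKETCH_IS_BMP(ch) ? 1 : 2;
}
```

Which polarity reads better depends on the call sites: an encoder mostly asks "does this need two units?", while a fast path usually asks "is this a plain BMP character?".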

> They are still useful for UTF-16 encoders (to UTF-16-LE/BE and 16-bit 
> wchar_t*). We can keep HIGH_SURROGATE and LOW_SURROGATE private in 
> unicodeobject.c.

What is the naming convention for private macros that live in the same .c file where they are used?  Shouldn't they get at least a trailing _?

> Unless someone disagrees I'll prepare a patch that adds PyUNICODE_IS_{HIGH_|LOW_|}SURROGATE and Py_UNICODE_JOIN_SURROGATES to unicodeobject.h, based on Victor's implementation, use them where necessary, and commit it (after a review).

> Cool. I suppose that you mean PyUNICODE_JOIN_SURROGATES (not 
> Py_UNICODE_JOIN_SURROGATES).

All the other macros use Py_UNICODE_*.

> I used the verb "combine", taken from a  comment in unicodeobject.c. 
> "combine" is maybe better than "join"?

I like join, it's clear enough and shorter.
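
For reference, a minimal sketch of the surrogate predicates and the join operation discussed in this message. The `SKETCH_` names are placeholders, not the final macro names, and the definitions are the standard UTF-16 ranges and arithmetic rather than the code from the patch.

```c
#include <stdint.h>

/* Hypothetical names, for illustration only.
   Note that each macro evaluates its argument more than once. */

/* High (lead) surrogates occupy U+D800-U+DBFF. */
#define SKETCH_IS_HIGH_SURROGATE(ch) \
    (0xD800u <= ((uint32_t)(ch)) && ((uint32_t)(ch)) <= 0xDBFFu)

/* Low (trail) surrogates occupy U+DC00-U+DFFF. */
#define SKETCH_IS_LOW_SURROGATE(ch) \
    (0xDC00u <= ((uint32_t)(ch)) && ((uint32_t)(ch)) <= 0xDFFFu)

/* Any surrogate, high or low. */
#define SKETCH_IS_SURROGATE(ch) \
    (0xD800u <= ((uint32_t)(ch)) && ((uint32_t)(ch)) <= 0xDFFFu)

/* Join a valid high/low pair into a code point in U+10000..U+10FFFF.
   The caller is expected to have checked both halves first. */
#define SKETCH_JOIN_SURROGATES(high, low) \
    (0x10000u + ((((uint32_t)(high)) & 0x03FFu) << 10) \
              + (((uint32_t)(low)) & 0x03FFu))
```

Joining 0xD83D and 0xDE00 gives U+1F600, the inverse of splitting that code point into its two surrogates.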