Author ezio.melotti
Recipients Rhamphoryncus, amaury.forgeotdarc, belopolsky, doerwalter, eric.smith, ezio.melotti, georg.brandl, lemburg, loewis, pitrou, rhettinger, stutzbach, tchrist, vstinner
Date 2011-08-17.05:04:06
SpamBayes Score 4.09575e-10
Marked as misclassified No
Message-id <1313557447.24.0.77234115752.issue10542@psf.upfronthosting.co.za>
In-reply-to
Content
As I said in msg142175 I think the Py_UNICODE_IS{HIGH|LOW|}SURROGATE and Py_UNICODE_JOIN_SURROGATES can be committed without trailing _ in 3.3 and with trailing _ in 2.7/3.2.  They should go in unicodeobject.h and be public in 3.3+.

Regarding the name, it would be fine with me to use PyUNICODE_IS_HIGH_SURROGATE.  Other IS* macros don't use spaces, but JOIN_SURROGATES and other proposed macros (e.g. PUT_NEXT/WRITE_NEXT) do.  Also these macros are not related to any existing API like e.g. isalpha.  I think HIGH/LOW are fine, we can mention lead/trail in the doc.

Regarding the implementation, we could use Victor's one if it's faster and it has no other side effects.

Regarding the other macros:
 * _Py_UNICODE_NEXT and _Py_UNICODE_PUT_NEXT are useful, so once we have agreed about the name they can go in.  They can be private in all the 3 branches and made public in 3.4 if they work well;
 * IS_NONBMP doesn't simplify much the code but makes it more readable.  ICU has U_IS_BMP, but in most of the cases we want to check for non-BMP, so if we add this macro it might be ok to check for non-BMP;
 * I'm not sure HIGH_SURROGATE/LOW_SURROGATE are useful with _Py_UNICODE_NEXT.  If they are they should get a better name because the current one is not clear about what they do.


Unless someone disagrees I'll prepare a patch with PyUNICODE_IS_{HIGH_|LOW_|}SURROGATE and Py_UNICODE_JOIN_SURROGATES for unicodeobject.h, using them where necessary, using with Victor implementation and commit it (after a review).

We can think about the rest later.
History
Date User Action Args
2011-08-17 05:04:07ezio.melottisetrecipients: + ezio.melotti, lemburg, loewis, doerwalter, georg.brandl, rhettinger, amaury.forgeotdarc, belopolsky, Rhamphoryncus, pitrou, vstinner, eric.smith, stutzbach, tchrist
2011-08-17 05:04:07ezio.melottisetmessageid: <1313557447.24.0.77234115752.issue10542@psf.upfronthosting.co.za>
2011-08-17 05:04:06ezio.melottilinkissue10542 messages
2011-08-17 05:04:06ezio.melotticreate