
Author ezio.melotti
Recipients Rhamphoryncus, amaury.forgeotdarc, belopolsky, doerwalter, eric.smith, ezio.melotti, georg.brandl, lemburg, loewis, pitrou, rhettinger, stutzbach, tchrist, vstinner
Date 2011-08-17.05:04:06
Message-id <1313557447.24.0.77234115752.issue10542@psf.upfronthosting.co.za>
Content
As I said in msg142175, I think the Py_UNICODE_IS{HIGH|LOW|}SURROGATE and Py_UNICODE_JOIN_SURROGATES macros can be committed without a leading _ in 3.3 and with a leading _ in 2.7/3.2.  They should go in unicodeobject.h and be public in 3.3+.

Regarding the name, it would be fine with me to use Py_UNICODE_IS_HIGH_SURROGATE.  Other IS* macros don't use underscores between words, but JOIN_SURROGATES and other proposed macros (e.g. PUT_NEXT/WRITE_NEXT) do.  Also, these macros are not related to any existing API such as isalpha.  I think HIGH/LOW are fine; we can mention lead/trail in the doc.

Regarding the implementation, we could use Victor's if it's faster and has no side effects.
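
To make the discussion above concrete, here is a minimal sketch of what the surrogate macros might look like.  The EX_ prefix and the ex_join_checked helper are hypothetical names used only for illustration; this is a plain range-check version, not Victor's implementation nor necessarily what will be committed to unicodeobject.h.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names (EX_ prefix); a sketch, not the committed macros. */

/* True for any UTF-16 surrogate code unit, U+D800..U+DFFF. */
#define EX_IS_SURROGATE(ch)      (0xD800 <= (ch) && (ch) <= 0xDFFF)
/* High (lead) surrogate, U+D800..U+DBFF. */
#define EX_IS_HIGH_SURROGATE(ch) (0xD800 <= (ch) && (ch) <= 0xDBFF)
/* Low (trail) surrogate, U+DC00..U+DFFF. */
#define EX_IS_LOW_SURROGATE(ch)  (0xDC00 <= (ch) && (ch) <= 0xDFFF)

/* Combine a high/low pair into one code point.  The macro does not
   validate its arguments; callers are expected to check them first. */
#define EX_JOIN_SURROGATES(high, low)                        \
    (((((uint32_t)(high) & 0x03FF) << 10) |                  \
      ((uint32_t)(low) & 0x03FF)) + 0x10000)

/* Checked wrapper, showing the validation the macro leaves to callers. */
static uint32_t ex_join_checked(uint32_t high, uint32_t low)
{
    assert(EX_IS_HIGH_SURROGATE(high));
    assert(EX_IS_LOW_SURROGATE(low));
    return EX_JOIN_SURROGATES(high, low);
}
```

For example, joining the pair 0xD83D, 0xDE00 with this sketch yields the code point 0x1F600.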

Regarding the other macros:
 * _Py_UNICODE_NEXT and _Py_UNICODE_PUT_NEXT are useful, so once we have agreed on the names they can go in.  They can be private in all three branches and made public in 3.4 if they work well;
 * IS_NONBMP doesn't simplify the code much, but it makes it more readable.  ICU has U_IS_BMP, but in most cases we want to check for non-BMP, so if we add this macro it might be OK for it to check for non-BMP;
 * I'm not sure HIGH_SURROGATE/LOW_SURROGATE are still useful once we have _Py_UNICODE_NEXT.  If they are, they should get a better name, because the current one is not clear about what they do.
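
As an illustration of the helpers listed above, the following is a rough sketch of how NEXT, PUT_NEXT and IS_NONBMP could behave on a UTF-16 buffer.  All names here (ex_next, ex_put_next, the EX_ macros) and the function-based signatures are assumptions made for this example; the proposed macros in the patch may take different arguments and handle lone surrogates differently.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; a sketch of the idea, not the proposed patch. */
#define EX_IS_NONBMP(ch)         ((uint32_t)(ch) > 0xFFFF)
#define EX_IS_HIGH_SURROGATE(ch) (0xD800 <= (ch) && (ch) <= 0xDBFF)
#define EX_IS_LOW_SURROGATE(ch)  (0xDC00 <= (ch) && (ch) <= 0xDFFF)

/* Read one code point from [*pos, end) and advance *pos past it.
   A valid surrogate pair is joined; a lone surrogate is returned as-is. */
static uint32_t ex_next(const uint16_t **pos, const uint16_t *end)
{
    uint32_t ch;

    assert(*pos < end);
    ch = *(*pos)++;
    if (EX_IS_HIGH_SURROGATE(ch) && *pos < end
        && EX_IS_LOW_SURROGATE(**pos)) {
        ch = (((ch & 0x03FF) << 10) | (*(*pos)++ & 0x03FF)) + 0x10000;
    }
    return ch;
}

/* Write one code point at *pos and advance *pos.  Non-BMP code points
   are split into a high/low surrogate pair (two code units).  The
   caller must guarantee room for up to two code units. */
static void ex_put_next(uint16_t **pos, uint32_t ch)
{
    if (EX_IS_NONBMP(ch)) {
        ch -= 0x10000;
        *(*pos)++ = (uint16_t)(0xD800 | (ch >> 10));     /* high surrogate */
        *(*pos)++ = (uint16_t)(0xDC00 | (ch & 0x03FF));  /* low surrogate */
    }
    else {
        *(*pos)++ = (uint16_t)ch;
    }
}
```

In this sketch the surrogate halves are computed inline in ex_put_next, which is the job HIGH_SURROGATE/LOW_SURROGATE would otherwise do.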


Unless someone disagrees, I'll prepare a patch that adds Py_UNICODE_IS_{HIGH_|LOW_|}SURROGATE and Py_UNICODE_JOIN_SURROGATES to unicodeobject.h with Victor's implementation, uses them where necessary, and commit it (after a review).

We can think about the rest later.