Message 142188 - Python tracker

Author	lemburg
Recipients	Rhamphoryncus, amaury.forgeotdarc, belopolsky, doerwalter, eric.smith, ezio.melotti, georg.brandl, lemburg, loewis, pitrou, rhettinger, stutzbach, tchrist, vstinner
Date	2011-08-16.12:11:21
Message-id	<4E4A5E64.3000909@egenix.com>
In-reply-to	<26743.1313492664@chthon>

Content

Tom Christiansen wrote:
> So keeping your preamble bits, I might have considered doing it
> this way if it were me doing it:
> 
>     #define _Py_UNICODE_IS_SURROGATE
>     #define _Py_UNICODE_IS_LEAD_SURROGATE
>     #define _Py_UNICODE_IS_TRAIL_SURROGATE
>     #define _Py_UNICODE_JOIN_SURROGATES
> 
> But I also come from a culture that uses more underscores than you guys tend 
> to, as shown in some of the macro names shown below from utf8.h file.  I find
> that most projects use more underscores in uppercase names than Python does. :)

The reasoning behind e.g. "ISSURROGATE" is that those names originate
from and are consistent with the already existing ISLOWER/ISUPPER/ISTITLE
macros, which in turn stem from the C APIs of the same names
(see unicodeobject.h for reference).

Regarding low/high vs. lead/trail: The Unicode database uses the
terms low/high and we do in Python as well, so let's stick with
those.

What I don't understand is why those macros should be declared
private to Python (with the leading underscore). They are quite
useful for extensions implementing codecs or other transformations
as well.
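
To illustrate the kind of extension code that could use such macros, here is
a minimal sketch of a UTF-16 decoding loop. The MY_UNICODE_* names and the
my_decode_utf16() helper are hypothetical and chosen only for this example
(they follow the ISLOWER/ISUPPER style and the high/low terminology); they
are not taken from unicodeobject.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical macro names, for illustration only. */
#define MY_UNICODE_ISSURROGATE(ch)     (0xD800 <= (ch) && (ch) <= 0xDFFF)
#define MY_UNICODE_ISHIGHSURROGATE(ch) (0xD800 <= (ch) && (ch) <= 0xDBFF)
#define MY_UNICODE_ISLOWSURROGATE(ch)  (0xDC00 <= (ch) && (ch) <= 0xDFFF)

/* Combine a high/low pair into a code point in U+10000..U+10FFFF. */
#define MY_UNICODE_JOINSURROGATES(hi, lo) \
    (0x10000 + ((((uint32_t)(hi) & 0x03FF) << 10) | ((uint32_t)(lo) & 0x03FF)))

/* Decode n UTF-16 code units into code points. The caller must provide
   room for n entries in out. Unpaired surrogates are copied through
   unchanged; a real codec would apply its error handler here instead. */
static size_t
my_decode_utf16(const uint16_t *in, size_t n, uint32_t *out)
{
    size_t i = 0, count = 0;

    assert(in != NULL && out != NULL);
    while (i < n) {
        uint16_t ch = in[i++];

        if (MY_UNICODE_ISHIGHSURROGATE(ch) && i < n
            && MY_UNICODE_ISLOWSURROGATE(in[i])) {
            /* Valid pair: consume the low surrogate as well. */
            out[count++] = MY_UNICODE_JOINSURROGATES(ch, in[i]);
            i++;
        }
        else {
            out[count++] = ch;
        }
    }
    return count;
}
```

With macros like these available publicly, an extension would not have to
repeat the range checks and the bit arithmetic itself.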

BTW: I think the other issues mentioned in the discussion are more
important to get right than the names of those macros.

History
Date	User	Action	Args
2011-08-16 12:11:22	lemburg	set	recipients: + lemburg, loewis, doerwalter, georg.brandl, rhettinger, amaury.forgeotdarc, belopolsky, Rhamphoryncus, pitrou, vstinner, eric.smith, stutzbach, ezio.melotti, tchrist
2011-08-16 12:11:21	lemburg	link	issue10542 messages
2011-08-16 12:11:21	lemburg	create