This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author vstinner
Recipients Rhamphoryncus, amaury.forgeotdarc, belopolsky, doerwalter, eric.smith, ezio.melotti, georg.brandl, lemburg, loewis, pitrou, rhettinger, stutzbach, tchrist, vstinner
Date 2011-08-16.20:48:38
SpamBayes Score 1.5983698e-10
Marked as misclassified No
Message-id <1313527719.96.0.686337799751.issue10542@psf.upfronthosting.co.za>
In-reply-to
Content
I'm reposting my patch from #12751. I think that it's simpler than belopolsky's patch: it doesn't add public macros in unicodeobject.h and don't add the complex Py_UNICODE_NEXT() macro. My patch only adds private macros in unicodeobject.c to factorize the code.

I don't want to add public macros because with the stable API and with the PEP 393, we are trying to hide the Py_UNICODE type and PyUnicodeObject internals. In belopolsky's patch, only Py_UNICODE_NEXT() is used outside unicodeobject.c.

Copy/paste of the initial message of my issue #12751 (msg142108):
---------------
A lot of code is duplicated in unicodeobject.c to manipulate ("encode/decode") surrogates. Each function has from one to three different implementations. The new decode_ucs4() function adds a new implementation. Attached patch replaces this code by macros.

I think that only the implementations of IS_HIGH_SURROGATE and IS_LOW_SURROGATE are important for speed. ((ch & 0xFFFFFC00UL) == 0xD800) (from decode_ucs4) is *a little bit* faster than (0xD800 <= ch && ch <= 0xDBFF) on my CPU (Atom Z520 @ 1.3 GHz): running test_unicode 4 times takes ~54 sec instead of ~57 sec (-3%).

These 3 macros have to be checked, I wrote the first one:

#define IS_SURROGATE(ch) (((ch) & 0xFFFFF800UL) == 0xD800)
#define IS_HIGH_SURROGATE(ch) (((ch) & 0xFFFFFC00UL) == 0xD800)
#define IS_LOW_SURROGATE(ch) (((ch) & 0xFFFFFC00UL) == 0xDC00)

I added cast to Py_UCS4 in COMBINE_SURROGATES to avoid integer overflow if Py_UNICODE is 16 bits (narrow build). It's maybe useless.

#define COMBINE_SURROGATES(ch1, ch2) \
 (((((Py_UCS4)(ch1) & 0x3FF) << 10) | ((Py_UCS4)(ch2) & 0x3FF)) + 0x10000)

HIGH_SURROGATE and LOW_SURROGATE require that their ordinal argument has been preproceed to fit in [0; 0xFFFF]. I added this requirement in the comment of these macros. It would be better to have only one macro to do the two operations, but because "*p++" (dereference and increment) is usually used, I prefer to avoid one unique macro (I don't like passing *p++ in a macro using its argument more than once).

Or we may add a third macro using HIGH_SURROGATE and LOW_SURROGATE.

I rewrote the main loop of PyUnicode_EncodeUTF16() to avoid an useless test on ch2 on narrow build.

I also added a IS_NONBMP macro just because I prefer macro over hardcoded constants.
---------------
History
Date User Action Args
2011-08-16 20:48:40vstinnersetrecipients: + vstinner, lemburg, loewis, doerwalter, georg.brandl, rhettinger, amaury.forgeotdarc, belopolsky, Rhamphoryncus, pitrou, eric.smith, stutzbach, ezio.melotti, tchrist
2011-08-16 20:48:39vstinnersetmessageid: <1313527719.96.0.686337799751.issue10542@psf.upfronthosting.co.za>
2011-08-16 20:48:39vstinnerlinkissue10542 messages
2011-08-16 20:48:39vstinnercreate