Message 122595 - Python tracker


Author	belopolsky
Recipients	Rhamphoryncus, amaury.forgeotdarc, belopolsky, eric.smith, ezio.melotti, lemburg, loewis, pitrou, rhettinger, vstinner
Date	2010-11-28.01:39:32
Message-id	<1290908373.84.0.554977255446.issue10542@psf.upfronthosting.co.za>
In-reply-to

Content
I am attaching a patch that defines a Py_UNICODE_PUT_NEXT() macro (tentative name) and uses it to fix the str.upper method. The implementation of surrogate-aware str.upper shows that the NEXT/PUT_NEXT abstractions may lead to somewhat inefficient code for "by code point" processing. The issue is that the process of reading a code point already determines whether it is BMP or non-BMP, so testing the result again in order to write it is somewhat wasteful. I don't think this would matter in practice, but I would like to hear alternative opinions before moving further. (Please don't argue over names - let's figure out the proper semantics first.)
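
To make the redundancy concrete, here is a minimal standalone sketch of the read/write pair on UTF-16 code units. It is not the attached patch: the names sketch_next, sketch_put_next, toy_upper and sketch_upper are hypothetical, plain functions stand in for the macros, and toy_upper is a toy mapping (ASCII and Deseret letters only) used in place of the real Unicode case tables. It also assumes the mapping never moves a code point between BMP and non-BMP, so the output needs no more code units than the input.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a NEXT-style macro: read one code point from
   UTF-16 code units at *p (bounded by end) and advance *p past it. */
static uint32_t sketch_next(const uint16_t **p, const uint16_t *end)
{
    uint32_t ch = *(*p)++;
    /* First BMP/non-BMP test: a high surrogate followed by a low one. */
    if (ch >= 0xD800 && ch <= 0xDBFF && *p < end
        && **p >= 0xDC00 && **p <= 0xDFFF) {
        ch = 0x10000 + ((ch - 0xD800) << 10) + (*(*p)++ - 0xDC00);
    }
    return ch;
}

/* Hypothetical stand-in for a PUT_NEXT-style macro: write one code point
   at *p as one or two UTF-16 code units and advance *p past it. */
static void sketch_put_next(uint16_t **p, uint32_t ch)
{
    assert(ch <= 0x10FFFF);
    /* Second BMP/non-BMP test: repeats what the reader already decided. */
    if (ch < 0x10000) {
        *(*p)++ = (uint16_t)ch;
    } else {
        ch -= 0x10000;
        *(*p)++ = (uint16_t)(0xD800 | (ch >> 10));
        *(*p)++ = (uint16_t)(0xDC00 | (ch & 0x3FF));
    }
}

/* Toy case mapping (assumption): ASCII a-z and Deseret small letters
   U+10428..U+1044F only; everything else is returned unchanged. */
static uint32_t toy_upper(uint32_t ch)
{
    if (ch >= 'a' && ch <= 'z')
        return ch - ('a' - 'A');
    if (ch >= 0x10428 && ch <= 0x1044F)
        return ch - 0x28;
    return ch;
}

/* "By code point" loop: dst must have room for len code units.
   Returns the number of code units written. */
static size_t sketch_upper(const uint16_t *src, size_t len, uint16_t *dst)
{
    const uint16_t *s = src, *end = src + len;
    uint16_t *d = dst;
    while (s < end)
        sketch_put_next(&d, toy_upper(sketch_next(&s, end)));
    return (size_t)(d - dst);
}
```

Each code point goes through two range checks, one in sketch_next and one in sketch_put_next. A hand-written loop could branch once and handle the BMP and non-BMP cases separately, at the cost of losing the abstraction.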

