
Author belopolsky
Recipients Rhamphoryncus, amaury.forgeotdarc, belopolsky, eric.smith, ezio.melotti, lemburg, loewis, pitrou, rhettinger, vstinner
Date 2010-11-28.01:39:32
Message-id <1290908373.84.0.554977255446.issue10542@psf.upfronthosting.co.za>
In-reply-to
Content
I am attaching a patch that defines a Py_UNICODE_PUT_NEXT() macro (tentative name) and uses it to fix the str.upper method.  The implementation of a surrogate-aware str.upper shows that the NEXT/PUT_NEXT abstractions may lead to somewhat inefficient code for "by code point" processing.  The issue is that reading a code point already determines whether it is BMP or non-BMP, so testing the result again in order to write it is somewhat wasteful.  I don't think this would matter in practice, but I would like to hear alternative opinions before moving further. (Please, don't argue over names - let's figure out the proper semantics first.)
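
To make the trade-off concrete, here is a minimal sketch of the read/transform/write loop. It is not the attached patch: the sketch_* functions are hypothetical stand-ins for the NEXT/PUT_NEXT macros, unit_t assumes a 16-bit Py_UNICODE (narrow build), and sketch_toupper is a toy mapping that covers only ASCII and the Deseret letters instead of the real Unicode database. With this toy mapping the output needs exactly as many code units as the input.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint16_t unit_t;  /* assumption: stand-in for a 16-bit Py_UNICODE */

/* Read one code point starting at *p (which must be < end) and advance
   past it.  A lone surrogate is returned unchanged. */
static uint32_t sketch_next(const unit_t **p, const unit_t *end)
{
    uint32_t hi = *(*p)++;
    if (hi >= 0xD800 && hi <= 0xDBFF && *p < end) {
        uint32_t lo = **p;
        if (lo >= 0xDC00 && lo <= 0xDFFF) {
            (*p)++;
            return 0x10000 + ((hi - 0xD800) << 10) + (lo - 0xDC00);
        }
    }
    return hi;
}

/* Write one code point at *q and advance.  The caller must guarantee
   room for two code units. */
static void sketch_put_next(unit_t **q, uint32_t cp)
{
    assert(cp <= 0x10FFFF);
    if (cp >= 0x10000) {  /* the BMP/non-BMP test is repeated here */
        cp -= 0x10000;
        *(*q)++ = (unit_t)(0xD800 + (cp >> 10));
        *(*q)++ = (unit_t)(0xDC00 + (cp & 0x3FF));
    }
    else {
        *(*q)++ = (unit_t)cp;
    }
}

/* Toy upper-casing: ASCII a-z, and Deseret small letters
   U+10428..U+1044F mapped to capitals U+10400..U+10427. */
static uint32_t sketch_toupper(uint32_t cp)
{
    if (cp >= 'a' && cp <= 'z')
        return cp - 32;
    if (cp >= 0x10428 && cp <= 0x1044F)
        return cp - 0x28;
    return cp;
}

/* Upper-case n code units from src into dst; returns units written. */
static size_t sketch_upper(const unit_t *src, size_t n, unit_t *dst)
{
    const unit_t *p = src, *end = src + n;
    unit_t *q = dst;
    while (p < end)
        sketch_put_next(&q, sketch_toupper(sketch_next(&p, end)));
    return (size_t)(q - dst);
}
```

sketch_next() has already classified the code point as BMP or non-BMP by the time it returns, yet sketch_put_next() has to test cp >= 0x10000 again. That repeated test is the redundancy in question; a fused loop could avoid it, but only by giving up the separate read and write abstractions.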
History
Date User Action Args
2010-11-28 01:39:33 belopolsky set recipients: + belopolsky, lemburg, loewis, rhettinger, amaury.forgeotdarc, Rhamphoryncus, pitrou, vstinner, eric.smith, ezio.melotti
2010-11-28 01:39:33 belopolsky set messageid: <1290908373.84.0.554977255446.issue10542@psf.upfronthosting.co.za>
2010-11-28 01:39:32 belopolsky link issue10542 messages
2010-11-28 01:39:32 belopolsky create