Author belopolsky
Recipients Rhamphoryncus, amaury.forgeotdarc, belopolsky, doerwalter, eric.smith, ezio.melotti, georg.brandl, lemburg, loewis, pitrou, rhettinger, stutzbach, vstinner
Date 2010-12-29.19:26:15
Message-id <AANLkTikiPRx62szpf3H6VnQ3ZqA++3fv9CH9h2HKVZu+@mail.gmail.com>
In-reply-to <1293640563.6.0.163107906681.issue10542@psf.upfronthosting.co.za>
Content
On Wed, Dec 29, 2010 at 11:36 AM, Georg Brandl <report@bugs.python.org> wrote:
..
> That bug already strikes me as quite exotic.
>
Would it look as exotic if presented like this?

  File "<stdin>", line 1
    𐌀 = 5
       ^
SyntaxError: invalid character in identifier
(works on a wide build)
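
For comparison, here is a minimal sketch of the same assignment, assuming a current CPython (3.3 or later), where the narrow/wide distinction no longer exists. The variable names are illustrative only and are not taken from the issue.

```python
name = "\U00010300"  # OLD ITALIC LETTER A, the character in the traceback above

# str.isidentifier() applies the language's identifier rules to a whole string.
is_valid_name = name.isidentifier()

# Compile and run the same assignment that the traceback shows failing.
namespace = {}
exec(compile(name + " = 5", "<sketch>", "exec"), namespace)
value = namespace[name]
```

On such an interpreter the assignment compiles and binds the name, which is the behaviour the message attributes to a wide build.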

Note that, with few exceptions, pretty much anything you can do with
supplementary characters will produce different results in wide and
narrow builds.  This includes all character type methods (isalpha,
isdigit, etc.), transformations such as case folding or normalization,
text formatting, and so on.
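
A rough way to see the difference on a current interpreter is to emulate what a narrow build stores. The sketch below assumes that a narrow build holds a supplementary character as two UTF-16 code units (a surrogate pair); utf16_code_units is a hypothetical helper written for this illustration, not part of any library.

```python
import struct

def utf16_code_units(text):
    """Return the UTF-16 code units of text, each as a one-character string."""
    data = text.encode("utf-16-le")
    units = struct.unpack("<%dH" % (len(data) // 2), data)
    return [chr(u) for u in units]

ch = "\U00010300"  # OLD ITALIC LETTER A, a supplementary (non-BMP) character

# Emulated narrow-build view: two surrogate code units instead of one character.
units = utf16_code_units(ch)

# A character type method sees a letter when given the whole code point...
whole_isalpha = ch.isalpha()

# ...but sees two non-letters when applied unit by unit.
per_unit_isalpha = [u.isalpha() for u in units]
```

Here the whole character is alphabetic while neither surrogate is, which is one instance of the divergence described above.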

When I suggested on python-dev that supplementary character support on
narrow builds is not worth violating fundamental invariants such as
len(chr(i)) == 1, pretty much everyone said that Python should support
full Unicode regardless of build.  When it comes to fixing specific
differences between builds, I hear that these differences are not
important because no one is using supplementary characters.
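
The invariant can be checked directly. This is only a sketch: it assumes a current CPython (3.3 or later, after PEP 393), and narrow_len is a hypothetical helper that models a narrow build by counting UTF-16 code units.

```python
def narrow_len(i):
    """Model of len(chr(i)) on a narrow build: UTF-16 code units for code point i."""
    return 1 if i <= 0xFFFF else 2

samples = [0x41, 0xFFFF, 0x10000, 0x10300, 0x10FFFF]

# On a current interpreter every code point is a single character.
modern = [len(chr(i)) for i in samples]

# Under the narrow-build model, everything above U+FFFF takes two units.
narrow = [narrow_len(i) for i in samples]
```

The two lists agree for BMP code points and differ for every supplementary one, which is exactly where len(chr(i)) == 1 breaks down.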

This example is less exotic than, say, str.center() or str.swapcase(),
not because it involves less exotic characters - all non-BMP
characters are exotic by definition - but because it involves the core
definition of the Python language.
History
Date                 User        Action  Args
2010-12-29 19:26:19  belopolsky  set     recipients: + belopolsky, lemburg, loewis, doerwalter, georg.brandl, rhettinger, amaury.forgeotdarc, Rhamphoryncus, pitrou, vstinner, eric.smith, stutzbach, ezio.melotti
2010-12-29 19:26:15  belopolsky  link    issue10542 messages
2010-12-29 19:26:15  belopolsky  create