Author belopolsky
Recipients belopolsky, eric.smith, pitrou
Date 2010-11-24.19:06:15
Message-id <AANLkTi=_vgBhVBnt3+4DVt7xOPEruyA=GssoZcyL_Cjx@mail.gmail.com>
In-reply-to <1290612823.89.0.808626412832.issue10521@psf.upfronthosting.co.za>
Content
On Wed, Nov 24, 2010 at 10:33 AM, Antoine Pitrou <report@bugs.python.org> wrote:
..
> The question is, what should it do with such an input?

I think the rule for such functions should be that if
input.encode('utf-8') is the same on wide and narrow builds, then the
output.encode('utf-8') should be the same.

> Pretend it's a single char (but other chars in the source string won't get the same treatment)?

Yes, *and* surrogate pairs in the source string should count for one
char as well.
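
As a rough sketch of what "surrogate pairs count for one char" could mean, the snippet below views a string as UTF-16 code units (roughly what a narrow build stores) and counts a high+low surrogate pair as a single code point. It assumes Python 3.3 or later, where the narrow/wide distinction no longer exists, so the narrow view is simulated through the `utf-16-le` codec. The helper names `utf16_units` and `count_code_points` are hypothetical and are not part of the str API.

```python
def utf16_units(s):
    """Return the UTF-16 code units of s (a simulated narrow-build view)."""
    data = s.encode('utf-16-le')
    return [int.from_bytes(data[i:i + 2], 'little')
            for i in range(0, len(data), 2)]


def count_code_points(units):
    """Count code points, treating a high+low surrogate pair as one."""
    count = 0
    i = 0
    while i < len(units):
        is_pair = (0xD800 <= units[i] <= 0xDBFF
                   and i + 1 < len(units)
                   and 0xDC00 <= units[i + 1] <= 0xDFFF)
        i += 2 if is_pair else 1
        count += 1
    return count


s = 'a\U00010000b'                    # one non-BMP character between two ASCII ones
units = utf16_units(s)
narrow_len = len(units)               # code units: the pair counts twice
wide_len = count_code_points(units)   # code points: the pair counts once
```

Under the rule proposed above, a padding method on a narrow build would base its width computation on `wide_len` rather than `narrow_len`, so the padded result encodes to the same UTF-8 bytes on both builds.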

> Treat it as a two-char string (but then center() and friends should logically be
> extended to accept strings of arbitrary lengths)?

No.  For better or worse, on wide builds these methods effectively
operate on code points.  They don't interpret multi-code-point
graphemes or take grapheme width into account:

--------------------
​​​​​​​​​​​​​​​​​123
--------------------

Application code has to ascertain that it is dealing with fixed-width
characters in the target font before using these methods for
text alignment.
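
A minimal sketch of that behaviour, assuming Python 3.3 or later: the padding methods count code points, so zero-width and combining characters use up the requested width without occupying display columns. The sample strings are illustrative assumptions, not taken from the tracker.

```python
# 17 ZERO WIDTH SPACE characters followed by '123':
# 20 code points, but only 3 visible columns in most fonts.
zero_width = '\u200b' * 17 + '123'
padded = zero_width.rjust(20, '-')    # already 20 code points, so nothing is added

# 'e' followed by COMBINING ACUTE ACCENT: 2 code points, 1 grapheme.
combining = 'e\u0301'
centered = combining.center(5, '-')   # adds 3 fill chars, not the 4 a reader might expect

print(repr(padded))
print(repr(centered), len(centered))
```

In both cases the result has the requested length in code points, yet it would not line up visually with a 20-column or 5-column field.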
History
Date User Action Args
2010-11-24 19:06:17  belopolsky  set  recipients: + belopolsky, pitrou, eric.smith
2010-11-24 19:06:15  belopolsky  link  issue10521 messages
2010-11-24 19:06:15  belopolsky  create