Author loewis
Recipients ezio.melotti, loewis, mankyd
Date 2011-11-14.21:33:05
Message-id <1321306387.0.0.638362418668.issue13391@psf.upfronthosting.co.za>
In-reply-to
Content
> But why are they not a space?

Because the Unicode standard says they are not. We have a good tradition in Python of following standards where they apply, and the Unicode standard is crystal clear that the characters in question are *not* white space. Why should we second-guess the Unicode consortium when discussing Unicode questions? See also

http://en.wikipedia.org/wiki/Whitespace_character

IOW: get the Unicode consortium to declare them as whitespace, and we happily follow.
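
To make the point concrete, here is a small sketch that asks Python what it knows about a few characters. The message does not name the characters in question, so U+200B ZERO WIDTH SPACE is used purely as an assumed example of something that looks like a space but is not classified as one; the helper name describe is also just for illustration.

```python
import unicodedata

# U+200B is an assumed example; the message above does not name the characters.
candidates = ["\u0020", "\u00a0", "\u2009", "\u200b"]

def describe(ch):
    """Return (code point, name, general category, str.isspace() result)."""
    return (
        f"U+{ord(ch):04X}",
        unicodedata.name(ch, "<unnamed>"),
        unicodedata.category(ch),
        ch.isspace(),
    )

for ch in candidates:
    print(describe(ch))

# str.strip() with no arguments removes only what str.isspace() accepts,
# so a ZERO WIDTH SPACE (category Cf, a format character) is left in place.
stripped = "\u200bword\u200b".strip()
```

A caller who really wants such characters gone can still pass them explicitly, e.g. s.strip("\u200b"), without Python having to redefine what whitespace means.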

Ezio: I do think that _PyUnicode_IsWhitespace should use the White_Space property (from PropList.txt). I'm not quite sure how they computed that property (or whether it's manually curated). Since that's a behavioral change, it can only go into 3.3.
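
As a rough sketch of what "use the White_Space property" could look like, the following parses lines in the PropList.txt format into a set of code points. This is not how CPython generates its tables; the SAMPLE text is an abbreviated excerpt typed in by hand (the real file lists more entries), and parse_property and is_white_space are hypothetical names.

```python
# Abbreviated, hand-copied excerpt in the PropList.txt line format.
# The real file contains more White_Space entries and many other properties.
SAMPLE = """\
0009..000D    ; White_Space # Cc   [5] <control-0009>..<control-000D>
0020          ; White_Space # Zs       SPACE
0085          ; White_Space # Cc       <control-0085>
00A0          ; White_Space # Zs       NO-BREAK SPACE
2000..200A    ; White_Space # Zs  [11] EN QUAD..HAIR SPACE
2028          ; White_Space # Zl       LINE SEPARATOR
2029          ; White_Space # Zp       PARAGRAPH SEPARATOR
3000          ; White_Space # Zs       IDEOGRAPHIC SPACE
"""

def parse_property(text, prop):
    """Collect the code points listed for one property name."""
    code_points = set()
    for line in text.splitlines():
        data = line.split("#", 1)[0].strip()  # drop trailing comment
        if ";" not in data:
            continue  # blank or comment-only line
        rng, name = (part.strip() for part in data.split(";", 1))
        if name != prop:
            continue
        first, _, last = rng.partition("..")  # "XXXX" or "XXXX..YYYY"
        start = int(first, 16)
        end = int(last, 16) if last else start
        code_points.update(range(start, end + 1))
    return code_points

white_space = parse_property(SAMPLE, "White_Space")

def is_white_space(ch):
    """Hypothetical stand-in for a property-based whitespace test."""
    return ord(ch) in white_space
```

With this excerpt, is_white_space("\u2005") is true while is_white_space("\u200b") is false, which matches the point that the standard, not Python, decides the answer.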
History
Date                 User    Action  Args
2011-11-14 21:33:07  loewis  set     recipients: + loewis, ezio.melotti, mankyd
2011-11-14 21:33:06  loewis  set     messageid: <1321306387.0.0.638362418668.issue13391@psf.upfronthosting.co.za>
2011-11-14 21:33:05  loewis  link    issue13391 messages
2011-11-14 21:33:05  loewis  create