
Author ezio.melotti
Recipients PeterL, ezio.melotti
Date 2010-05-30.19:12:33
Content
Both on Linux and Windows I get:
>>> '\xa0'.isspace()
False
>>> u'\xa0'.isspace()
True

The Unicode character u'\xa0' is U+00A0 NO-BREAK SPACE, so unicode.split correctly treats it as whitespace.
However, the byte string '\xa0' is not whitespace, so str.split does not split on it.
The correct solution is to decode your byte string to Unicode and then split.
I'd close this as invalid, but before doing so I'd like you to confirm that both the example I posted and 'split' return the same result on Linux and Windows. (The fact that it works on Linux is probably caused by something else, e.g. the label is already Unicode.)
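
To illustrate the decode-then-split suggestion, here is a minimal sketch. It is written in Python 3 syntax, where bytes plays the role of Python 2's str and str plays the role of Python 2's unicode. The label value and the Latin-1 encoding are assumptions made for the example; the real data may use a different encoding.

```python
# Python 3 sketch: bytes ~ Python 2 str, str ~ Python 2 unicode.
# Hypothetical label; byte 0xA0 is NO-BREAK SPACE in Latin-1.
raw = b"first\xa0second"

# Splitting the undecoded bytes: 0xA0 is not ASCII whitespace,
# so the value comes back as a single item.
bytes_parts = raw.split()

# Decode first (the encoding is an assumption), then split.
# U+00A0 counts as whitespace for text strings.
text = raw.decode("latin-1")
text_parts = text.split()

print(bytes_parts)
print(text_parts)
```

In Python 2 the equivalent would be to call .decode() on the str object with its actual encoding and then call split() on the resulting unicode object.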