
Author loewis
Recipients Arfrever, ezio.melotti, gvanrossum, loewis, tchrist, terry.reedy, vstinner
Date 2011-09-18.08:45:52
Message-id <1316335553.34.0.251282744288.issue12737@psf.upfronthosting.co.za>
In-reply-to
Content
Tom: it's intentional that .title() doesn't use traditional word-break algorithms. In 2.x, "foo3bar".title() is "Foo3Bar", i.e. the 3 counts as a word end. So neither UTS#18 \w nor UAX#29 applies. In UTS#18 terminology, .title() matches \alpha+ more closely, despite UTS#18 saying that this shouldn't be used for word-breaking.
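
To illustrate the distinction, here is a minimal sketch in Python 3, where the same "foo3bar" result holds. The pattern `[^\W\d_]+` is only an assumed, rough stand-in for \alpha+ (runs of word characters that are neither digits nor underscore); it is not how .title() is implemented.

```python
import re

text = "foo3bar"

# .title() capitalizes the letter that follows the digit,
# so the digit behaves like a word end.
titled = text.title()

# Under \w+ the whole string is a single word.
word_tokens = re.findall(r"\w+", text)

# Approximating \alpha+ with runs of non-digit, non-underscore
# word characters gives two runs, matching what .title() does.
alpha_runs = re.findall(r"[^\W\d_]+", text)

print(titled)       # Foo3Bar
print(word_tokens)  # ['foo3bar']
print(alpha_runs)   # ['foo', 'bar']
```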

It's not clear to me how UTS#18 defines \alpha. On the one hand, they say that marks should be included; on the other hand, they refer to the Alphabetic derived category, which doesn't include marks except for the few that have been included in Other_Alphabetic.
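
As a small illustration of the mark question, the sketch below inspects two combining marks with the standard `unicodedata` module. The helper name `describe` is hypothetical. Note that `str.isalpha()` is based on the general category (letters only), not on the Unicode Alphabetic derived property, so it reports False even for U+0345, which is listed under Other_Alphabetic.

```python
import unicodedata

def describe(ch):
    """Return (general category, str.isalpha() result) for one character."""
    return unicodedata.category(ch), ch.isalpha()

# U+0301 COMBINING ACUTE ACCENT: a mark that is not Alphabetic.
# U+0345 COMBINING GREEK YPOGEGRAMMENI: a mark listed in Other_Alphabetic.
for ch in ("a", "\u0301", "\u0345"):
    category, is_alpha = describe(ch)
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: {category}, isalpha={is_alpha}")
```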
History
Date User Action Args
2011-09-18 08:45:53  loewis  set  recipients: + loewis, gvanrossum, terry.reedy, vstinner, ezio.melotti, Arfrever, tchrist
2011-09-18 08:45:53  loewis  set  messageid: <1316335553.34.0.251282744288.issue12737@psf.upfronthosting.co.za>
2011-09-18 08:45:52  loewis  link  issue12737 messages
2011-09-18 08:45:52  loewis  create