Author lemburg
Recipients christoph, ezio.melotti, gvanrossum, lemburg, markon, nickd, nnorwitz, pitrou, r.david.murray, rhettinger, twb
Date 2009-09-29.10:40:54
Message-id <4AC1E435.9030908@egenix.com>
In-reply-to <1254219647.05.0.244296326279.issue7008@psf.upfronthosting.co.za>
Content
Christoph Burgmer wrote:
> 
> Christoph Burgmer <cburgmer@ira.uka.de> added the comment:
> 
> I admit I don't fully understand the semantics of capwords().

string.capwords() is an old function from the days before Unicode.
The function is basically defined by its implementation.
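
As a rough illustration of what "defined by its implementation" means here: the behaviour amounts to splitting, capitalizing each piece, and rejoining. The sketch below assumes current Python 3; the name capwords_sketch is made up for this example, and the real code in the string module may differ in detail.

```python
import string


def capwords_sketch(s, sep=None):
    """Approximate string.capwords(): split, capitalize each piece, rejoin."""
    # With sep=None, split() strips leading/trailing whitespace and collapses
    # runs of whitespace, so the pieces are rejoined with a single space.
    return (sep or " ").join(word.capitalize() for word in s.split(sep))


sample = "  hello   wORLD  foo-bar "
print(capwords_sketch(sample))   # Hello World Foo-bar
print(string.capwords(sample))   # Hello World Foo-bar
```

Note that "foo-bar" becomes "Foo-bar": only whitespace (or the given separator) counts as a word boundary, which is exactly the limitation under discussion.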

> But from
> what I believe what it should do, this function could be happily
> replaced by the word-breaking algorithm as defined in
> http://www.unicode.org/reports/tr29/.
> 
> This algorithm should be implemented anyway, to properly solve
> issue6412.

Simple word breaking would be nice to have in Python as a new
Unicode method, e.g. .splitwords().
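
To make the idea concrete, here is a minimal sketch of what such a helper might look like as a plain function. Both the name splitwords and its behaviour are hypothetical; no such str method exists. It uses a simple regular expression and is only a crude approximation, not an implementation of the TR29 word boundary rules.

```python
import re

# Hypothetical helper: Python has no str.splitwords() method.
# A "word" here is a run of \w characters, optionally continued across
# an ASCII apostrophe (so "Don't" stays in one piece). This is NOT TR29.
_WORD = re.compile(r"\w+(?:'\w+)*")


def splitwords(text):
    """Return the word-like runs found in text, dropping punctuation."""
    return _WORD.findall(text)


print(splitwords("Don't panic: it's only a test."))
# ["Don't", 'panic', "it's", 'only', 'a', 'test']
```

A real implementation would need the Unicode word-break property data, which this sketch does not consult.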

Note, however, that word boundaries are just as complicated as casing:
there are lots of special cases in different languages or locales
(see the notes after the word boundary rules in TR29).
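
A small illustration of both kinds of special case, assuming current Python 3 string semantics (the sample strings are chosen only for this example): an apostrophe is treated as a word boundary by str.title() but not by string.capwords(), and uppercasing a single character can change the length of a string.

```python
import string

text = "they're bill's friends"

# str.title() starts a new word after every non-letter, apostrophes included.
titled = text.title()
# string.capwords() only breaks on whitespace.
capped = string.capwords(text)
# Casing is not always one-to-one: German sharp s uppercases to two letters.
upper = "straße".upper()

print(titled)  # They'Re Bill'S Friends
print(capped)  # They're Bill's Friends
print(upper)   # STRASSE
```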
History
Date                 User     Action  Args
2009-09-29 10:40:56  lemburg  set     recipients: + lemburg, gvanrossum, nnorwitz, rhettinger, pitrou, christoph, ezio.melotti, r.david.murray, markon, twb, nickd
2009-09-29 10:40:54  lemburg  link    issue7008 messages
2009-09-29 10:40:54  lemburg  create