Message93240
> While the change would seem to always be helpful in an English context,
> in French the proper title casing of "l'argent" is "L'Argent".
Well, I think it doesn't work right even in English.
For example, someone named O'Brien would end up as "O'brien".
My point is that capitalization is both language-sensitive and
context-sensitive, and it's a hard problem for a computer to solve.
Since str.title() can only be a very crude approximation of the right
thing, there's no good reason to break backwards compatibility, IMO.
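To make the trade-off concrete, here is a small sketch comparing the current behaviour with an apostrophe-aware variant. The helper name `apostrophe_title` is hypothetical (it is not a proposed API), and the regex only covers ASCII letters and the plain ASCII apostrophe:

```python
import re

def apostrophe_title(s):
    # Hypothetical helper: treat a single embedded ASCII apostrophe as
    # part of the word, so only the first letter of the word is uppercased.
    return re.sub(r"[A-Za-z]+('[A-Za-z]+)?",
                  lambda mo: mo.group(0).capitalize(),
                  s)

for word in ("you're", "o'brien", "l'argent"):
    # Current str.title() restarts capitalization after the apostrophe.
    print(word, "->", word.title(), "vs", apostrophe_title(word))
```

The current method gets "O'Brien" and "L'Argent" right and "You'Re" wrong; the apostrophe-aware variant fixes "You're" but breaks the other two. Neither rule is correct for all three inputs.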
> 1. Leave everything the same (rejecting requests for apostrophe handling
> and forever live with the likes of You'Re).
>
> 2. Handle embedded single apostrophes, fixing most cases in English, and
> wreaking havoc on the French (who are going to be ill-served under any
> scenario).
>
> 3. Add an optional argument to str.title() with a list of characters
> that will not trigger a transition. This lets people add apostrophes
> and hyphens and other characters of interest. Hyphens are hard because
> cases like mother-in-law should properly be converted to Mother-in-Law
> and hyphens get used in many odd ways.
>
> 4. Add a new string method for handling title case with embedded
> apostrophes but leaving the old version unchanged.
>
> My order of preferences is 2,4,3,1.
I really think the only reasonable options are 3 and 1.
2 breaks compatibility with no real benefit.
4 is too specific a variation (especially in the Unicode case, where you
might want to take into account the different variants of apostrophes
and other characters), and adding a new method for such a subtle
difference is not warranted.
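For illustration, option 3 could behave roughly like the pure-Python sketch below. The function name `title_keeping` and its `keep` parameter are assumptions made up for this example, not an existing or agreed-upon signature, and the loop only approximates str.title() (it uses isalpha() and upper() rather than the real cased/titlecase rules):

```python
def title_keeping(text, keep=""):
    """Title-case text; characters in `keep` do not start a new word.

    Hypothetical sketch of option 3. With keep="" it mimics the
    current str.title() behaviour for simple inputs.
    """
    result = []
    in_word = False
    for ch in text:
        if ch.isalpha():
            # First letter of a word is uppercased, the rest lowercased.
            result.append(ch.lower() if in_word else ch.upper())
            in_word = True
        elif in_word and ch in keep:
            # A "kept" character inside a word does not trigger a transition.
            result.append(ch)
        else:
            result.append(ch)
            in_word = False
    return "".join(result)

print(title_keeping("you're"))                        # default: same as today
print(title_keeping("you're", keep="'"))              # caller opts in
print(title_keeping("you\u2019re", keep="'\u2019"))   # Unicode apostrophe too
```

The default stays backwards compatible, and the caller decides which apostrophe variants (or hyphens) count as part of a word. It still cannot produce "Mother-in-Law", which needs knowledge of the language rather than a character list.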
Date | User | Action | Args
2009-09-28 22:33:13 | pitrou | set | recipients: + pitrou, gvanrossum, nnorwitz, rhettinger, ezio.melotti, r.david.murray, markon, twb, nickd
2009-09-28 22:33:12 | pitrou | link | issue7008 messages
2009-09-28 22:33:12 | pitrou | create |