Author Marc Richter
Recipients Marc Richter
Date 2018-10-08.09:49:32
Message-id <1538992172.47.0.545547206417.issue34928@psf.upfronthosting.co.za>
In-reply-to
Content
There's a special letter in German orthography called "eszett" (ß). For hundreds of years this letter had no uppercase variant; in 2017, an uppercase variant called "capital eszett" (ẞ) was added to the official German orthography [1].

Python's .upper() string method still translates this to "SS" (which was correct before 2017):

~ $ python3.7.0
Python 3.7.0 (default, Aug 29 2018, 17:15:17) 
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> 'gruß'.upper()
'GRUSS'
>>>

The result of this example should have been 'GRUẞ' instead.
That said, it is fair to point out that this letter is still quite unpopular in Germany; it cannot even be typed on German keyboards yet. Still, since it has become part of the official orthography, I think Python's job is to follow the clear rules rather than common habits.
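
Until the built-in behavior changes (if it ever does), the result requested above can be approximated in user code. The following is only a minimal sketch: the helper name upper_with_capital_eszett is hypothetical and not part of Python, and it assumes that every "ß" in the input should become "ẞ".

```python
def upper_with_capital_eszett(text: str) -> str:
    """Uppercase text, mapping 'ß' (U+00DF) to 'ẞ' (U+1E9E) instead of 'SS'.

    Hypothetical helper for illustration; not part of the standard library.
    """
    # Swap the lowercase eszett for its capital form first. str.upper()
    # leaves U+1E9E untouched because it is already an uppercase letter.
    return text.replace("\u00df", "\u1e9e").upper()


print("gruß".upper())                     # built-in behavior: GRUSS
print(upper_with_capital_eszett("gruß"))  # GRUẞ
```

The replacement happens before .upper() so that the built-in mapping to "SS" never gets a chance to apply.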

I'm not sure whether this affects .casefold() as well, since I do not fully understand that method's scope.
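
As far as I can tell from the documentation, .casefold() is meant for caseless comparison rather than for display, and it turns "ß" into "ss". A small check along those lines might look like this (the variable names are just for illustration):

```python
lower_eszett = "\u00df"    # ß
capital_eszett = "\u1e9e"  # ẞ

# casefold() maps both forms of the letter to "ss" for comparison purposes.
folded_lower = "gruß".casefold()
folded_capital = "GRUẞ".casefold()
folded_plain = "GRUSS".casefold()

print(folded_lower)                                    # gruss
print(folded_lower == folded_capital == folded_plain)  # True

# Lowercasing the capital eszett already yields the ordinary eszett.
print(capital_eszett.lower() == lower_eszett)          # True
```

So a change to .upper() would not necessarily have to touch .casefold(), since all three spellings already compare equal after folding.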

BR,
Marc Richter


[1]: https://en.wikipedia.org/wiki/Capital_%E1%BA%9E
History
Date                 User          Action  Args
2018-10-08 09:49:32  Marc Richter  set     recipients: + Marc Richter
2018-10-08 09:49:32  Marc Richter  set     messageid: <1538992172.47.0.545547206417.issue34928@psf.upfronthosting.co.za>
2018-10-08 09:49:32  Marc Richter  link    issue34928 messages
2018-10-08 09:49:32  Marc Richter  create