Author pusnow
Recipients ezio.melotti, haypo, pusnow
Date 2017-02-06.04:27:51
Message-id <1486355272.29.0.0894514518656.issue29456@psf.upfronthosting.co.za>
In-reply-to
Content
unicodedata.normalize("NFC", ...) incorrectly composes Hangul strings that contain \u1176 (HANGUL JUNGSEONG A-O).

>>> from unicodedata import normalize
>>> normalize("NFC", "\u1100\u1176\u11a8")
'깍'

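=> The result should be "\u1100\u1176\u11a8" (unchanged), not '깍' (\uae4d). \u1176 is not a modern medial vowel, so it must not take part in Hangul syllable composition.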

I attached a patch for this issue. It fixes the upper boundary of the modern medial vowel range used for Hangul composition.
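
To illustrate the boundary in question, here is a minimal pure-Python sketch of the Hangul composition step, using the constants from the Unicode Standard's Hangul syllable composition algorithm. It is not the code in unicodedata.c and not the attached patch; the function name compose_hangul is hypothetical, and the sketch only handles leading consonant + vowel (+ optional trailing consonant) sequences, not full NFC. The point is the half-open vowel range: the last modern vowel is V_BASE + V_COUNT - 1 (\u1175), so \u1176 must be left alone.

```python
# Constants from the Unicode Standard's Hangul syllable composition algorithm.
S_BASE, L_BASE, V_BASE, T_BASE = 0xAC00, 0x1100, 0x1161, 0x11A7
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28


def compose_hangul(text):
    """Compose conjoining jamo into precomposed syllables (illustrative only)."""
    out = []
    i = 0
    while i < len(text):
        code = ord(text[i])
        nxt = ord(text[i + 1]) if i + 1 < len(text) else -1
        # Half-open ranges: the vowel must satisfy V_BASE <= v < V_BASE + V_COUNT,
        # i.e. \u1161..\u1175. An inclusive upper bound would wrongly admit \u1176.
        if L_BASE <= code < L_BASE + L_COUNT and V_BASE <= nxt < V_BASE + V_COUNT:
            l_index = code - L_BASE
            v_index = nxt - V_BASE
            syllable = S_BASE + (l_index * V_COUNT + v_index) * T_COUNT
            i += 2
            # Optional trailing consonant; T_BASE itself means "no trailing consonant".
            if i < len(text):
                t = ord(text[i])
                if T_BASE < t < T_BASE + T_COUNT:
                    syllable += t - T_BASE
                    i += 1
            out.append(chr(syllable))
        else:
            # Not a composable pair: copy the character through unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)
```

With the half-open check, compose_hangul("\u1100\u1176\u11a8") returns its input unchanged, while "\u1100\u1161\u11a8" still composes to "\uac01". If the upper bound were inclusive, \u1176 would get vowel index 21, and the arithmetic above would spill into the next leading consonant's block and produce \uae4d, which matches the wrong output reported here.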
History
Date User Action Args
2017-02-06 04:27:52 pusnow set recipients: + pusnow, haypo, ezio.melotti
2017-02-06 04:27:52 pusnow set messageid: <1486355272.29.0.0894514518656.issue29456@psf.upfronthosting.co.za>
2017-02-06 04:27:52 pusnow link issue29456 messages
2017-02-06 04:27:51 pusnow create