classification
Title: unicodedata.normalize failing with NFD and NFKD for some characters in Python3
Type: behavior Stage: resolved
Components: macOS, Unicode Versions: Python 3.7
process
Status: closed Resolution: third party
Dependencies: Superseder:
Assigned To: Nosy List: Lee Collins, ezio.melotti, methane, ned.deily, ronaldoussoren, vstinner
Priority: normal Keywords:

Created on 2019-12-31 18:22 by Lee Collins, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Messages (4)
msg359120 - (view) Author: Lee Collins (Lee Collins) Date: 2019-12-31 18:22
A script that works in 2.7.17 is now failing for some Unicode characters in 3.7.5 on macOS 10.14.6. For example, unicodedata.normalize('NFD', 'à') used to return the correct decomposition u'a\u0300', but in 3.7 it returns the single composed character U+00E0. This doesn't happen for all composed forms, only some. Other examples: á, ã
msg359123 - (view) Author: Lee Collins (Lee Collins) Date: 2019-12-31 19:53
On further investigation, it appears that the problem is the interaction between Python 3 and the macOS terminal. unicodedata.normalize() produces the correct sequence u'a\u0300', but when printed it is displayed as U+00E0.
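
One way to check what unicodedata.normalize() actually returns, independent of how a terminal renders the result, is to inspect the code points instead of printing the string. This is only a minimal sketch; the helper name code_points is hypothetical and not part of the original report.

```python
import unicodedata

# U+00E0 is the precomposed form of "à" (a with grave accent).
composed = "\u00e0"
decomposed = unicodedata.normalize("NFD", composed)


def code_points(s):
    """Return the code points of s as 'U+XXXX' strings (hypothetical helper)."""
    return [f"U+{ord(ch):04X}" for ch in s]


# The length and the code points do not depend on terminal rendering.
print(len(composed), code_points(composed))      # 1 ['U+00E0']
print(len(decomposed), code_points(decomposed))  # 2 ['U+0061', 'U+0300']

# ascii() escapes every non-ASCII character, so the terminal cannot recompose it.
print(ascii(decomposed))  # 'a\u0300'
```

If the second line reports two code points, the decomposition itself worked and any composed glyph on screen comes from the display layer.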
msg359321 - (view) Author: Inada Naoki (methane) * (Python committer) Date: 2020-01-05 03:36
I don't understand what you mean. If "unicodedata.normalize() produces the correct sequence", isn't this just your terminal's behavior?

If you think it is a Python issue, could you be more specific and provide a simple code sample?
msg359360 - (view) Author: Lee Collins (Lee Collins) Date: 2020-01-05 17:20
I did some more investigation by running cat on a file containing the decomposed characters and saw that the output was displayed in composed form. So this does look like a problem with the macOS terminal. It can be resolved as third party.
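
A similar check can be made on the file itself rather than on its rendered output. The sketch below writes the decomposed string to a temporary file and dumps the raw bytes; the temporary file and the variable names are assumptions for illustration, not the file used in the report.

```python
import os
import tempfile
import unicodedata

decomposed = unicodedata.normalize("NFD", "\u00e0")

# Write the decomposed string to a throwaway UTF-8 file (illustrative only).
with tempfile.NamedTemporaryFile(
    "w", encoding="utf-8", suffix=".txt", delete=False
) as f:
    f.write(decomposed)
    path = f.name

# Read the file back as raw bytes so that no rendering is involved.
with open(path, "rb") as f:
    raw = f.read()
os.remove(path)

hex_dump = " ".join(f"{b:02x}" for b in raw)
print(hex_dump)  # 61 cc 80
```

The bytes 61 cc 80 are the UTF-8 encoding of U+0061 followed by U+0300, so the file content is decomposed; the precomposed U+00E0 would be encoded as c3 a0 instead.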
History
Date | User | Action | Args
2022-04-11 14:59:24 | admin | set | github: 83355
2020-01-05 17:20:27 | Lee Collins | set | status: open -> closed; type: behavior; messages: + msg359360; resolution: third party; stage: resolved
2020-01-05 03:36:10 | methane | set | nosy: + methane; messages: + msg359321
2020-01-04 06:34:56 | terry.reedy | set | nosy: + ned.deily, ronaldoussoren; components: + macOS
2019-12-31 19:53:15 | Lee Collins | set | messages: + msg359123
2019-12-31 18:22:31 | Lee Collins | create |