classification
Title: textwrap should treat Unicode em-dash like ASCII em-dash
Type: enhancement
Stage: patch review
Components: Library (Lib)
Versions: Python 3.7
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: jonathaneunice, r.david.murray
Priority: normal
Keywords:

Created on 2017-06-15 19:09 by jonathaneunice, last changed 2017-06-15 20:10 by jonathaneunice.

Pull Requests
URL Status Linked
PR 2224 open jonathaneunice, 2017-06-15 19:29
Messages (3)
msg296124 - Author: Jonathan Eunice (jonathaneunice) Date: 2017-06-15 19:09
The textwrap module goes to great lengths to "do the right thing" when it finds the ASCII simulation of an em-dash (two or more consecutive hyphens), but it does nothing to recognize and similarly treat true (Unicode) em-dashes (aka '\N{EM DASH}', '\u2014', or U+2014). Real em-dashes should get at least as good a treatment as simulated em-dashes.
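A minimal sketch of the difference described above, using only the public `textwrap.wrap` function. The helper name `wrap_narrow` and the sample strings are illustrative assumptions, not taken from the PR. A narrow width with `break_long_words=False` makes it visible where textwrap considers a line break acceptable.

```python
import textwrap


def wrap_narrow(text, width=7):
    """Wrap text to a narrow width without forcibly cutting long chunks.

    With break_long_words=False, a chunk that textwrap cannot split stays
    on one line, so the result shows where break points were recognized.
    (wrap_narrow is a hypothetical helper for this illustration.)
    """
    return textwrap.wrap(text, width=width, break_long_words=False)


# ASCII simulation of an em-dash: two consecutive hyphens between words.
ascii_dash = wrap_narrow("hello--world")

# True Unicode em-dash (U+2014) between the same two words.
unicode_dash = wrap_narrow("hello\N{EM DASH}world")

print(ascii_dash)    # ['hello--', 'world']
print(unicode_dash)  # stays one unbroken chunk unless the interpreter
                     # recognizes U+2014 as a break point
```

The ASCII form is split after the double hyphen, while the Unicode form is expected to stay on a single overlong line on interpreters that lack the requested support.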
msg296126 - Author: R. David Murray (r.david.murray) (Python committer) Date: 2017-06-15 19:35
This seems sensible to me (I haven't looked at the PR; I'm talking about adding the support). When textwrap was written, Python was pretty ASCII-oriented, so it is not too much of a surprise that Unicode em-dashes were not supported.
msg296127 - Author: Jonathan Eunice (jonathaneunice) Date: 2017-06-15 20:10
Agreed. It makes great sense that textwrap started as highly ASCII-centric. But in the Python 3, Unicode-friendly era, ASCII-biased isn't where we should leave things.
History
Date User Action Args
2017-06-15 20:10:35 jonathaneunice set messages: + msg296127
2017-06-15 19:35:30 r.david.murray set nosy: + r.david.murray; messages: + msg296126; stage: patch review
2017-06-15 19:29:45 jonathaneunice set pull_requests: + pull_request2269
2017-06-15 19:09:00 jonathaneunice create