Message395730
Unfortunately this is not quite finished yet.
First of all, the change is bigger than what is documented: “Changed in version 3.9: Hyphens and spaces are converted to underscore.”
In reality, the behaviour is now:
| “Normalization works as follows: all non-alphanumeric
| characters except the dot used for Python package names are
| collapsed and replaced with a single underscore, e.g. ' -;#'
| becomes '_'. Leading and trailing underscores are removed.”
Cf. [encodings/__init__.py](https://github.com/python/cpython/blob/bb3e0c240bc60fe08d332ff5955d54197f79751c/Lib/encodings/__init__.py#L47-L50)
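
To make the quoted rule concrete, here is a minimal sketch of it. `sketch_normalize` is a hypothetical helper written for illustration, not the CPython implementation. It assumes ASCII-only names, and it lowercases the result only because the `ascii_translit` example in this message suggests that lookup does so.

```python
import re

# Any run of characters that is neither ASCII alphanumeric nor a dot.
# Assumption: names are ASCII only; the real implementation may treat
# non-ASCII alphanumerics differently.
_NON_NAME_CHARS = re.compile(r"[^0-9A-Za-z.]+")


def sketch_normalize(name: str) -> str:
    """Illustrative re-statement of the quoted normalization rule."""
    # Collapse each run of "other" characters into a single underscore.
    collapsed = _NON_NAME_CHARS.sub("_", name)
    # Remove leading and trailing underscores, then lowercase
    # (lowercasing is an assumption based on the example in this message).
    return collapsed.strip("_").lower()


print(sketch_normalize("ASCII//TRANSLIT"))
print(sketch_normalize("ISO-8859-1"))
```

Under these assumptions, `//`, `-` and any other punctuation run all end up as a single underscore, which is why the original spelling of an iconv name cannot be recovered afterwards.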
Secondly, this change breaks many iconv codecs when used through the python-iconv binding. For example, `ASCII//TRANSLIT` is now normalized to `ascii_translit`, which iconv does not understand. Codec names that contain hyphens also break, and if I'm not mistaken, not all of them have hyphen-free aliases in iconv.
Cf. [python-iconv #4](https://github.com/bodograumann/python-iconv/issues/4)
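
The effect on a third-party search function can be observed with a small sketch like the one below. It only uses `codecs.register` and `codecs.lookup`. The helper names (`_recording_search`, `name_seen_by_search_functions`) are hypothetical, and the search function here just records the name it receives instead of calling iconv.

```python
import codecs

_seen = []  # names handed to our search function, in call order


def _recording_search(name):
    """Record the (already normalized) name and decline the lookup."""
    _seen.append(name)
    return None  # None tells the codec machinery to try the next function


codecs.register(_recording_search)


def name_seen_by_search_functions(requested):
    """Return the name our search function received for `requested`.

    Returns None if the function was never called, e.g. because a
    built-in codec matched first or the result was already cached.
    """
    _seen.clear()
    try:
        codecs.lookup(requested)
    except LookupError:
        pass  # expected: no registered search function knows this name
    return _seen[-1] if _seen else None


# According to this message, Python 3.9+ passes 'ascii_translit' here,
# so the '//' separator iconv needs is no longer visible to the binding.
print(name_seen_by_search_functions("ASCII//TRANSLIT"))
```

Because the search function only ever receives the normalized string, a binding in this position has no way to tell `ASCII//TRANSLIT` apart from, say, `ascii-translit`.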
The codecs API feels like an extremely good fit for integrating iconv into Python, and every alternative I can think of seems unsatisfactory.
Please advise.

| Date | User | Action | Args |
|---|---|---|---|
| 2021-06-13 06:44:16 | bodograumann | set | recipients: + bodograumann, lemburg, vstinner, mark, ezio.melotti, methane, miss-islington, shihai1991, qigangxu, akdor1154 |
| 2021-06-13 06:44:16 | bodograumann | set | messageid: <1623566656.28.0.744559115484.issue37751@roundup.psfhosted.org> |
| 2021-06-13 06:44:16 | bodograumann | link | issue37751 messages |
| 2021-06-13 06:44:15 | bodograumann | create | |