
classification
Title: "utf8" not always a synonym for "utf-8" in lib2to3
Type: behavior
Stage: resolved
Components: 2to3 (2.x to 3.x conversion tool)
Versions: Python 3.9, Python 3.8, Python 3.7

process
Status: closed
Resolution: wont fix
Dependencies:
Superseder:
Assigned To:
Nosy List: Peter Ludemann, benjamin.peterson, ezio.melotti
Priority: normal
Keywords:

Created on 2019-12-29 17:42 by Peter Ludemann, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Messages (2)
msg358995 - Author: Peter Ludemann (Peter Ludemann) Date: 2019-12-29 17:48
lib2to3.tokenize should accept 'utf8' and 'utf-8' interchangeably, to be consistent with the rest of the Python library. (I looked through the library source: there seems to be no consistent preference, and many, but not all, checks for 'utf-8' also check for 'utf8'.) In particular, tokenize.detect_encoding should handle both forms, since the encoding can be set by the user. The code should also allow 'UTF8' and 'UTF-8'.

See also https://bugs.python.org/issue39154

(This is probably a larger issue than just lib2to3, as a quick grep through /usr/lib/python3.7 showed, but I am not sure how best to address that.)
msg359024 - Author: Peter Ludemann (Peter Ludemann) Date: 2019-12-30 07:51
To clarify and fix a typo: lib2to3.pgen2.tokenize.detect_encoding checks for 'utf-8' (and 'utf_8') but not 'utf8' in various places. The same applies to 'latin-1' and 'latin1'. (The codecs documentation page allows 'utf8' and 'latin1' as codecs.)

['UTF-8' is taken care of in _get_normal_name] 
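
As an illustration only (this is not lib2to3's code), one way to treat the spellings interchangeably is to compare the canonical names reported by the codec registry instead of comparing string literals. The sketch below uses only `codecs.lookup` from the standard library; the helper names `canonical_codec_name` and `same_codec` are hypothetical.

```python
import codecs


def canonical_codec_name(name):
    """Return the codec registry's canonical name, or None if unknown.

    Hypothetical helper: codecs.lookup() resolves aliases and case,
    so 'utf8', 'UTF-8' and 'utf_8' all map to the same codec.
    """
    try:
        return codecs.lookup(name).name
    except LookupError:
        return None


def same_codec(a, b):
    """Return True if both spellings resolve to the same known codec."""
    canonical_a = canonical_codec_name(a)
    return canonical_a is not None and canonical_a == canonical_codec_name(b)


if __name__ == "__main__":
    for spelling in ("utf-8", "utf8", "UTF8", "utf_8", "latin-1", "latin1"):
        print(spelling, "->", canonical_codec_name(spelling))
```

With a helper like this, a check written as `encoding == "utf-8"` could instead be `same_codec(encoding, "utf-8")`, which would cover 'utf8' and 'UTF8' as well. Whether that fits detect_encoding, which deliberately imitates the C tokenizer, is a separate question.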

See also https://bugs.python.org/issue39155
History
Date  User  Action  Args
2022-04-11 14:59:24  admin  set  github: 83335
2021-10-20 23:06:02  iritkatriel  set  status: open -> closed; resolution: wont fix; stage: test needed -> resolved
2020-01-09 10:26:19  vstinner  set  nosy: - vstinner
2020-01-04 05:24:16  terry.reedy  set  nosy: + benjamin.peterson; stage: test needed; versions: + Python 3.9, - Python 3.6
2019-12-30 07:51:30  Peter Ludemann  set  messages: + msg359024
2019-12-30 07:38:47  ned.deily  set  messages: - msg358997
2019-12-30 07:37:27  ned.deily  set  messages: - msg358994
2019-12-29 18:15:26  Peter Ludemann  set  messages: + msg358997
2019-12-29 17:48:02  Peter Ludemann  set  messages: + msg358995; components: + 2to3 (2.x to 3.x conversion tool), - Unicode; title: "utf8-sig" missing from codecs (inconsistency) -> "utf8" not always a synonym for "utf-8" in lib2to3
2019-12-29 17:42:10  Peter Ludemann  create