classification
Title: decoding_fgets() (tokenizer.c) decodes the filename from the wrong encoding
Type: Stage:
Components: Interpreter Core, Unicode Versions: Python 3.2
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: vstinner
Priority: normal Keywords:

Created on 2010-12-27 01:57 by vstinner, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Messages (4)
msg124693 - Author: STINNER Victor (vstinner) (Python committer) Date: 2010-12-27 01:57
decoding_fgets() decodes the input filename from UTF-8, whereas the filename is actually encoded in the filesystem encoding. PyUnicode_DecodeFSDefault() should be used instead.
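
To illustrate the mismatch described above, here is a minimal Python-level sketch. It assumes a POSIX system and uses os.fsdecode() as a rough Python counterpart of the C function PyUnicode_DecodeFSDefault(); the byte string RAW_NAME and both helper functions are hypothetical names chosen for this example, not code from tokenizer.c.

```python
import os

# Hypothetical raw filename as the OS might hand it over:
# "café.py" encoded in Latin-1, which is not valid UTF-8.
RAW_NAME = b"caf\xe9.py"


def decode_as_utf8(raw):
    """Mimic the reported behaviour: assume the filename is UTF-8.

    Returns None when the bytes are not valid UTF-8.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return None


def decode_as_fs(raw):
    """Decode with the filesystem encoding and its error handler.

    On POSIX, undecodable bytes are kept via surrogateescape, so the
    original bytes can be recovered with os.fsencode().
    """
    return os.fsdecode(raw)


if __name__ == "__main__":
    print("UTF-8 decoding:", decode_as_utf8(RAW_NAME))
    name = decode_as_fs(RAW_NAME)
    print("FS decoding round-trips:", os.fsencode(name) == RAW_NAME)
```

The point of the sketch is only that a strict UTF-8 decode can fail on a filename that the filesystem decoding handles without losing information.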

decoding_fgets() raises a SyntaxError("Non-UTF-8 code starting with '\xHH' in file xxx on line xxx, but no encoding declared; ...").
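
The SyntaxError quoted above can be triggered with a small script, sketched here under the assumption that a current CPython still reports non-UTF-8 source in the same way. The file name latin1_script.py and the function run_non_utf8_script() are hypothetical; the script only shows where the filename ends up in the error message and does not reproduce the filename decoding bug itself.

```python
import os
import subprocess
import sys
import tempfile


def run_non_utf8_script():
    """Run a source file containing a Latin-1 byte and no encoding declaration.

    Returns the interpreter's exit code and its stderr output as text.
    """
    source = b"name = '\xe9'\n"  # 0xE9 followed by a quote is not valid UTF-8
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "latin1_script.py")  # hypothetical file name
        with open(path, "wb") as f:
            f.write(source)
        proc = subprocess.run([sys.executable, path], capture_output=True)
    return proc.returncode, proc.stderr.decode("utf-8", errors="replace")


if __name__ == "__main__":
    code, err = run_non_utf8_script()
    print("exit code:", code)
    print(err)
```

The path printed in the resulting message is the filename that the tokenizer had to decode.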

indenterror() (inconsistent use of tabs and spaces in indentation) and
msg124695 - Author: STINNER Victor (vstinner) (Python committer) Date: 2010-12-27 02:06
See also issue #10779 (Change filename encoding to FS encoding in PyErr_WarnExplicit()).
msg124703 - Author: STINNER Victor (vstinner) (Python committer) Date: 2010-12-27 03:02
Oh, ignore "indenterror() (inconsistent use of tabs and spaces in indentation) and", I forgot to remove it. indenterror() is correct.
msg124731 - Author: STINNER Victor (vstinner) (Python committer) Date: 2010-12-27 20:12
Fixed by r87518.
History
Date User Action Args
2022-04-11 14:57:10 admin set github: 54987
2010-12-27 20:12:31 vstinner set status: open -> closed; messages: + msg124731; resolution: fixed
2010-12-27 03:02:40 vstinner set messages: + msg124703
2010-12-27 02:06:53 vstinner set messages: + msg124695
2010-12-27 01:57:01 vstinner create