This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python Developer Guide.

classification
Title: encoding problem: coding:gbk cause syntaxError
Type: behavior Stage:
Components: Windows Versions: Python 3.8, Python 3.7, Python 3.6, Python 3.5
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Windson Yang, anmikf, eamanu, malin, paul.moore, steve.dower, tim.golden, zach.ware
Priority: normal Keywords:

Created on 2018-11-02 02:33 by anmikf, last changed 2022-04-11 14:59 by admin.

Files
File name Uploaded Description Edit
encoding_problem_gbk.py anmikf, 2018-11-02 02:33
Messages (12)
msg329098 - (view) Author: 安迷 (anmikf) Date: 2018-11-02 02:33
OS Name:                   Microsoft Windows 10 Pro
OS Version:                10.0.15063 N/A Build 15063
OS Manufacturer:           Microsoft Corporation
OS Configuration:          Standalone Workstation
OS Build Type:             Multiprocessor Free
Registered Owner:          Windows User
Registered Organization:
Product ID:                00330-80000-00000-AA183
Original Install Date:     2017/04/10, 17:24:40
System Boot Time:          2018/09/18, 09:44:52
System Manufacturer:       Dell Inc.
System Model:              OptiPlex 9010
System Type:               x64-based PC

Python 3.7.0 (v3.7.0:1bf9cc5093, Jun 27 2018, 04:59:51) [MSC v.1914 64 bit (AMD64)] on win32
msg329108 - (view) Author: Windson Yang (Windson Yang) * Date: 2018-11-02 06:34
If I understand your question correctly, you should save the file (the one containing Chinese characters) with GBK encoding in your editor. Otherwise, your editor will save it using its default encoding, and Python will not be able to decode it correctly.
msg329113 - (view) Author: Ma Lin (malin) * Date: 2018-11-02 08:07
Let me give an explanation.
Running encoding_problem_gbk.py produces an error:

D:\>encoding_problem_gbk.py
  File "D:\encoding_problem_gbk.py", line 1
SyntaxError: encoding problem: gbk

If the comment line is removed, the script runs as expected.
msg329117 - (view) Author: Windson Yang (Windson Yang) * Date: 2018-11-02 08:40
Thank you, Lin. Can you reproduce this on your machine? I guess it is related to the terminal encoding or the text file's line endings. However, I can't reproduce it on macOS.
msg329118 - (view) Author: Ma Lin (malin) * Date: 2018-11-02 09:09
Yes, I can reproduce it on my Windows 10 (Simplified Chinese).
The file is a pure ASCII file, and doesn't have a BOM prefix.
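
A minimal sketch of how such a check could be done on the raw bytes of a script. The helper name describe_source_bytes is hypothetical and is not part of the attached file; it only reports whether the bytes are pure ASCII, whether they start with a BOM, and which line endings they use.

```python
import codecs

# Hypothetical helper for illustration; not taken from the attached script.
# BOM_UTF32_LE begins with the same bytes as BOM_UTF16_LE, so it is covered too.
BOMS = (codecs.BOM_UTF8, codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)


def describe_source_bytes(data: bytes) -> dict:
    """Summarize raw source bytes: ASCII-only, BOM prefix, line-ending counts."""
    crlf = data.count(b"\r\n")
    return {
        "ascii": all(b < 0x80 for b in data),
        "bom": any(data.startswith(bom) for bom in BOMS),
        "crlf": crlf,
        # Newlines that are not part of a CRLF pair.
        "lf_only": data.count(b"\n") - crlf,
    }
```

To inspect a real file, read it in binary mode so that no newline translation happens first, for example describe_source_bytes(open("encoding_problem_gbk.py", "rb").read()).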
msg329119 - (view) Author: 安迷 (anmikf) Date: 2018-11-02 09:44
This problem does not exist on macOS.
This problem does not exist in Python 2.

Windows10x64   Python 3.7.0 (v3.7.0:1bf9cc5093

The script has no problem with                                 15 blank lines.
The script has a problem with a first line '#coding:gbk' and 14 blank lines.
msg329120 - (view) Author: 安迷 (anmikf) Date: 2018-11-02 09:48
I'm sorry for my English.
Can I use Chinese?
msg329121 - (view) Author: Tim Golden (tim.golden) * (Python committer) Date: 2018-11-02 09:53
I'm afraid you'll have to use English in this forum so that all current and future readers have the best chance of understanding the situation. Thank you very much for making the effort this far.

If anyone on this issue knows of a Chinese-language forum where this issue could be explored before coming back here, please say so. Otherwise I'll ask around on Twitter etc. to see what's available.
msg329122 - (view) Author: Windson Yang (Windson Yang) * Date: 2018-11-02 10:50
It's fine @anmikf, keep practicing :D. Let's recap what happened:

Running encoding_problem_gbk.py on Windows 10 using Python 3.7.0 causes "SyntaxError: encoding problem: gbk". But it runs as expected if any of the following is true:

1. The file has fewer than 15 lines.
2. coding:gbk is changed to another encoding (like utf-8).
3. coding:gbk is removed.
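
The conditions in this recap can be turned into a small generator for test files. This is only a sketch: build_repro_source and its parameters are hypothetical names, and the thread does not say which line endings the attached file uses, so the newline argument is an assumption left for the reader to vary.

```python
from pathlib import Path


def build_repro_source(blank_lines: int = 14,
                       newline: bytes = b"\n",
                       cookie: str = "gbk") -> bytes:
    """Build a script body: an optional '#coding:<cookie>' first line
    followed by `blank_lines` empty lines, all terminated by `newline`."""
    first = b"#coding:" + cookie.encode("ascii") if cookie else b""
    return first + newline + newline * blank_lines


if __name__ == "__main__":
    # write_bytes avoids any newline translation when creating the files.
    Path("repro_gbk.py").write_bytes(build_repro_source())            # cookie + 14 blank lines
    Path("repro_blank.py").write_bytes(build_repro_source(cookie=""))  # 15 blank lines
    Path("repro_utf8.py").write_bytes(build_repro_source(cookie="utf-8"))
```

Running each generated file with the affected interpreter on Windows, and varying blank_lines and newline, should show which combinations trigger the SyntaxError on a given machine.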
msg329658 - (view) Author: Ma Lin (malin) * Date: 2018-11-11 03:31
I debugged this; it is a duplicate of issue 20844 and issue 27797.
Eryk Sun analyzed this in detail; it is a problem in the Windows CRT.
msg329674 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2018-11-11 14:13
Yes, seems like we should be opening the file in binary mode, though I haven't tried it. The CRT's interpretation of text mode really isn't compatible with Python's own interpretation of text mode, and chaining them makes even less sense.
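
As a Python-level illustration of the binary-mode idea (not the interpreter's actual C code, and not a proposed patch), the sketch below reads source as raw bytes and lets tokenize.detect_encoding find the coding cookie, so no CRT text-mode layer is involved. The function name decode_source is hypothetical.

```python
import io
import tokenize


def decode_source(data: bytes):
    """Detect the declared source encoding from raw bytes and decode them.

    tokenize.detect_encoding reads at most two lines through the given
    readline callable and returns (encoding, lines_read); 'utf-8' is the
    default when no coding cookie or BOM is present.
    """
    encoding, _ = tokenize.detect_encoding(io.BytesIO(data).readline)
    return encoding, data.decode(encoding)
```

For files on disk, the standard library's tokenize.open(path) follows the same pattern: it opens the file in binary mode, detects the encoding, and then wraps the stream for text reading.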
msg329686 - (view) Author: Emmanuel Arias (eamanu) * Date: 2018-11-11 20:48
I cannot reproduce this issue on my Debian 9 machine.
History
Date | User | Action | Args
2022-04-11 14:59:07 | admin | set | github: 79321
2018-11-11 20:48:47 | eamanu | set | nosy: + eamanu; messages: + msg329686
2018-11-11 14:13:14 | steve.dower | set | messages: + msg329674
2018-11-11 03:31:31 | malin | set | messages: + msg329658; versions: + Python 3.5, Python 3.6, Python 3.8
2018-11-02 10:52:00 | Windson Yang | set | title: encoding problem: gbk -> encoding problem: coding:gbk cause syntaxError
2018-11-02 10:50:28 | Windson Yang | set | messages: + msg329122
2018-11-02 09:53:12 | tim.golden | set | messages: + msg329121
2018-11-02 09:48:51 | anmikf | set | messages: + msg329120
2018-11-02 09:44:27 | anmikf | set | messages: + msg329119
2018-11-02 09:09:38 | malin | set | messages: + msg329118
2018-11-02 08:40:53 | Windson Yang | set | messages: + msg329117
2018-11-02 08:07:52 | malin | set | nosy: + malin; messages: + msg329113
2018-11-02 06:34:05 | Windson Yang | set | nosy: + Windson Yang; messages: + msg329108
2018-11-02 05:46:53 | xtreak | link | issue35141 superseder
2018-11-02 02:33:33 | anmikf | create |