classification
Title: Tutorial: Example of Source Code Encoding triggers error
Type: enhancement Stage: resolved
Components: Documentation Versions: Python 3.8, Python 3.7, Python 3.6, Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: eric.araujo Nosy List: docs@python, eric.araujo, ezio.melotti, miss-islington, nicolasg, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2011-12-03 13:17 by nicolasg, last changed 2018-05-09 09:36 by serhiy.storchaka. This issue is now closed.

Files
File name Uploaded Description
cp_1252broken.py nicolasg, 2011-12-03 13:17 File written following the example
windows_1252ok.py nicolasg, 2011-12-03 13:19 Using windows-1252 works
Pull Requests
URL Status Linked
PR 6738 merged serhiy.storchaka, 2018-05-09 07:14
PR 6742 merged miss-islington, 2018-05-09 08:11
PR 6743 merged miss-islington, 2018-05-09 08:12
PR 6744 merged serhiy.storchaka, 2018-05-09 08:19
Messages (7)
msg148798 - Author: Nicolas Goutte (nicolasg) Date: 2011-12-03 13:17
Current Behaviour

The tutorial of Python 3.2.x has an example to set an encoding in a source file: http://docs.python.org/py3k/tutorial/interpreter.html#source-code-encoding

It says to put the following line at the start of the source file:
# -*- coding: cp-1252 -*-

However when done exactly so, Python raises the following exception:
SyntaxError: encoding problem: with BOM

The problem seems to be that Python knows Windows codepage 1252 as windows-1252 (its IANA charset name, see http://www.iana.org/assignments/charset-reg/windows-1252 ) or alternatively as cp1252 (without dash) but not as cp-1252 (with dash).
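
The name resolution described above can be checked directly against the codec registry. This is only a minimal sketch: the helper name `canonical_codec_name` is made up for illustration, and the result for the hyphenated spelling reflects the behaviour reported in this issue.

```python
import codecs


def canonical_codec_name(name):
    """Return the codec name that `name` resolves to, or None if it is unknown.

    Hypothetical helper for illustration; it only wraps codecs.lookup().
    """
    try:
        return codecs.lookup(name).name
    except LookupError:
        return None


# Compare the three spellings discussed in this report.
for spelling in ("cp1252", "windows-1252", "cp-1252"):
    print(spelling, "->", canonical_codec_name(spelling))
```

Both accepted spellings should resolve to the same codec module, while the hyphenated `cp-1252` is not found.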

Because this is an example in the tutorial, it is particularly problematic: users might not understand how to do it correctly.

This is still the case in the tutorial of Python 3.3 alpha: http://docs.python.org/dev/tutorial/interpreter.html#source-code-encoding


Expected Behaviour

The tutorial should give a correct example, for example with:
# -*- coding: windows-1252 -*-
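
To compare declarations without creating files on disk, the source can be compiled from bytes, since `compile()` honours the encoding declaration when given a bytes object. This is a rough sketch, not the attached test files: the helper name `compiles_with_declaration` and the variable `price` are invented for illustration.

```python
def compiles_with_declaration(encoding_name):
    """Compile a tiny module that declares `encoding_name` on its first line.

    Returns the decoded value of `price`, or None if the declaration is
    rejected with a SyntaxError. Hypothetical helper for illustration.
    """
    source = "# -*- coding: {} -*-\nprice = '\u20ac'\n".format(encoding_name)
    # The bytes are always written as Windows-1252 (the euro sign is byte 0x80);
    # only the name used in the declaration varies.
    data = source.encode("cp1252")
    try:
        code = compile(data, "<example>", "exec")
    except SyntaxError:
        return None
    namespace = {}
    exec(code, namespace)
    return namespace["price"]


for name in ("windows-1252", "cp1252", "cp-1252"):
    print(name, "->", compiles_with_declaration(name))
```

With a recognised name the euro sign should round-trip intact; with `cp-1252` the compile step is expected to fail, which matches the behaviour reported above.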

Alternatively, a completely different example, like the one in the Python 2.7 tutorial, would be fine too: http://docs.python.org/tutorial/interpreter.html#source-code-encoding


Notes:
I have tested this with the following Python installations:
- Python 3.2.1 (openSUSE 12.1) on Linux
- Python 3.2.2 on Windows 7 SP1 64-bit
- Python 3.2.2 on Mac OS X 10.5.8
(Always on the command line; I have not tested in IDLE.)
msg149161 - Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011-12-10 15:52
Thanks for the detailed bug report.  I thought the normalization performed by the codec lookup system would convert 'cp-1252' to 'cp1252' (its “real” name, i.e. the name of the module implementing the codec), but it does not.  I’m +1 to removing the hyphen in the example, then.

> Python raises the following exception:
> SyntaxError: encoding problem: with BOM
I reproduced this and it’s surprising.  Maybe there is a bug with error reporting here.

> Alternatively, a completely different example, like the one in the Python 2.7 tutorial, would be fine too
This file has seen different changes in 2.7 and 3.2, given that the default encoding is different in 3.x.  I’ll check the history and upload a patch here to get your feedback.
msg316313 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-05-09 07:13
The error message has been fixed. In all supported versions it is now:

$ ./python cp_1252broken.py
  File "cp_1252broken.py", line 1
SyntaxError: encoding problem: cp-1252

But the tutorial still contains a non-working example. This is an easy issue, but it has been open for a long time.
msg316314 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-05-09 08:10
New changeset ddb6215a55b0218b621d5cb755e9dfac8dab231a by Serhiy Storchaka in branch 'master':
bpo-13525: Fix incorrect encoding name in the tutorial example. (GH-6738)
https://github.com/python/cpython/commit/ddb6215a55b0218b621d5cb755e9dfac8dab231a
msg316316 - Author: miss-islington (miss-islington) Date: 2018-05-09 08:54
New changeset 8ffff34ea12ca6478d73a337ce52f33660f6f174 by Miss Islington (bot) in branch '3.7':
bpo-13525: Fix incorrect encoding name in the tutorial example. (GH-6738)
https://github.com/python/cpython/commit/8ffff34ea12ca6478d73a337ce52f33660f6f174
msg316317 - Author: miss-islington (miss-islington) Date: 2018-05-09 09:00
New changeset fa40fc0593012893e447875632e9ed3df277561f by Miss Islington (bot) in branch '3.6':
bpo-13525: Fix incorrect encoding name in the tutorial example. (GH-6738)
https://github.com/python/cpython/commit/fa40fc0593012893e447875632e9ed3df277561f
msg316320 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-05-09 09:35
New changeset d7e783b17feaedbe0f5b30467cb7f43cefadf904 by Serhiy Storchaka in branch '2.7':
[2.7] bpo-13525: Fix incorrect encoding name in the tutorial example. (GH-6738). (GH-6744)
https://github.com/python/cpython/commit/d7e783b17feaedbe0f5b30467cb7f43cefadf904
History
Date User Action Args
2018-05-09 09:36:35 serhiy.storchaka set status: open -> closed
    stage: patch review -> resolved
    resolution: fixed
    versions: + Python 3.6, Python 3.7, Python 3.8, - Python 3.2, Python 3.3
2018-05-09 09:35:38 serhiy.storchaka set messages: + msg316320
2018-05-09 09:00:24 miss-islington set messages: + msg316317
2018-05-09 08:54:41 miss-islington set nosy: + miss-islington
    messages: + msg316316
2018-05-09 08:19:12 serhiy.storchaka set pull_requests: + pull_request6431
2018-05-09 08:12:08 miss-islington set pull_requests: + pull_request6430
2018-05-09 08:11:14 miss-islington set pull_requests: + pull_request6429
2018-05-09 08:10:58 serhiy.storchaka set messages: + msg316314
2018-05-09 07:14:08 serhiy.storchaka set keywords: + patch
    stage: needs patch -> patch review
    pull_requests: + pull_request6427
2018-05-09 07:13:07 serhiy.storchaka set nosy: + serhiy.storchaka
    messages: + msg316313
2012-07-25 22:50:45 ezio.melotti set type: enhancement
2011-12-10 15:52:16 eric.araujo set assignee: docs@python -> eric.araujo
    messages: + msg149161
    nosy: + eric.araujo
2011-12-03 13:23:54 ezio.melotti set nosy: + ezio.melotti
    stage: needs patch
    versions: + Python 2.7, Python 3.3
2011-12-03 13:19:06 nicolasg set files: + windows_1252ok.py
2011-12-03 13:17:22 nicolasg create