Issue 17777: Unrecognized string literal escape sequences give SyntaxErrors

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/61977

classification

Title:	Unrecognized string literal escape sequences give SyntaxErrors
Type:	enhancement	Stage:	resolved
Components:	Documentation, Unicode	Versions:	Python 3.3, Python 3.4, Python 2.7

process

Status:	closed	Resolution:	works for me
Dependencies:		Superseder:
Assigned To:	docs@python	Nosy List:	docs@python, ezio.melotti, markeganfuller, r.david.murray, reynir, tim.golden
Priority:	normal	Keywords:	easy

Created on 2013-04-17 15:37 by reynir, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Messages (5)
msg187173 - (view)	Author: Reynir Reynisson (reynir)	Date: 2013-04-17 15:37
Strings like "\u" trigger a SyntaxError. According to the language reference, "all unrecognized escape sequences are left in the string unchanged"[0]. The string "\u" clearly doesn't match any of the escape sequences (in particular \uxxxx). This may be intentional, but it is not clear from the language reference that this is the case. If it is intentional, it should probably be stated more explicitly in the language reference. I think this may be confusing for new users, since the syntax errors may lead them to believe the interpreter will raise a SyntaxError for all unrecognized escape sequences.
[0]: http://docs.python.org/3/reference/lexical_analysis.html#literals
msg187175 - (view)	Author: R. David Murray (r.david.murray) *	Date: 2013-04-17 15:54
It is a recognized escape sequence, but the syntax of the escape sequence is wrong, thus the syntax error. An "escape sequence" is a backslash character followed by a letter. Perhaps that is the bit that needs to be clarified in the docs?
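The distinction between an unrecognized escape and a recognized but malformed one can be checked with a small sketch. It assumes CPython 3; `try_literal` is a hypothetical helper written for illustration, not part of any library. Warnings are silenced because newer interpreters warn about unrecognized escapes (DeprecationWarning since 3.6, SyntaxWarning since 3.12) and a future release may reject them outright.

```python
import warnings


def try_literal(source):
    """Compile and evaluate source text that holds one string literal.

    Returns the resulting string, or the SyntaxError raised while compiling.
    (Hypothetical helper, for illustration only.)
    """
    try:
        with warnings.catch_warnings():
            # Unrecognized escapes only warn; ignore that here.
            warnings.simplefilter("ignore")
            return eval(compile(source, "<example>", "eval"))
    except SyntaxError as exc:
        return exc


# Raw strings are used so the backslash reaches the compiler untouched.
unrecognized = try_literal(r'"\q"')      # not an escape: backslash is kept
malformed = try_literal(r'"\u"')         # recognized escape, digits missing
well_formed = try_literal(r'"\u0041"')   # recognized escape, four hex digits

print(repr(unrecognized))   # '\\q'
print(type(malformed))      # <class 'SyntaxError'>
print(repr(well_formed))    # 'A'
```

Only the second case fails: the compiler recognizes `\u` as the start of a Unicode escape and then rejects it for lacking its hex digits.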
msg187201 - (view)	Author: Reynir Reynisson (reynir)	Date: 2013-04-17 20:00
Thank you for the quick reply. Yes, something along those lines would help. Maybe adding "The escape sequence \x expects exactly two hex digits" would make it even clearer.
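The "exactly two hex digits" rule for \x can be illustrated the same way. This is a sketch assuming CPython 3, and `compiles` is a hypothetical helper name.

```python
def compiles(source):
    """Return True if the source text compiles as an expression.

    (Hypothetical helper, for illustration only.)
    """
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return False
    return True


# Raw strings keep the backslash intact until compile() sees it.
one_digit = compiles(r'"\x4"')       # too few hex digits
non_hex = compiles(r'"\xg1"')        # "g" is not a hex digit
two_digits = compiles(r'"\x41"')     # exactly two hex digits

# A third digit is not part of the escape; it is an ordinary character.
three_chars = eval(r'"\x414"')

print(one_digit, non_hex, two_digits)   # False False True
print(repr(three_chars))                # 'A4'
```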
msg198319 - (view)	Author: Mark Egan-Fuller (markeganfuller)	Date: 2013-09-23 10:28
Python correctly throws a unicode error here, directing the user towards the fact that this is an issue specifically with the unicode escaping.

    >>> "\u"
      File "<stdin>", line 1
    SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 0-1: truncated \uXXXX escape

The documentation also states that "Any Unicode character can be encoded this way. Exactly eight hex digits are required."[0]. Propose closing this as Won't Fix.
[0]: http://docs.python.org/3/reference/lexical_analysis.html#literals
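The wording of that error comes from the `unicode_escape` codec, which can also be called directly. The sketch below assumes CPython 3; `escape_error_reason` is a hypothetical helper. It shows that the reason text names the expected digit count for both \u (four) and \U (eight).

```python
import codecs


def escape_error_reason(raw):
    """Decode bytes with the unicode_escape codec.

    Returns the codec's failure reason, or an empty string on success.
    (Hypothetical helper, for illustration only.)
    """
    try:
        codecs.decode(raw, "unicode_escape")
    except UnicodeDecodeError as exc:
        return exc.reason
    return ""


short_u = escape_error_reason(b"\\u")             # no hex digits after \u
short_big_u = escape_error_reason(b"\\U0001F60")  # seven digits, eight needed
valid = escape_error_reason(b"\\u0041")           # decodes to "A"

print(short_u)       # truncated \uXXXX escape
print(short_big_u)   # truncated \UXXXXXXXX escape
print(repr(valid))   # ''
```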
msg198353 - (view)	Author: Tim Golden (tim.golden) *	Date: 2013-09-24 10:01
Closing as "Works for me" in the absence of any clear proposal for docs improvement.

History
Date	User	Action	Args
2022-04-11 14:57:44	admin	set	github: 61977
2013-09-24 10:01:14	tim.golden	set	status: open -> closed resolution: works for me messages: + msg198353 stage: needs patch -> resolved
2013-09-23 10:28:02	markeganfuller	set	nosy: + tim.golden, markeganfuller messages: + msg198319
2013-04-19 02:40:54	ezio.melotti	set	keywords: + easy stage: needs patch type: behavior -> enhancement versions: + Python 2.7, Python 3.4
2013-04-17 20:00:30	reynir	set	messages: + msg187201
2013-04-17 15:54:45	r.david.murray	set	nosy: + r.david.murray messages: + msg187175
2013-04-17 15:37:35	reynir	create