Issue 21979: SyntaxError not raised on "0xaor 1"


This issue has been migrated to GitHub: https://github.com/python/cpython/issues/66178

classification

Title:	SyntaxError not raised on "0xaor 1"
Type:	behavior	Stage:	resolved
Components:	Interpreter Core	Versions:	Python 3.3, Python 2.7

process

Status:	closed	Resolution:	not a bug
Dependencies:		Superseder:
Assigned To:		Nosy List:	eric.smith, mark.dickinson, mel
Priority:	normal	Keywords:

Created on 2014-07-14 10:33 by mel, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Messages (7)
msg223009	Author: Mika Eloranta (mel)	Date: 2014-07-14 10:33
The following are expected to raise SyntaxError, but they don't:

>>> 0xfor python is weird in ways
15
>>> [0xaor 1, 0xbor 1, 0xcor 1, 0xdor 1, 0xeor 1, 0xfor 1]
[10, 11, 12, 13, 14, 15]

Verified on v2.7.1 and v3.3.2. Probably affects all versions.
msg223010	Author: Mark Dickinson (mark.dickinson) *	Date: 2014-07-14 11:18
Surprisingly, this is valid syntax. Your first line is parsed as:

    0xf or (python is weird in ways)

Because `or` is short-circuiting, its right-hand operand (which is also valid syntactically, being a chained comparison) is never evaluated, so we don't see the `NameErrors` that you might expect.

>>> python is weird in ways
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'python' is not defined

Your second example is similar. Closing as 'not a bug'.
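The grouping described above can be checked with the `ast` module. This is a minimal sketch and not part of the original thread: it uses the explicitly spaced form `0xf or (...)` so that it does not depend on how a given Python version treats a literal immediately followed by a keyword, and the helper name `describe` is made up for illustration.

```python
import ast

# Spaced-out equivalent of the reporter's first line
# "0xfor python is weird in ways".
SOURCE = "0xf or (python is weird in ways)"


def describe(source):
    """Return (top-level node type, operator type, operand node types)."""
    node = ast.parse(source, mode="eval").body
    return (
        type(node).__name__,
        type(node.op).__name__,
        [type(value).__name__ for value in node.values],
    )


top, op, operands = describe(SOURCE)
# The right-hand operand is a single chained comparison node.
print(top, op, operands[1])

# The left operand is truthy, so `or` returns it without evaluating the
# right-hand side; the undefined names are never looked up.
result = eval(SOURCE, {})
print(result)
```

Replacing `0xf` with `0x0` in `SOURCE` makes the left operand falsy, so the right-hand side is evaluated and the `NameError` appears.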
msg223011	Author: Mika Eloranta (mel)	Date: 2014-07-14 11:26
Mark, can you explain why the first example is valid syntax, but the second one is not:

>>> 0xaor 1
10
>>> 0xaand 1
  File "<stdin>", line 1
    0xaand 1
         ^
SyntaxError: invalid syntax

I do understand how "0xaor 1" is currently parsed, but I'm still not convinced it should be valid syntax ("0xa or 1" or "(0xa)or 1" would be ok).
msg223012	Author: Eric V. Smith (eric.smith) *	Date: 2014-07-14 11:34
0xaand 1 is parsed as 0xaa nd 1 which is not valid syntax.
msg223013	Author: Mika Eloranta (mel)	Date: 2014-07-14 11:36
OK, I see... "0xfand 1" is ambiguous "(0xfa)nd 1" vs. "(0xf)and 1". So, while a bit weird, the behavior is consistent:

>>> 123not in [], 0xfnot in [], 0xfor 1, 0xafor 1, 0xfin []
(True, True, 15, 175, False)
msg223014	Author: Eric V. Smith (eric.smith) *	Date: 2014-07-14 11:37
To be more clear: the parser takes the longest token that could be valid. Since "n" can't be part of a hex number, parsing stops there, returning "0xaa" as the first token. So:

>>> 0xaaif 1 else 0
170
>>> hex(0xaaif 1 else 0)
'0xaa'
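The longest-match behavior described above can be illustrated with a toy tokenizer. This is a simplified sketch written for this thread, not CPython's tokenizer: the pattern table, the name `toy_tokenize`, and the decision to lump keywords in with `NAME` are all assumptions made for brevity.

```python
import re

# Toy token patterns, tried in order at each position. This is an
# illustrative subset, NOT CPython's actual tokenizer.
TOKEN_PATTERNS = [
    ("NUMBER", re.compile(r"0[xX][0-9a-fA-F]+|\d+")),
    ("NAME", re.compile(r"[A-Za-z_]\w*")),  # keywords are lumped in here
    ("OP", re.compile(r"[\[\](),]")),
]
WHITESPACE = re.compile(r"\s+")


def toy_tokenize(source):
    """Split source into (kind, text) pairs, matching each token greedily."""
    tokens = []
    pos = 0
    while pos < len(source):
        space = WHITESPACE.match(source, pos)
        if space:
            pos = space.end()
            continue
        for kind, pattern in TOKEN_PATTERNS:
            match = pattern.match(source, pos)
            if match:
                # The hex pattern consumes every hex digit it can, so the
                # number token ends at the first non-hex character.
                tokens.append((kind, match.group()))
                pos = match.end()
                break
        else:
            raise ValueError("unexpected character at position %d" % pos)
    return tokens


print(toy_tokenize("0xaor 1"))   # [('NUMBER', '0xa'), ('NAME', 'or'), ('NUMBER', '1')]
print(toy_tokenize("0xaand 1"))  # [('NUMBER', '0xaa'), ('NAME', 'nd'), ('NUMBER', '1')]
```

In the second call the "a" of "and" is a valid hex digit, so it is absorbed into the number and the leftover "nd" becomes a name, which is why the real parser rejects that line.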
msg223017	Author: Mark Dickinson (mark.dickinson) *	Date: 2014-07-14 12:02
> Mark, can you explain why the first example is valid syntax, but the second one is not:

Looks like Eric beat me to it! As he explained, it's the "maximal munch" rule at work: the tokenizer matches as much as it can for each token. You see similar effects with integer or float literals followed by a keyword starting with 'e' (or 'j', but I don't think we have any of those).

>>> 3if 1else 2
  File "<stdin>", line 1
    3if 1else 2
         ^
SyntaxError: invalid token
>>> 3if 1 else 2
3
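The 'e' case above can be sketched with a hand-written number scanner. This is a toy model of the behavior reported in the message (the thread lists Python 2.7 and 3.3), not CPython source, and other versions may treat the same input differently. The function name `scan_number` is hypothetical, and fractional parts are left out to keep the sketch short.

```python
def scan_number(source, pos=0):
    """Scan a decimal literal at pos; return (text, new_pos).

    Assumed behavior, modeled on the message above: once an 'e' follows
    the digits, the scanner commits to an exponent and does not back up.
    """
    end = pos
    while end < len(source) and source[end].isdigit():
        end += 1
    if end == pos:
        raise ValueError("no digits at position %d" % pos)
    if end < len(source) and source[end] in "eE":
        end += 1  # committed: what follows must be an exponent
        if end < len(source) and source[end] in "+-":
            end += 1
        digits_start = end
        while end < len(source) and source[end].isdigit():
            end += 1
        if end == digits_start:
            raise ValueError("invalid token: exponent has no digits")
    return source[pos:end], end


print(scan_number("3if 1else 2"))  # ('3', 1): 'i' cannot continue a number
print(scan_number("1 else 2"))     # ('1', 1): the space ends the literal
try:
    scan_number("1else 2")
except ValueError as exc:
    print(exc)                     # the 'e' of "else" was taken as an exponent
```

The first call succeeds because "i" ends the literal cleanly, while "1else" fails because the scanner has already consumed the "e" by the time it sees that no exponent digits follow.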

History
Date	User	Action	Args
2022-04-11 14:58:05	admin	set	github: 66178
2014-07-14 12:02:44	mark.dickinson	set	messages: + msg223017
2014-07-14 11:37:51	eric.smith	set	messages: + msg223014
2014-07-14 11:36:53	mel	set	messages: + msg223013
2014-07-14 11:35:34	eric.smith	set	stage: resolved
2014-07-14 11:34:22	eric.smith	set	nosy: + eric.smith messages: + msg223012
2014-07-14 11:26:20	mel	set	messages: + msg223011
2014-07-14 11:18:15	mark.dickinson	set	status: open -> closed nosy: + mark.dickinson messages: + msg223010 resolution: not a bug
2014-07-14 10:33:45	mel	create