
classification
Title: SyntaxError not raised on "0xaor 1"
Type: behavior
Stage: resolved
Components: Interpreter Core
Versions: Python 3.3, Python 2.7

process
Status: closed
Resolution: not a bug
Dependencies:
Superseder:
Assigned To:
Nosy List: eric.smith, mark.dickinson, mel
Priority: normal
Keywords:

Created on 2014-07-14 10:33 by mel, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Messages (7)
msg223009 - Author: Mika Eloranta (mel) Date: 2014-07-14 10:33
The following are expected to raise SyntaxError, but they don't:

>>> 0xfor python is weird in ways
15
>>> [0xaor 1, 0xbor 1, 0xcor 1, 0xdor 1, 0xeor 1, 0xfor 1]
[10, 11, 12, 13, 14, 15]

Verified on v2.7.1 and v3.3.2. Probably affects all versions.
msg223010 - Author: Mark Dickinson (mark.dickinson) (Python committer) Date: 2014-07-14 11:18
Surprisingly, this is valid syntax.  Your first line is parsed as:

    0xf or (python is weird in ways)

Because `or` is short-circuiting, its right-hand operand (which is also valid syntactically, being a chained comparison) is never evaluated, so we don't see the `NameErrors` that you might expect.

    >>> python is weird in ways
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    NameError: name 'python' is not defined

Your second example is similar.  Closing as 'not a bug'.
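
As a rough illustration of the short-circuiting described above, the sketch below evaluates the right-hand operand on its own and then as part of the `or` expression. The `evaluate` helper is a hypothetical name introduced only for this example, and the expression is written with a space after `0xf` so that it does not depend on how a given Python version tokenizes `0xfor`.

```python
def evaluate(source):
    """Evaluate source in an empty namespace (hypothetical helper for this sketch)."""
    return eval(source, {})


# Evaluated alone, the chained comparison looks up `python` and fails.
try:
    evaluate("python is weird in ways")
except NameError as exc:
    print("right operand alone:", exc)

# The left operand 0xf is truthy, so `or` returns it and the
# right-hand operand is never evaluated.
print(evaluate("0xf or (python is weird in ways)"))

# With a falsy left operand, the right-hand operand is evaluated and returned.
print(evaluate("0 or 1"))
```

The NameError only appears when the left operand is falsy or absent, which is why the original one-liner ran without complaint.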
msg223011 - Author: Mika Eloranta (mel) Date: 2014-07-14 11:26
Mark, can you explain why the first example is valid syntax, but the second one is not:

>>> 0xaor 1
10

>>> 0xaand 1
  File "<stdin>", line 1
    0xaand 1
         ^
SyntaxError: invalid syntax


I do understand how "0xaor 1" is currently parsed, but I'm still not convinced it should be valid syntax ("0xa or 1" or "(0xa)or 1" would be ok).
msg223012 - Author: Eric V. Smith (eric.smith) (Python committer) Date: 2014-07-14 11:34
0xaand 1
is parsed as
0xaa nd 1

which is not valid syntax.
msg223013 - Author: Mika Eloranta (mel) Date: 2014-07-14 11:36
OK, I see... "0xfand 1" is ambiguous: "(0xfa)nd 1" vs. "(0xf)and 1".

So, while a bit weird, the behavior is consistent:

>>> 123not in [], 0xfnot in [], 0xfor 1, 0xafor 1, 0xfin []
(True, True, 15, 175, False)
msg223014 - Author: Eric V. Smith (eric.smith) (Python committer) Date: 2014-07-14 11:37
To be clearer: the tokenizer takes the longest token that could be valid. Since "n" can't be part of a hex number, tokenizing stops there, returning "0xaa" as the first token.

So:

>>> 0xaaif 1 else 0
170
>>> hex(0xaaif 1 else 0)
'0xaa'
>>>
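
The longest-match behavior can be sketched with a greedy regular expression. This is only an approximation for illustration, not CPython's tokenizer; `HEX_LITERAL` and `split_hex_prefix` are hypothetical names made up for this example.

```python
import re

# Greedy pattern for a hex literal: "0x" followed by as many hex digits as possible.
# (Assumption: underscores and other literal forms are ignored in this sketch.)
HEX_LITERAL = re.compile(r"0[xX][0-9a-fA-F]+")


def split_hex_prefix(source):
    """Return (hex_literal, remainder), taking the longest possible hex literal."""
    match = HEX_LITERAL.match(source)
    if match is None:
        raise ValueError("source does not start with a hex literal")
    return match.group(), source[match.end():]


print(split_hex_prefix("0xaor 1"))   # ('0xa', 'or 1')
print(split_hex_prefix("0xaand 1"))  # ('0xaa', 'nd 1')
print(split_hex_prefix("0xafor 1"))  # ('0xaf', 'or 1')
```

In the second case the leading "a" of "and" is swallowed by the literal, leaving "nd 1", which is why that input is rejected while "0xaor 1" is not.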
msg223017 - Author: Mark Dickinson (mark.dickinson) (Python committer) Date: 2014-07-14 12:02
> Mark, can you explain why the first example is valid syntax, but the second one is not:

Looks like Eric beat me to it!  As he explained, it's the "maximal munch" rule at work: the tokenizer matches as much as it can for each token.

You see similar effects with integer or float literals followed by a keyword starting with 'e' (or 'j', but I don't think we have any of those).

>>> 3if 1else 2
  File "<stdin>", line 1
    3if 1else 2
         ^
SyntaxError: invalid token
>>> 3if 1 else 2
3
History
Date | User | Action | Args
2022-04-11 14:58:05 | admin | set | github: 66178
2014-07-14 12:02:44 | mark.dickinson | set | messages: + msg223017
2014-07-14 11:37:51 | eric.smith | set | messages: + msg223014
2014-07-14 11:36:53 | mel | set | messages: + msg223013
2014-07-14 11:35:34 | eric.smith | set | stage: resolved
2014-07-14 11:34:22 | eric.smith | set | nosy: + eric.smith; messages: + msg223012
2014-07-14 11:26:20 | mel | set | messages: + msg223011
2014-07-14 11:18:15 | mark.dickinson | set | status: open -> closed; nosy: + mark.dickinson; messages: + msg223010; resolution: not a bug
2014-07-14 10:33:45 | mel | create |