"_ if 1else _" does not compile #65841

JoshuaLandau · 2014-06-02T18:24:01Z

BPO	21642
Nosy	@loewis, @benjaminp, @bitdancer, @zooba, @wimglenn

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

assignee = None
closed_at = <Date 2014-06-07.19:40:01.701>
created_at = <Date 2014-06-02.18:24:00.740>
labels = ['build']
title = '"_ if 1else _" does not compile'
updated_at = <Date 2015-04-15.16:02:21.581>
user = 'https://bugs.python.org/JoshuaLandau'

bugs.python.org fields:

activity = <Date 2015-04-15.16:02:21.581>
actor = 'r.david.murray'
assignee = 'none'
closed = True
closed_date = <Date 2014-06-07.19:40:01.701>
closer = 'python-dev'
components = []
creation = <Date 2014-06-02.18:24:00.740>
creator = 'Joshua.Landau'
dependencies = []
files = []
hgrepos = []
issue_num = 21642
keywords = []
message_count = 6.0
messages = ['219614', '219662', '219707', '219965', '241114', '241121']
nosy_count = 7.0
nosy_names = ['loewis', 'benjamin.peterson', 'r.david.murray', 'python-dev', 'Joshua.Landau', 'steve.dower', 'wim.glenn']
pr_nums = []
priority = 'normal'
resolution = 'fixed'
stage = 'resolved'
status = 'closed'
superseder = None
type = 'compile error'
url = 'https://bugs.python.org/issue21642'
versions = ['Python 3.4', 'Python 3.5']

JoshuaLandau · 2014-06-02T18:24:00Z

By the docs,

Except at the beginning of a logical line or in
string literals, the whitespace characters space,
tab and formfeed can be used interchangeably to
separate tokens. Whitespace is needed between two
tokens only if their concatenation could otherwise
be interpreted as a different token
(e.g., ab is one token, but a b is two tokens).

"_ if 1else _" should compile equivalently to "_ if 1 else _".

The tokenize module does this correctly:

    import io
    import tokenize

    def print_tokens(string):
        tokens = tokenize.tokenize(io.BytesIO(string.encode("utf8")).readline)

        for token in tokens:
            print("{:12}{}".format(tokenize.tok_name[token.type], token.string))

    print_tokens("_ if 1else _")
    #>>> ENCODING    utf-8
    #>>> NAME        _
    #>>> NAME        if
    #>>> NUMBER      1
    #>>> NAME        else
    #>>> NAME        _
    #>>> ENDMARKER

but it fails when compiled with, say, "compile", "eval" or "ast.parse".

    import ast

    compile("_ if 1else _", "", "eval")
    #>>> Traceback (most recent call last):
    #>>>   File "", line 32, in <module>
    #>>>   File "<string>", line 1
    #>>>     _ if 1else _
    #>>>           ^
    #>>> SyntaxError: invalid token

    eval("_ if 1else _")
    #>>> Traceback (most recent call last):
    #>>>   File "", line 40, in <module>
    #>>>   File "<string>", line 1
    #>>>     _ if 1else _
    #>>>           ^
    #>>> SyntaxError: invalid token

    ast.parse("_ if 1else _")
    #>>> Traceback (most recent call last):
    #>>>   File "", line 48, in <module>
    #>>>   File "/usr/lib/python3.4/ast.py", line 35, in parse
    #>>>     return compile(source, filename, mode, PyCF_ONLY_AST)
    #>>>   File "<unknown>", line 1
    #>>>     _ if 1else _
    #>>>           ^
    #>>> SyntaxError: invalid token

Further, some other forms work:

    1 if 0b1else 0
    #>>> 1

    1 if 1jelse 0
    #>>> 1

See

http://stackoverflow.com/questions/23998026/why-isnt-this-a-syntax-error-in-python

particularly,

http://stackoverflow.com/a/23998128/1763356

for details.
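To compare the spellings above on a given interpreter, a small helper along these lines may be useful. It is only a sketch: `compiles` is a hypothetical name, not part of any library, and the results for the no-space forms depend on the Python version in use, so no output is shown.

```python
import warnings


def compiles(source):
    """Return True if `source` compiles as an expression, False on SyntaxError.

    Hypothetical helper for illustration only.
    """
    with warnings.catch_warnings():
        # Some versions may only warn about these spellings; ignore warnings
        # so that the result reflects hard syntax errors alone.
        warnings.simplefilter("ignore")
        try:
            compile(source, "<sketch>", "eval")
        except SyntaxError:
            return False
    return True


candidates = [
    "_ if 1 else _",   # spaced form, always expected to compile
    "_ if 1else _",    # the form reported in this issue
    "1 if 0b1else 0",  # binary literal directly followed by else
    "1 if 1jelse 0",   # imaginary literal directly followed by else
]

for candidate in candidates:
    print("{!r:20} {}".format(candidate, compiles(candidate)))
```

Using `compile` rather than `eval` means the undefined name `_` is never looked up, so only tokenizing and parsing are exercised.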

loewis · 2014-06-03T05:57:39Z

For those who want to skip reading the entire SO question: "1else" tokenizes as "1e" "lse", i.e. "1e" is considered the beginning of a floating-point literal. By the description in the docs, that should not happen: "1e" is not a valid literal on its own, so no space should be needed after the 1 to tokenize it as an integer literal.
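The behaviour described here can be imitated with a toy number scanner. This is a sketch under assumptions, not CPython's tokenizer: `scan_number_greedy` is a hypothetical function that handles only decimal digits and an optional exponent, and it commits to the exponent as soon as it sees an `e`.

```python
DIGITS = "0123456789"


def scan_number_greedy(text, pos=0):
    """Toy scanner (hypothetical): return (token, end) for a number at `pos`.

    It commits to an exponent as soon as it sees 'e' or 'E', without
    checking whether a digit actually follows.
    """
    end = pos
    while end < len(text) and text[end] in DIGITS:
        end += 1
    if end < len(text) and text[end] in "eE":
        end += 1  # commit to the exponent marker
        if end < len(text) and text[end] in "+-":
            end += 1
        first_exponent_digit = end
        while end < len(text) and text[end] in DIGITS:
            end += 1
        if end == first_exponent_digit:
            # "1e" with no exponent digits: the scanner is already stuck
            raise SyntaxError("invalid token: {!r}".format(text[pos:end]))
    return text[pos:end], end


print(scan_number_greedy("1 else"))  # ('1', 1)
print(scan_number_greedy("1e5"))     # ('1e5', 3)
try:
    scan_number_greedy("1else")
except SyntaxError as exc:
    print("error:", exc)
```

In this sketch "1t" scans as the integer "1" followed by leftover text, while "1else" fails inside the scanner itself, which mirrors the "invalid syntax" versus "invalid token" difference.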

JoshuaLandau · 2014-06-03T17:04:03Z

Here's a minimal example of the difference:

    1e
    #>>> ... etc ...
    #>>> SyntaxError: invalid token

    1t
    #>>> ... etc ...
    #>>> SyntaxError: invalid syntax

python-dev · 2014-06-07T19:40:02Z

New changeset 4ad33d82193d by Benjamin Peterson in branch '3.4':
allow the keyword else immediately after (no space) an integer (closes bpo-21642)
http://hg.python.org/cpython/rev/4ad33d82193d

New changeset 29d34f4f8900 by Benjamin Peterson in branch '2.7':
allow the keyword else immediately after (no space) an integer (closes bpo-21642)
http://hg.python.org/cpython/rev/29d34f4f8900

New changeset d5998cca01a8 by Benjamin Peterson in branch 'default':
merge 3.4 (bpo-21642)
http://hg.python.org/cpython/rev/d5998cca01a8

zooba · 2015-04-15T15:24:10Z

FTR, I think this was a bad fix and we should have just changed the spec to require a space between numeric literals and identifiers.

Closing as by design would have been fine in my opinion as well, since the spec says spaces are required when it's ambiguous, and this case looks fairly ambiguous. There's also a bit of a slippery slope here where we now have to fix "0x1and 3" or be very explicit about why it is different.

I haven't even mentioned changing the parser in a dot release. That seems somewhat ridiculous.

Everyone else who writes a Python parser (all the IDEs and type checkers, other implementations, etc.) would prefer it if we didn't need our tokenisers to look ahead two characters.
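To make the look-ahead cost concrete, here is a toy scanner that accepts "1else" by peeking past the `e` before committing to an exponent. It is a sketch under assumptions, not the code from the changesets above: `scan_number_lookahead` is a hypothetical function that handles only decimal digits and an optional exponent.

```python
DIGITS = "0123456789"


def scan_number_lookahead(text, pos=0):
    """Toy scanner (hypothetical): return (token, end) for a number at `pos`.

    An 'e' or 'E' is consumed only if a digit follows it, optionally
    after a sign, so the scanner has to peek beyond the 'e' itself.
    """
    end = pos
    while end < len(text) and text[end] in DIGITS:
        end += 1
    if end < len(text) and text[end] in "eE":
        peek = end + 1
        if peek < len(text) and text[peek] in "+-":
            peek += 1
        if peek < len(text) and text[peek] in DIGITS:
            # A real exponent: consume the marker, sign and digits
            end = peek
            while end < len(text) and text[end] in DIGITS:
                end += 1
        # Otherwise leave 'e' for the next token (e.g. the keyword else)
    return text[pos:end], end


print(scan_number_lookahead("1else"))  # ('1', 1)
print(scan_number_lookahead("1e5"))    # ('1e5', 3)
print(scan_number_lookahead("1e+5x"))  # ('1e+5', 4)
```

The extra `peek` index is the price of the fix: a scanner that only ever inspects the next character cannot tell "1e5" from "1else" at the point where it sees the `e`.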

bitdancer · 2015-04-15T16:02:21Z

My impression is that it was fixed the way it was because it makes the internal tokenizer match what the tokenize module does. See also bpo-3353. As for changing it in a point release, it turns something that was an error into something that isn't, so it was unlikely to break existing working code. Going the other way in the tokenize module *would* have been a backward compatibility issue. If we wanted to change this, it would require a deprecation process, and it hardly seems worth it. I hear you about other tokenizers, though, and that is indeed unfortunate.

JoshuaLandau mannequin added the build label Jun 2, 2014

python-dev mannequin closed this as completed Jun 7, 2014

ezio-melotti transferred this issue from another repository Apr 10, 2022
