Issue 21642: "_ if 1else _" does not compile

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/65841

classification

Title:	"_ if 1else _" does not compile
Type:	compile error	Stage:	resolved
Components:		Versions:	Python 3.4, Python 3.5

process

Status:	closed	Resolution:	fixed
Dependencies:		Superseder:
Assigned To:		Nosy List:	Joshua.Landau, benjamin.peterson, loewis, python-dev, r.david.murray, steve.dower, wim.glenn
Priority:	normal	Keywords:

Created on 2014-06-02 18:24 by Joshua.Landau, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Messages (6)
msg219614 - (view)	Author: Joshua Landau (Joshua.Landau) *	Date: 2014-06-02 18:24
By the docs,

    Except at the beginning of a logical line or in string literals, the whitespace characters space, tab and formfeed can be used interchangeably to separate tokens. Whitespace is needed between two tokens only if their concatenation could otherwise be interpreted as a different token (e.g., ab is one token, but a b is two tokens).

"_ if 1else _" should compile equivalently to "_ if 1 else _".

The tokenize module does this correctly:

    import io
    import tokenize

    def print_tokens(string):
        tokens = tokenize.tokenize(io.BytesIO(string.encode("utf8")).readline)

        for token in tokens:
            print("{:12}{}".format(tokenize.tok_name[token.type], token.string))

    print_tokens("_ if 1else _")
    #>>> ENCODING    utf-8
    #>>> NAME        _
    #>>> NAME        if
    #>>> NUMBER      1
    #>>> NAME        else
    #>>> NAME        _
    #>>> ENDMARKER

but it fails when compiled with, say, "compile", "eval" or "ast.parse".

    import ast

    compile("_ if 1else _", "", "eval")
    #>>> Traceback (most recent call last):
    #>>> File "", line 32, in <module>
    #>>> File "<string>", line 1
    #>>> _ if 1else _
    #>>> ^
    #>>> SyntaxError: invalid token

    eval("_ if 1else _")
    #>>> Traceback (most recent call last):
    #>>> File "", line 40, in <module>
    #>>> File "<string>", line 1
    #>>> _ if 1else _
    #>>> ^
    #>>> SyntaxError: invalid token

    ast.parse("_ if 1else _")
    #>>> Traceback (most recent call last):
    #>>> File "", line 48, in <module>
    #>>> File "/usr/lib/python3.4/ast.py", line 35, in parse
    #>>> return compile(source, filename, mode, PyCF_ONLY_AST)
    #>>> File "<unknown>", line 1
    #>>> _ if 1else _
    #>>> ^
    #>>> SyntaxError: invalid token

Further, some other forms work:

    1 if 0b1else 0
    #>>> 1

    1 if 1jelse 0
    #>>> 1

See

    http://stackoverflow.com/questions/23998026/why-isnt-this-a-syntax-error-in-python

particularly,

    http://stackoverflow.com/a/23998128/1763356

for details.
msg219662 - (view)	Author: Martin v. Löwis (loewis) *	Date: 2014-06-03 05:57
For those who want to skip reading the entire SO question: "1else" tokenizes as "1e" "lse", i.e. "1e" is considered the beginning of a floating point literal. By the description in the docs, that should not happen: "1e" is not a valid literal on its own, so no space should be needed after the 1 to tokenize it as an integer literal.
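The difference described above can be sketched with two toy number scanners. This is only an illustration, not CPython's tokenizer: the function names scan_number_greedy and scan_number_lookahead are hypothetical, and both handle only decimal digits with an optional exponent. The first commits to an exponent as soon as it sees "e"; the second consumes the "e" only if a digit (optionally preceded by a sign) follows.

```python
def scan_number_greedy(text, pos=0):
    """Toy scanner: treat 'e'/'E' after digits as the start of an exponent, unconditionally."""
    end = pos
    while end < len(text) and text[end].isdigit():
        end += 1
    if end < len(text) and text[end] in "eE":
        end += 1  # committed to an exponent, no way back
        if end < len(text) and text[end] in "+-":
            end += 1
        first_exponent_digit = end
        while end < len(text) and text[end].isdigit():
            end += 1
        if end == first_exponent_digit:
            # "1e" with no exponent digits is not a valid literal
            raise SyntaxError("invalid token")
    return text[pos:end], end


def scan_number_lookahead(text, pos=0):
    """Toy scanner: consume 'e'/'E' only when a real exponent follows."""
    end = pos
    while end < len(text) and text[end].isdigit():
        end += 1
    if end < len(text) and text[end] in "eE":
        look = end + 1
        if look < len(text) and text[look] in "+-":
            look += 1  # second character of lookahead past the 'e'
        if look < len(text) and text[look].isdigit():
            end = look
            while end < len(text) and text[end].isdigit():
                end += 1
        # otherwise leave 'e' unread so it can start the next token
    return text[pos:end], end


print(scan_number_lookahead("1else"))  # the integer "1", leaving "else" for the next token
print(scan_number_lookahead("1e5"))    # a full float literal
```

With the lookahead version, the remaining "else" is left in the input and can be scanned as a name, which is the reading the quoted documentation implies.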
msg219707 - (view)	Author: Joshua Landau (Joshua.Landau) *	Date: 2014-06-03 17:04
Here's a minimal example of the difference:

    1e
    #>>> ... etc ...
    #>>> SyntaxError: invalid token

    1t
    #>>> ... etc ...
    #>>> SyntaxError: invalid syntax
msg219965 - (view)	Author: Roundup Robot (python-dev)	Date: 2014-06-07 19:40
New changeset 4ad33d82193d by Benjamin Peterson in branch '3.4':
allow the keyword else immediately after (no space) an integer (closes #21642)
http://hg.python.org/cpython/rev/4ad33d82193d

New changeset 29d34f4f8900 by Benjamin Peterson in branch '2.7':
allow the keyword else immediately after (no space) an integer (closes #21642)
http://hg.python.org/cpython/rev/29d34f4f8900

New changeset d5998cca01a8 by Benjamin Peterson in branch 'default':
merge 3.4 (#21642)
http://hg.python.org/cpython/rev/d5998cca01a8
msg241114 - (view)	Author: Steve Dower (steve.dower) *	Date: 2015-04-15 15:24
FTR, I think this was a bad fix and we should have just changed the spec to require a space between numeric literals and identifiers. Closing as by design would have been fine in my opinion as well, since the spec says spaces are required when it's ambiguous, and this case looks fairly ambiguous. There's also a bit of a slippery slope here where we now have to fix "0x1and 3" or be very explicit about why it is different. I haven't even mentioned changing the parser in a dot release. That seems somewhat ridiculous. Everyone else who writes a Python parser (all the IDEs and type checkers, other implementations, etc.) would prefer it if we didn't need our tokenisers to look ahead two characters.
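One way to see why "0x1and 3" is a different kind of case: under a longest-match rule, "a" is itself a hex digit, so the literal extends to "0x1a" and what remains is "nd", not "and". The sketch below uses a simplified hex-literal pattern (an assumption for illustration; it ignores underscores and is not the grammar CPython uses), and the helper name longest_hex_prefix is hypothetical.

```python
import re

# Simplified hex literal: prefix plus one or more hex digits (assumption: no underscores).
HEX_LITERAL = re.compile(r"0[xX][0-9a-fA-F]+")


def longest_hex_prefix(text):
    """Return the longest hex literal at the start of text, or None if there is none."""
    match = HEX_LITERAL.match(text)
    return match.group() if match else None


print(longest_hex_prefix("0x1and 3"))  # prints 0x1a
```

Here the literal "0x1a" is valid on its own, unlike "1e", so the argument used for "1else" does not carry over directly.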
msg241121 - (view)	Author: R. David Murray (r.david.murray) *	Date: 2015-04-15 16:02
My impression is that it was fixed the way it was because it makes the internal tokenizer match what the tokenize module does. See also issue 3353. As for changing it in a point release, it turns something that was an error into something that isn't, so it was unlikely to break existing working code. Going the other way in the tokenize module would have been a backward compatibility issue. If we wanted to change this, it would require a deprecation process, and it hardly seems worth it. I hear you about other tokenizers, though, and that is indeed unfortunate.

History
Date	User	Action	Args
2022-04-11 14:58:04	admin	set	github: 65841
2015-04-15 16:02:21	r.david.murray	set	nosy: + r.david.murray messages: + msg241121
2015-04-15 15:24:10	steve.dower	set	nosy: + steve.dower messages: + msg241114 versions: + Python 3.5
2014-06-07 19:40:01	python-dev	set	status: open -> closed nosy: + python-dev messages: + msg219965 resolution: fixed stage: resolved
2014-06-03 17:04:03	Joshua.Landau	set	messages: + msg219707
2014-06-03 05:57:39	loewis	set	nosy: + loewis messages: + msg219662
2014-06-02 20:58:40	wim.glenn	set	nosy: + wim.glenn
2014-06-02 20:30:53	vstinner	set	nosy: + benjamin.peterson
2014-06-02 18:24:00	Joshua.Landau	create