classification
Title: Missing trailing newline with comment raises SyntaxError
Type: behavior
Stage:
Components: Interpreter Core
Versions: Python 3.0, Python 3.1, Python 2.6, Python 2.5
process
Status: closed
Resolution: out of date
Dependencies:
Superseder:
Assigned To: georg.brandl
Nosy List: benjamin.peterson, ehuss, georg.brandl, inyeollee, ncoghlan, ptn, terry.reedy
Priority: normal
Keywords: easy

Created on 2005-04-16 01:55 by ehuss, last changed 2010-05-20 02:25 by benjamin.peterson. This issue is now closed.

Messages (10)
msg60730 - Author: Eric Huss (ehuss) Date: 2005-04-16 01:55
The following illustrates a problem with the parser 
handling the lack of trailing newlines:

>>> parser.suite('def foo():\n\tpass\n\n# comment')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "<string>", line 4
    # comment
           ^
SyntaxError: invalid syntax
>>> parser.suite('def foo():\n\tpass\n\n# comment\n')
<parser.st object at 0x847f0a0>

This is similar to bug 501622, however, this only seems 
to happen when you have an indented block, followed by 
a comment line that has no trailing newline.
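
The parser module used above was removed in Python 3.10, so the same two inputs can be checked on a current interpreter with ast.parse as a stand-in. This is only a sketch: the helper name `parses` is made up for illustration, and ast.parse is assumed to exercise the same tokenizer path as parser.suite did.

```python
import ast

# The same two inputs as in the report: without and with a trailing newline.
WITHOUT_NEWLINE = 'def foo():\n\tpass\n\n# comment'
WITH_NEWLINE = WITHOUT_NEWLINE + '\n'


def parses(source):
    """Return True if the source parses as a module, False on SyntaxError."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


for src in (WITHOUT_NEWLINE, WITH_NEWLINE):
    print(repr(src), '->', 'ok' if parses(src) else 'SyntaxError')
```

On an interpreter that still has the bug, the first input would be expected to report SyntaxError and the second to parse.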

I traced through tokenizer.c and whittled the issue down 
to tok_get().  In the statement where it processes the 
comment character and looks at the tabforms, 'c' ends up 
equal to EOF in the first case, whereas in the second case 
'c' equals '\n'.  When it equals EOF, the tokenizer is 
unable to do the cleanup necessary to emit the DEDENT 
token (it immediately bails out with ENDMARKER, which 
causes parsetok() to barf because the indentation level 
is still 1 inside tok_state).

Attached is a patch of a little hack I made that seems 
to fix the problem.  Although it seems to be a safe thing 
to do, it is definitely a hack.
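
The missing DEDENT described above can be looked for from Python with the standard tokenize module. This is a rough sketch rather than a view into tokenizer.c itself: tokenize is a separate implementation in many Python versions, and the helper name `token_names` is invented for the example.

```python
import io
import tokenize

SOURCE = 'def foo():\n\tpass\n\n# comment'   # no trailing newline


def token_names(source):
    """Return the token type names produced for the given source text."""
    readline = io.StringIO(source).readline
    return [tokenize.tok_name[tok.type]
            for tok in tokenize.generate_tokens(readline)]


names = token_names(SOURCE)
print(names)
# A well-formed stream closes every INDENT with a DEDENT before ENDMARKER.
print('INDENT:', names.count('INDENT'), 'DEDENT:', names.count('DEDENT'))
```

If the counts differ, the block opened by the function body was never closed, which is the condition the report says makes parsetok() fail.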
msg60731 - Author: Eric Huss (ehuss) Date: 2005-04-16 01:57
Well, wonderful sourceforge is barfing with the 
error "ArtifactFile: Could not open file for writing" when trying 
to upload my patch, so I'll just post it in the comment here.  
Very sorry. :(

--- tokenizer.c	3 Feb 2004 22:53:59 -0000	1.2
+++ tokenizer.c	16 Apr 2005 01:45:05 -0000
@@ -1139,6 +1139,9 @@
 		}
 		while (c != EOF && c != '\n')
 			c = tok_nextc(tok);
+		if (c == EOF) {
+			c = '\n';
+		}
 	}
 	
 	/* Check for EOF and errors now */
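
The effect of the three added lines can be illustrated with a toy Python model of the comment-skipping loop. It is not the C code: `skip_comment`, the `patched` flag and the use of None for EOF are all assumptions made for the illustration.

```python
EOF = None  # stand-in for C's EOF in this toy model


def skip_comment(chars, patched):
    """Consume the rest of a comment line and return the terminating character.

    chars is an iterator over the characters after the '#'. With
    patched=True, reaching end of input is reported as a newline,
    mirroring the three lines added by the diff.
    """
    c = next(chars, EOF)
    while c is not EOF and c != '\n':
        c = next(chars, EOF)
    if patched and c is EOF:
        c = '\n'
    return c


print(repr(skip_comment(iter(' comment'), patched=False)))    # None, i.e. EOF
print(repr(skip_comment(iter(' comment'), patched=True)))     # '\n'
print(repr(skip_comment(iter(' comment\n'), patched=False)))  # '\n'
```

With the patch, a comment that ends at EOF looks to the caller like a comment that ends in a newline, so the normal end-of-line handling gets a chance to run.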
msg60732 - Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2006-03-17 13:51
Confirmed on SVN HEAD using:

exec """
def foo():
    pass

#comment"""

(Blows up with a syntax error)
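
The snippet above uses the Python 2 exec statement. A Python 3 spelling of the same check might look like the following sketch, where the `namespace` dict is an addition so that the result can be inspected afterwards.

```python
# The string ends immediately after the comment, with no final newline.
SOURCE = """
def foo():
    pass

#comment"""

namespace = {}
exec(SOURCE, namespace)      # exec() is a function in Python 3
print(namespace['foo']())    # foo is defined and returns None
```

On an affected interpreter the exec call would be expected to raise SyntaxError instead of defining foo.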
msg60733 - Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2006-03-17 16:48
As noted by F.Lundh on pydev list,
http://docs.python.org/lib/built-in-funcs.html
says "When compiling multi-line statements, two caveats 
apply: [...] and the input must be terminated by at least 
one newline character" so it appears that doc == behavior.
Should this be closed? or both changed?
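
Callers that have to satisfy the documented caveat on an affected version could normalize the source before compiling it. This is a minimal sketch; `ensure_trailing_newline` and `compile_source` are hypothetical helper names, not part of the standard library.

```python
def ensure_trailing_newline(source):
    """Return source unchanged if it ends with a newline, else append one."""
    return source if source.endswith('\n') else source + '\n'


def compile_source(source, filename='<string>'):
    """Compile module-level source after normalizing its final newline."""
    return compile(ensure_trailing_newline(source), filename, 'exec')


code = compile_source('def foo():\n\tpass\n\n# comment')
print(code)
```

The helper leaves already-terminated input alone, so it is safe to apply more than once.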
msg65195 - Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2008-04-08 17:01
As of 2.5.1, a missing trailing newline no longer causes a SyntaxError,
making the second part of the caveat in the documentation unnecessary.

Changing to a documentation bug applicable to 2.5+.
msg65247 - Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2008-04-09 17:55
The exec example still presents me a syntax error in 2.5.2.
msg65595 - Author: Inyeol Lee (inyeollee) Date: 2008-04-17 22:11
A missing trailing newline still triggers an error as of 2.5.1:

>>> import parser
>>> parser.suite("pass\n ")
IndentationError: unexpected indent
>>> parser.suite("if True:\n pass\n ")
SyntaxError: invalid syntax
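
Since the parser module is gone from current Python releases, these two inputs can be retried with the built-in compile(). The sketch below only reports what the running interpreter does and assumes nothing about the outcome; `compile_error` is a name invented for the example.

```python
def compile_error(source):
    """Return the name of the exception raised by compile(), or None."""
    try:
        compile(source, '<string>', 'exec')
    except SyntaxError as exc:   # IndentationError is a subclass of SyntaxError
        return type(exc).__name__
    return None


# Both inputs end with a whitespace-only line that has no newline.
for src in ('pass\n ', 'if True:\n pass\n '):
    print(repr(src), '->', compile_error(src))
```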
msg65619 - Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2008-04-18 17:31
Yeah, it's actually still blowing up for me too. I have no idea what I
actually tested when I thought it was working in 2.6/3.0 - I must have
managed to sneak an extra carriage return into the test string. So I'm
reverting back to marking it as a non-easy interpreter core problem.
msg89051 - Author: Pablo Torres Navarrete (ptn) Date: 2009-06-07 19:31
Confirmed on versions 2.6.2, 3.0.1 and 3.1rc1.  On all three of them, I
tried this:

>>> import parser 
>>> test = 'def foo():\n\tpass\n\n# comment'
>>> parser.suite(test)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<string>", line 4
    # comment
           ^
SyntaxError: invalid syntax
>>>
msg106125 - Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2010-05-20 02:25
This has been fixed in 2.7 and 3.2.
History
Date                 User               Action  Args
2010-05-20 02:25:35  benjamin.peterson  set     status: open -> closed
                                                nosy: + benjamin.peterson
                                                messages: + msg106125
                                                resolution: out of date
2009-06-07 19:31:47  ptn                set     versions: + Python 3.1
2009-06-07 19:31:01  ptn                set     nosy: + ptn
                                                messages: + msg89051
2008-04-18 17:31:02  ncoghlan           set     messages: + msg65619
                                                components: + Interpreter Core, - Documentation
2008-04-17 22:11:33  inyeollee          set     nosy: + inyeollee
                                                messages: + msg65595
2008-04-09 17:55:04  georg.brandl       set     messages: + msg65247
2008-04-08 17:01:14  ncoghlan           set     versions: + Python 2.6, Python 2.5, Python 3.0
                                                nosy: + georg.brandl
                                                messages: + msg65195
                                                assignee: georg.brandl
                                                components: + Documentation, - Interpreter Core
                                                keywords: + easy
                                                type: behavior
2005-04-16 01:55:06  ehuss              create