
classification
Title: tokenization assuming ASCII whitespace; missing multiline case
Type:
Stage:
Components: Interpreter Core
Versions: Python 3.3
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: Jim.Jewett, benjamin.peterson, python-dev
Priority: normal
Keywords:

Created on 2012-01-19 20:52 by Jim.Jewett, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Messages (5)
msg151652 - Author: Jim Jewett (Jim.Jewett) * (Python triager) Date: 2012-01-19 20:52
Parser/parsetok.c was recently changed (e.g. http://hg.python.org/cpython/rev/2bd7f40108b4 ) to raise an error if multiple statements were found in a single-statement compile call.  It sensibly ignores trailing whitespace and comments.  Unfortunately,

(1)  It looks only at (c == ' ' || c == '\t' || c == '\n' || c == '\014') as opposed to using Py_UNICODE_ISSPACE(ch)
(2)  It assumes that a "#" means the rest of the line is OK, instead of looking for additional linebreaks.

Not sure whether to mark this a bug or an enhancement, since it is already strictly better than the 3.2 behavior of never warning about extra text.
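
For illustration, the trailing-text check described above can be sketched in Python. This is not the C code from Parser/parsetok.c; the names `ASCII_BLANKS` and `has_extra_statement` are hypothetical, and the sketch assumes the check receives only the text that follows the first statement. It keeps looping after a comment ends, which is what point (2) asks for, and it deliberately treats only the four ASCII characters from point (1) as blanks.

```python
# Hypothetical sketch; not the actual implementation in Parser/parsetok.c.
# '\014' in the C source is form feed, written '\x0c' here.
ASCII_BLANKS = " \t\n\x0c"


def has_extra_statement(rest: str) -> bool:
    """Return True if `rest` holds anything besides blanks and comments."""
    i, n = 0, len(rest)
    while True:
        # Skip ASCII blanks, including newlines.
        while i < n and rest[i] in ASCII_BLANKS:
            i += 1
        if i == n:
            return False  # only blanks and comments remained
        if rest[i] != "#":
            return True  # real text follows the first statement
        # Skip the comment up to the newline, then keep checking later lines.
        while i < n and rest[i] != "\n":
            i += 1
```

With this sketch, a non-ASCII space such as U+00A0 counts as extra text, since only the ASCII set is skipped.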
msg151658 - Author: Roundup Robot (python-dev) (Python triager) Date: 2012-01-19 22:48
New changeset 00c4efbf57c3 by Benjamin Peterson in branch 'default':
check after comments, too (#13832)
http://hg.python.org/cpython/rev/00c4efbf57c3
msg151659 - Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2012-01-19 22:49
The tokenizer doesn't consider Unicode spaces, either.
msg151693 - Author: Jim Jewett (Jim.Jewett) * (Python triager) Date: 2012-01-20 15:58
Ignoring non-ASCII whitespace is defensible, and I agree that it should match the rest of the parser.  Ignoring second lines is still a problem, and that was supposedly part of what got fixed.  Test case:

s="""x=5  # comment
x=6
"""
compile(s, "<testbadsingle>", 'single')
msg151694 - Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2012-01-20 16:00
$ ./python 
Python 3.3.0a0 (default:50a4af2ca654+, Jan 20 2012, 10:59:48) 
[GCC 4.5.3] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> s="""x=5  # comment
... x=6
... """
>>> compile(s, "<blah>", "single")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<blah>", line 1
    x=5  # comment
                 ^
SyntaxError: multiple statements found while compiling a single statement
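
The behavior shown in this session can also be checked without the interactive prompt. The following is a minimal sketch using only the built-in `compile()`; the helper name `compiles_as_single` and the filename `"<example>"` are made up for illustration. On an interpreter that includes the fix, a trailing comment should be accepted and a second statement on a later line should be rejected.

```python
def compiles_as_single(source: str) -> bool:
    """Return True if `source` compiles in 'single' mode without a SyntaxError."""
    try:
        compile(source, "<example>", "single")
    except SyntaxError:
        return False
    return True


# One statement followed by a trailing comment.
only_comment = "x=5  # comment\n"
# The test case from this issue: a second statement after the comment line.
second_statement = "x=5  # comment\nx=6\n"

print(compiles_as_single(only_comment))
print(compiles_as_single(second_statement))
```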
History
Date                 User               Action  Args
2022-04-11 14:57:25  admin              set     github: 58040
2012-01-20 16:00:40  benjamin.peterson  set     messages: + msg151694
2012-01-20 15:58:16  Jim.Jewett         set     messages: + msg151693
2012-01-19 22:49:12  benjamin.peterson  set     status: open -> closed; resolution: fixed; messages: + msg151659
2012-01-19 22:48:32  python-dev         set     nosy: + python-dev; messages: + msg151658
2012-01-19 21:43:25  pitrou             set     nosy: + benjamin.peterson
2012-01-19 20:52:36  Jim.Jewett         create