msg254951 - (view) |
Author: Michael Layzell (mystor) * |
Date: 2015-11-20 05:50 |
It appears that syntax errors generated by checking the AST will get confused by indentation and place the caret incorrectly.
With no indentation:
===
1 + 1 = 2
===
File "/Users/mlayzell/test.py", line 1
1 + 1 = 2
^
SyntaxError: can't assign to operator
===
With 4 spaces of indentation:
===
if True:
1 + 1 = 2
===
File "/Users/mlayzell/test.py", line 2
1 + 1 = 2
^
SyntaxError: can't assign to operator
===
As you can see, the caret appears to be placed randomly in the middle of the second statement, apparently offset (probably by the 4 space indentation).
The above examples were run with Python3.5 on Mac OS X 10.10.4, though my somewhat-recent trunk clone exhibits the same problems.
|
msg254979 - (view) |
Author: R. David Murray (r.david.murray) * |
Date: 2015-11-20 14:40 |
It appears as though the line is always printed without the leading whitespace, but the caret takes the whitespace into account (try indenting the line even further). So the problem is that the caret and the printing of the source line in the error message are not consistent about how whitespace is handled.
|
msg254981 - (view) |
Author: R. David Murray (r.david.murray) * |
Date: 2015-11-20 14:44 |
By the way, python3.4 also shows the behavior. Python2 does not, because python2 does not print the caret. The bug presumably crept in when the caret was added. I'm marking this as easy...presumably the hardest part of fixing it is figuring out exactly how the message is constructed :)
|
msg254992 - (view) |
Author: Michael Layzell (mystor) * |
Date: 2015-11-20 16:20 |
This should fix the problem. It appears as though the indentation was being stripped from the program text differently depending on how the error was produced. In the case of a syntax error from the parser, the indentation was maintained until it was printed (in print_error_text, correctly handling the offset value). In contrast, errors from the AST were created with PyErr_ProgramText*, which would also strip the indentation (without mutating any offset value).
This patch causes the second code path to not strip whitespace. This shouldn't cause I problem (I think), as it seems to only be used to instantiate SyntaxErrors.
|
msg255007 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2015-11-20 18:26 |
Would be nice to have a test.
|
msg255370 - (view) |
Author: Michael Layzell (mystor) * |
Date: 2015-11-25 17:44 |
Sorry for the delay, I finally got around to adding a test.
I'm mildly concerned about the portability of the check, but it seems to work locally. If people have suggestions about how to make the check more portable, let me know!
|
msg255376 - (view) |
Author: R. David Murray (r.david.murray) * |
Date: 2015-11-25 18:19 |
Two options on portability, I think: either use splitlines() on the decoded result, or add a 'universal_newlines' keyword argument to assert_python_failure and friends that would get passed to Popen. I'd favor the latter, as other users of script_helpers may find it useful. On the other hand, that may involve a whole separate discussion, so you could use splitlines here and we could open a new issue for a script_helpers enhancement :)
Also, rather than a direct re, I'd use assertRegex. That will give a more useful error message if the test fails.
|
msg255397 - (view) |
Author: Martin Panter (martin.panter) * |
Date: 2015-11-26 01:18 |
Another way to get universal newlines I have used in other cases is to use TextIOWrapper. It might be easier if you really need that multi-line RE search:
text = TextIOWrapper(BytesIO(stderr), "ascii").read()
|
msg255418 - (view) |
Author: Michael Layzell (mystor) * |
Date: 2015-11-26 14:45 |
Martin's solution seemed like the least work, so that's what I ended up using. I also switched over to assertRegex, and I agree that it produces better error messages.
I added a comment to explain the use of the ugly multiline regex, and removed some of its unnecessary parts.
|
msg255435 - (view) |
Author: Martin Panter (martin.panter) * |
Date: 2015-11-26 21:32 |
Michael, see the review comments if you already haven’t. There is an unused variable “p” in the last two patches.
|
msg256178 - (view) |
Author: Berker Peksag (berker.peksag) * |
Date: 2015-12-10 16:33 |
With the latest patch applied, the output is
File "z.py", line 2
1 + 2 = 3
^
SyntaxError: can't assign to operator
I think the caret is still in wrong place.
I would expect the following output:
File "z.py", line 2
1 + 2 = 3
^
SyntaxError: can't assign to operator
|
msg257789 - (view) |
Author: Michael Layzell (mystor) * |
Date: 2016-01-09 00:17 |
Sorry for missing the review comments earlier, I didn't notice them (I've not used this bug tracker before - sorry).
I've attached an updated patch which should not have the variable problem, and also takes some of the other review comments into account.
Berker: I agree that the current positioning of the caret is undesirable, and that ideally we would report the error at the site of the operator rather than the first character of the expression. I think that changing this behavior is a different bug though.
P.S: Sorry for disappearing for a month, I don't seem to get emails when there are changes made to the bug, and thus forgot to check up on the bug...
|
msg257864 - (view) |
Author: Martin Panter (martin.panter) * |
Date: 2016-01-09 21:39 |
+script = textwrap.dedent("1 + 1 = 2\n")
You don’t need that first textwrap.dedent() call. It is only useful if the string has extra indentation at the start, as in the second multiline one.
|
msg257870 - (view) |
Author: Michael Layzell (mystor) * |
Date: 2016-01-09 21:47 |
Oops. Should be fixed now.
|
msg279978 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2016-11-03 09:17 |
LGTM.
|
msg280205 - (view) |
Author: Michael Layzell (mystor) * |
Date: 2016-11-07 16:03 |
Is there something I should be doing to push this patch forward? I am not familiar with the contribution process for cpython unfortunately.
|
msg280206 - (view) |
Author: Berker Peksag (berker.peksag) * |
Date: 2016-11-07 16:22 |
FWIW, I don't think the patch fixes the actual problem. With the patch applied, the caret still placed at the wrong place. I'd prefer avoid pushing a patch that fixes a bug by introducing another bug.
|
msg280207 - (view) |
Author: Michael Layzell (mystor) * |
Date: 2016-11-07 16:26 |
This patch fixes a problem, which is that the caret was not placed where the source location information for the node is, but instead in a random position based on the indentation level.
The problem that the caret is not placed under the `=` sign, but instead at the front of the expression, feels like a different bug related to how source location information is computed throughout cpython for infix operators. I don't think it should be fixed in this bug, but rather in a follow-up.
Specifically, this fix (to make the printing actually respect the source location information) will also be required in order to fix the placement of the caret to under the = sign.
|
msg280215 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2016-11-07 17:39 |
The patch fixes one bug, and I'm going to push it.
There is other bug here. If add "1;" before the original example (so the line becomes "1;1 + 1 = 2"), the caret will be under ";". In some SyntaxErrors the caret points one position left before the start of incorrect expression. But this bug deserves separate issue.
The fact that in "1 + 1 = 2" the caret points to the first "1" instead of "=" can be considered as yet one different bug (but this is questionable).
|
msg280220 - (view) |
Author: Berker Peksag (berker.peksag) * |
Date: 2016-11-07 17:52 |
With patch applied:
File "x.py", line 2
1 + 1 = 2
^
SyntaxError: can't assign to operator
Without patch:
File "x.py", line 2
1 + 1 = 2
^
SyntaxError: can't assign to operator
The caret is located at the wrong place in both examples (which is the actual bug that needs to be fixed here.) This isn't a high priority bug and I think people can live with the current behavior until this is properly fixed. Pushing a premature patch will only introduce more code churn.
|
msg280223 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2016-11-07 18:13 |
This may be just bad example that it looks contradictorily to you. Is the example "None = 0" looks good to you? Without patch the caret is located at the middle of None (depending on the size of the indentation of this line). With patch applied it is located at the start of None independently of the indentation.
The original bug is that the location of the caret depends on the indentation. The patch fixes this bug (and only this bug).
|
msg282845 - (view) |
Author: Martin Panter (martin.panter) * |
Date: 2016-12-10 10:35 |
(Long story short: need to strip form feeds in print_error_text(), but I agree that this otherwise does fix a bug.)
There is one minor regression that this patch causes IMO. Given the following file, where <FF> represents a form feed character ('\014')
if True:
<FF> 1 + 1 = 2
the error report now has a blank line (due to outputting the FF character) and extra indentation that wasn’t stripped:
File "indent.py", line 2
1 + 1 = 2
^
SyntaxError: can't assign to operator
I think the fix for this is to strip form feed characters in print_error_text(), in Python/pythonrun.c.
Apart from that, I agree with Serhiy and Michael that it is okay to push this change. The bug that we are fixing is that SyntaxError.offset does not correspond with SyntaxError.text, due to indentation. Before the patch, the offset refers to the line without indentation removed, but the text has indentation removed. Here is a really bad case, where there is more indentation than code, print_error_text() tries to find the next line, and ends up printing a blank line:
>>> code = "if True:\n" + " "*16 + "1 + 1 = 2\n"
>>> with open("indent.py", "wt") as f: f.write(code)
...
35
>>> try: compile(code, "indent.py", "exec")
... except SyntaxError as err:
... print("offset={err.offset} text={err.text!r}".format(err=err))
... raise
...
offset=16 text='1 + 1 = 2\n'
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "indent.py", line 2
^
SyntaxError: can't assign to operator
In Michael’s original report, I was curious why the caret is only shifted three spaces, despite there being four spaces of indentation. This is because offset points to the start of the line, but the caret points to the character _before_ offset. So offset=0 and offset=1 are both represented by pointing at the first character on the line. This is related to Serhiy’s bug with inserting “1;”. In some cases (e.g. the line “1 +”), the offset is the string index _after_ the error. But in the case of “1;1 + 1 = 2”, offset is the index where the error (statement) begins.
Both pieces of indentation stripping code were added in revision 27f04d714ecb (year 2001). It is strange that only one stripped form feeds though. The column offset was only added later (revision 861c35cef7aa). So Python 2 will have some of its SyntaxError.text lines stripped of indentation, but it does not matter so much because SyntaxError.offset is not set in those cases.
|
msg282890 - (view) |
Author: Martin Panter (martin.panter) * |
Date: 2016-12-11 00:29 |
Patch v6 strips the form feed, and adds an extra test.
|
msg282905 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2016-12-11 05:48 |
LGTM.
|
msg283592 - (view) |
Author: Roundup Robot (python-dev) |
Date: 2016-12-19 06:47 |
New changeset 35b50c26f780 by Martin Panter in branch '3.5':
Issue #25677: Correct syntax error caret for indented blocks.
https://hg.python.org/cpython/rev/35b50c26f780
New changeset d4effdd699de by Martin Panter in branch '3.6':
Issue #25677: Merge SyntaxError caret positioning from 3.5
https://hg.python.org/cpython/rev/d4effdd699de
New changeset 2342726ea0c0 by Martin Panter in branch 'default':
Issue #25677: Merge SyntaxError caret positioning from 3.6
https://hg.python.org/cpython/rev/2342726ea0c0
|
|
Date |
User |
Action |
Args |
2022-04-11 14:58:24 | admin | set | github: 69863 |
2017-03-31 16:36:35 | dstufft | set | pull_requests:
+ pull_request1080 |
2016-12-19 20:29:54 | martin.panter | set | status: open -> closed resolution: fixed stage: commit review -> resolved |
2016-12-19 06:47:07 | python-dev | set | nosy:
+ python-dev messages:
+ msg283592
|
2016-12-11 05:48:02 | serhiy.storchaka | set | assignee: serhiy.storchaka -> martin.panter messages:
+ msg282905 |
2016-12-11 00:29:35 | martin.panter | set | files:
+ cpython25677.v6.patch
messages:
+ msg282890 |
2016-12-10 10:36:00 | martin.panter | set | messages:
+ msg282845 |
2016-11-07 18:13:07 | serhiy.storchaka | set | messages:
+ msg280223 |
2016-11-07 17:52:51 | berker.peksag | set | messages:
+ msg280220 |
2016-11-07 17:43:40 | vstinner | set | nosy:
- vstinner
|
2016-11-07 17:39:35 | serhiy.storchaka | set | assignee: serhiy.storchaka messages:
+ msg280215 |
2016-11-07 16:26:43 | mystor | set | messages:
+ msg280207 |
2016-11-07 16:22:48 | berker.peksag | set | messages:
+ msg280206 |
2016-11-07 16:03:40 | mystor | set | messages:
+ msg280205 |
2016-11-03 09:17:53 | serhiy.storchaka | set | stage: patch review -> commit review messages:
+ msg279978 versions:
+ Python 3.7 |
2016-11-01 21:56:58 | yan12125 | set | nosy:
+ yan12125
|
2016-11-01 21:51:57 | serhiy.storchaka | link | issue28582 superseder |
2016-01-09 21:47:02 | mystor | set | files:
+ cpython25677.patch
messages:
+ msg257870 |
2016-01-09 21:39:21 | martin.panter | set | messages:
+ msg257864 |
2016-01-09 20:18:41 | r.david.murray | set | nosy:
- r.david.murray
|
2016-01-09 00:17:09 | mystor | set | files:
+ cpython25677.patch
messages:
+ msg257789 |
2015-12-10 16:33:43 | berker.peksag | set | nosy:
+ berker.peksag messages:
+ msg256178
|
2015-11-26 21:33:31 | martin.panter | set | stage: test needed -> patch review |
2015-11-26 21:32:47 | martin.panter | set | messages:
+ msg255435 |
2015-11-26 14:45:43 | mystor | set | files:
+ cpython25677.patch
messages:
+ msg255418 |
2015-11-26 01:18:25 | martin.panter | set | nosy:
+ martin.panter messages:
+ msg255397
|
2015-11-25 18:19:22 | r.david.murray | set | messages:
+ msg255376 |
2015-11-25 17:44:49 | mystor | set | files:
+ cpython25677.patch
messages:
+ msg255370 |
2015-11-20 18:26:46 | serhiy.storchaka | set | messages:
+ msg255007 stage: patch review -> test needed |
2015-11-20 17:05:08 | r.david.murray | set | stage: needs patch -> patch review |
2015-11-20 17:04:24 | r.david.murray | set | nosy:
+ vstinner, serhiy.storchaka
|
2015-11-20 16:20:39 | mystor | set | files:
+ cpython25677.patch keywords:
+ patch messages:
+ msg254992
|
2015-11-20 14:44:12 | r.david.murray | set | keywords:
+ easy
messages:
+ msg254981 stage: needs patch |
2015-11-20 14:40:32 | r.david.murray | set | nosy:
+ r.david.murray messages:
+ msg254979
|
2015-11-20 05:50:56 | mystor | create | |