classification
Title: Inconsistency with lineno and col_offset info when parsing elif
Type: behavior Stage: resolved
Components: Interpreter Core Versions: Python 3.9, Python 3.8, Python 3.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: lys.nikolaou, miss-islington, pablogsal
Priority: normal Keywords: patch

Created on 2019-12-12 20:15 by lys.nikolaou, last changed 2019-12-14 10:55 by pablogsal. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 17582 merged lys.nikolaou, 2019-12-12 20:21
PR 17583 closed miss-islington, 2019-12-12 21:40
PR 17584 merged pablogsal, 2019-12-12 21:51
PR 17585 closed lys.nikolaou, 2019-12-12 21:53
PR 17589 merged miss-islington, 2019-12-13 14:03
PR 17596 merged lys.nikolaou, 2019-12-14 05:22
PR 17600 merged miss-islington, 2019-12-14 10:31
PR 17601 merged pablogsal, 2019-12-14 10:31
Messages (8)
msg358304 - (view) Author: Lysandros Nikolaou (lys.nikolaou) * (Python committer) Date: 2019-12-12 20:15
While working on pegen, we came across an inconsistency on how line number and column offset info is stored for (el)if nodes. When parsing a very simple if-elif construct like

if a:
    pass
elif b:
    pass

the following parse tree gets generated:

Module(
    body=[
        If(
            test=Name(id="a", ctx=Load(), lineno=1, col_offset=3, end_lineno=1, end_col_offset=4),
            body=[Pass(lineno=2, col_offset=4, end_lineno=2, end_col_offset=8)],
            orelse=[
                If(
                    test=Name(
                        id="b", ctx=Load(), lineno=3, col_offset=5, end_lineno=3, end_col_offset=6
                    ),
                    body=[Pass(lineno=4, col_offset=4, end_lineno=4, end_col_offset=8)],
                    orelse=[],
                    lineno=3,
                    col_offset=5,
                    end_lineno=4,
                    end_col_offset=8,
                )
            ],
            lineno=1,
            col_offset=0,
            end_lineno=4,
            end_col_offset=8,
        )
    ],
    type_ignores=[],
)

There is the inconsistency that the column offset for the if statement is 0, thus the if statement starts with the keyword if, whereas the column offset for elif if 5, which means that the elif keyword is skipped.

As Guido suggests over at https://github.com/gvanrossum/pegen/issues/107#issuecomment-565135047 we could very easily change Python/ast.c so that the elif statement start with the elif keyword as well.

I have a PR ready!
msg358306 - (view) Author: miss-islington (miss-islington) Date: 2019-12-12 21:40
New changeset 025a602af7ee284d8db6955c26016f3f27d35536 by Miss Islington (bot) (Lysandros Nikolaou) in branch 'master':
bpo-39031: Include elif keyword when producing lineno/col-offset info for if_stmt (GH-17582)
https://github.com/python/cpython/commit/025a602af7ee284d8db6955c26016f3f27d35536
msg358308 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2019-12-12 22:12
Hmmm, there is some problem with the CI and the 3.7 branch. Seems like Travis CI is giving some problems again.... I will investigate or maybe ask Brett to make the check not required again (we still have Azure Pipelines, testing the same thing).
msg358309 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2019-12-12 22:12
The same thing happens with 3.8.....
msg358332 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2019-12-13 14:04
New changeset 0ed45d0cbfc7579dfc5527c19aa6e4bb696db2e0 by Pablo Galindo in branch '3.7':
[3.7] bpo-39031: Include elif keyword when producing lineno/col-offset info for if_stmt (GH-17582) (#17584)
https://github.com/python/cpython/commit/0ed45d0cbfc7579dfc5527c19aa6e4bb696db2e0
msg358338 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2019-12-13 16:22
New changeset 3b18b17efcee6f968cf85c543254b3611311e9f4 by Pablo Galindo (Miss Islington (bot)) in branch '3.8':
bpo-39031: Include elif keyword when producing lineno/col-offset info for if_stmt (GH-17582) (GH-17589)
https://github.com/python/cpython/commit/3b18b17efcee6f968cf85c543254b3611311e9f4
msg358376 - (view) Author: Lysandros Nikolaou (lys.nikolaou) * (Python committer) Date: 2019-12-14 04:53
There was a bug in my last PR, hopefully I will get a fix some time later today.

The bug is as follows: I only updated the asdl_seq_SET call for an elif without an else, if an else is included then the behavior is like before.

After my last PR it looks like this, parsing

2:if a:
3:    pass
4:elif b:
5:    pass

outputs the following AST:

Module(
    body=[
        If(
            test=Name(id="a", ctx=Load(), lineno=2, col_offset=3, end_lineno=2, end_col_offset=4),
            body=[Pass(lineno=3, col_offset=4, end_lineno=3, end_col_offset=8)],
            orelse=[
                If(
                    test=Name(
                        id="b", ctx=Load(), lineno=4, col_offset=5, end_lineno=4, end_col_offset=6
                    ),
                    body=[Pass(lineno=5, col_offset=4, end_lineno=5, end_col_offset=8)],
                    orelse=[],
                    lineno=4,
                    col_offset=0,
                    end_lineno=5,
                    end_col_offset=8,
                )
            ],
            lineno=2,
            col_offset=0,
            end_lineno=5,
            end_col_offset=8,
        )
    ],
    type_ignores=[],
)

On the other hand parsing

2:if a:
3:    pass
4:elif b:
5:    pass
6:else:
7:    pass

outputs

Module(
    body=[
        If(
            test=Name(
                id="a",
                ctx=Load(),
                lineno=2,
                col_offset=3,
                end_lineno=2,
                end_col_offset=4,
            ),
            body=[Pass(lineno=3, col_offset=4, end_lineno=3, end_col_offset=8)],
            orelse=[
                If(
                    test=Name(
                        id="b",
                        ctx=Load(),
                        lineno=4,
                        col_offset=5,
                        end_lineno=4,
                        end_col_offset=6,
                    ),
                    body=[Pass(lineno=5, col_offset=4, end_lineno=5, end_col_offset=8)],
                    orelse=[
                        Pass(lineno=7, col_offset=4, end_lineno=7, end_col_offset=8)
                    ],
                    lineno=4,
                    col_offset=5,
                    end_lineno=7,
                    end_col_offset=8,
                )
            ],
            lineno=2,
            col_offset=0,
            end_lineno=7,
            end_col_offset=8,
        )
    ],
    type_ignores=[],
)
msg358387 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2019-12-14 10:55
Thanks, Lysandros for the quick fix!
History
Date User Action Args
2019-12-14 10:55:43pablogsalsetstatus: open -> closed
resolution: fixed
messages: + msg358387
2019-12-14 10:36:22pablogsalsetstatus: closed -> open
resolution: fixed -> (no value)
2019-12-14 10:31:52pablogsalsetpull_requests: + pull_request17070
2019-12-14 10:31:22miss-islingtonsetpull_requests: + pull_request17069
2019-12-14 05:22:45lys.nikolaousetpull_requests: + pull_request17068
2019-12-14 04:53:45lys.nikolaousetmessages: + msg358376
2019-12-13 20:30:27gvanrossumsetstatus: open -> closed
resolution: fixed
stage: patch review -> resolved
2019-12-13 16:22:24pablogsalsetmessages: + msg358338
2019-12-13 14:04:18pablogsalsetmessages: + msg358332
2019-12-13 14:03:47miss-islingtonsetpull_requests: + pull_request17061
2019-12-12 22:12:41pablogsalsetmessages: + msg358309
2019-12-12 22:12:06pablogsalsetmessages: + msg358308
2019-12-12 21:53:23lys.nikolaousetpull_requests: + pull_request17059
2019-12-12 21:51:49pablogsalsetpull_requests: + pull_request17058
2019-12-12 21:40:37miss-islingtonsetpull_requests: + pull_request17057
2019-12-12 21:40:27miss-islingtonsetnosy: + miss-islington
messages: + msg358306
2019-12-12 21:14:46pablogsalsetcomponents: + Interpreter Core
2019-12-12 21:14:17pablogsalsetversions: + Python 3.7
2019-12-12 21:14:02pablogsalsetnosy: + pablogsal
2019-12-12 20:21:36lys.nikolaousetkeywords: + patch
stage: patch review
pull_requests: + pull_request17056
2019-12-12 20:16:24lys.nikolaousettitle: Inconsistent lineno and col_offset info when parsing elif -> Inconsistency with lineno and col_offset info when parsing elif
2019-12-12 20:15:07lys.nikolaoucreate