classification
Title: Allow connecting AST nodes with corresponding source ranges
Type: enhancement
Stage: resolved
Components: Interpreter Core
Versions: Python 3.5
process
Status: closed
Resolution:
Dependencies:
Superseder: Add endline and endcolumn to every AST node (issue 33416)
Assigned To:
Nosy List: Aivar.Annamaa, Edward.K..Ream, edreamleo, levkivskyi, r.david.murray
Priority: normal
Keywords:

Created on 2014-10-12 07:35 by Aivar.Annamaa, last changed 2019-01-14 12:19 by edreamleo. This issue is now closed.

Messages (7)
msg229124 - (view) Author: Aivar Annamaa (Aivar.Annamaa) * Date: 2014-10-12 07:35
Currently the lineno and col_offset attributes of AST nodes have confusing roles. According to the documentation, they indicate the starting position of the node's source text, but according to recent developments (#16795) they seem to indicate the position most suitable for error messages related to that node (a rather narrow goal, IMO).

Another problem is that AST nodes currently don't contain any information about the end position of the corresponding source text, so it is very difficult to relate nodes to the source. One might want to do this, for example, in advanced graphical debuggers (https://bitbucket.org/plas/thonny).
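
For illustration, here is a minimal sketch of the start-only position information described above. The sample source and the helper name start_positions are assumptions made for this example, not part of the original report. It collects lineno and col_offset for every node that carries them, which is all a tool had to work with at the time.

```python
import ast

# Hypothetical sample source, chosen only for this illustration.
source = "x = foo(1)\ny = x + 2\n"
tree = ast.parse(source)

def start_positions(tree):
    """Return (node type, lineno, col_offset) for every positioned node."""
    found = []
    for node in ast.walk(tree):
        # Nodes such as Module, Load or Add carry no position attributes.
        if hasattr(node, "lineno") and hasattr(node, "col_offset"):
            found.append((type(node).__name__, node.lineno, node.col_offset))
    # Sort by line, then column, so the listing follows the source order.
    return sorted(found, key=lambda item: (item[1], item[2]))

positions = start_positions(tree)
for name, line, col in positions:
    print(name, line, col)
```

Each entry says where a node begins; on its own, the listing gives no indication of where the node's text ends.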

I propose adding new attributes to AST nodes which indicate the corresponding source range. If you want to keep nodes lightweight by default, then you could also introduce a flag in ast.parse for getting these attributes.

The range could be given either in token indices or in character positions (or both). This probably needs more discussion. (I would vote against pointers to UTF-8 bytes, as is the case with col_offset currently.)
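
As a rough sketch of the byte-versus-character point above: col_offset counts UTF-8 bytes, so it can differ from the character index once a line contains non-ASCII text. The helper char_offset and the sample line are assumptions made for this example only.

```python
import ast

def char_offset(source_line, byte_offset):
    """Convert a UTF-8 byte offset within one line to a character index."""
    return len(source_line.encode("utf-8")[:byte_offset].decode("utf-8"))

# "\u00e9" is the letter e with an acute accent; it takes two bytes in UTF-8.
source = "x = '\u00e9' + y"
binop = ast.parse(source).body[0].value
name_y = binop.right          # the Name node for `y`

byte_col = name_y.col_offset  # offset in UTF-8 bytes, as reported by ast
char_col = char_offset(source, byte_col)

print(byte_col, char_col, source[char_col])
```

Here the byte offset is one larger than the character index, because the accented letter before `y` occupies two bytes.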
msg229127 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2014-10-12 07:59
If it needs some discussion, perhaps it should be raised on python-ideas first.  My reading of #16795 (granted I only skimmed it) was that improving the error messages was a *side effect* of making the position information more accurate (in the context of static analysis tools).

Also, unless you intend to work on a patch this isn't likely to go anywhere :)
msg229138 - (view) Author: Edward K. Ream (Edward.K..Ream) Date: 2014-10-12 10:50
I urge the Python development team to fix this and the related bugs given in the postscript. The lack of an easy way of associating AST nodes with text ranges in the original sources is arguably the biggest hole in the Python API.

These bugs have immediate, severe, practical consequences for any tool that attempts to regularize (PEP 8) or beautify Python code.

Consider the code for PythonTidy:
http://lacusveris.com/PythonTidy/PythonTidy-1.23.python

Every version has had bugs in this area arising from difficult workarounds to the hole in the API.  The entire Comments class is a horror directly related to these issues.

Consider Aivar's workaround to these bugs:
https://bitbucket.org/plas/thonny/src/8cdaa41aca7a5cc0b31618b6f1631d360c488196/src/ast_utils.py?at=default
See the docstring for def fix_ast_problems.  This is an absurdly difficult solution to what should be a trivial problem.

It's impossible to build reliable software using such heroic hacks.  The additional bugs listed below further complicate a nightmarish task.

In short, these bugs are *not* minor little nits.  They are preventing the development of reliable source-code tools.

Edward K. Ream

P.S. Here are the related bugs:

http://bugs.python.org/issue10769
Allow connecting AST nodes with corresponding source ranges

http://bugs.python.org/issue21295
Python 3.4 gives wrong col_offset for Call nodes returned from ast.parse

http://bugs.python.org/issue18374
ast.parse gives wrong position (col_offset) for some BinOp-s

http://bugs.python.org/issue16806
col_offset is -1 and lineno is wrong for multiline string expressions

EKR
msg229152 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2014-10-12 14:37
The "Python development team" is a group of volunteers.  What you need to do to move this forward is to work on patches (or on polishing patches, in the case of issue 16806) for open issues to which there has been no objection, and to build a consensus for fixing the issues to which there has been objection.

Based on the issue 10769 discussion, it sounds like it would be worthwhile to go to python-ideas and advocate for the position that the ast parser should be enhanced for the kind of use cases you are talking about.  It is possible a PEP would be required, but it is also fairly likely that one will not be.

Personally I'm in favor of supporting such use cases, but I'm not involved in maintaining any of the relevant code nor do I have an interest in getting involved at this time.
msg229164 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2014-10-12 15:14
OK, reading the follow-on links just added to issue 10769, it seems clear you don't need to advocate for the idea of supporting external tools; there seems to be consensus on that.  So now it is down to specific proposals and patches.  And reviews of patches, which are equally important :)
msg333609 - (view) Author: Ivan Levkivskyi (levkivskyi) * (Python committer) Date: 2019-01-14 11:24
Closed as superseded by https://bugs.python.org/issue33416
msg333614 - (view) Author: Edward K Ream (edreamleo) * Date: 2019-01-14 12:19
Adding endline and endcolumn to every ast node will be a big improvement.

Edward
------------------------------------------------------------------------------------------
Edward K. Ream: edreamleo@gmail.com Leo: http://leoeditor.com/
------------------------------------------------------------------------------------------
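
For readers following up on the superseding issue, here is a minimal sketch of how end positions can be read, assuming Python 3.8 or later, where AST nodes carry end_lineno and end_col_offset and ast.get_source_segment is available. The sample source and the helper source_range are assumptions made for this illustration.

```python
import ast

# Hypothetical sample: a call that spans two lines.
source = "result = compute(alpha,\n    beta)\n"
tree = ast.parse(source)

def source_range(node):
    """Return (start line, start col, end line, end col) for a node.

    Lines are 1-based; columns are UTF-8 byte offsets within the line.
    """
    return (node.lineno, node.col_offset, node.end_lineno, node.end_col_offset)

call = tree.body[0].value                     # the Call node
span = source_range(call)
text = ast.get_source_segment(source, call)   # exact source text of the node

print(span)
print(text)
```

With both ends known, a tool can map a node straight back to its text instead of reconstructing the range from tokens.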
History
Date                 User            Action  Args
2019-01-14 12:19:22  edreamleo       set     nosy: + edreamleo; messages: + msg333614
2019-01-14 11:24:42  levkivskyi      set     superseder: Add endline and endcolumn to every AST node; messages: + msg333609
2019-01-14 11:24:06  levkivskyi      set     status: open -> closed; nosy: + levkivskyi; stage: resolved
2014-10-12 15:14:20  r.david.murray  set     messages: + msg229164
2014-10-12 14:37:10  r.david.murray  set     messages: + msg229152
2014-10-12 10:50:44  Edward.K..Ream  set     nosy: + Edward.K..Ream; messages: + msg229138
2014-10-12 07:59:28  r.david.murray  set     nosy: + r.david.murray; messages: + msg229127
2014-10-12 07:35:07  Aivar.Annamaa   create