This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: ast Str type does not annotate the string type when it parses a python document
Type: behavior
Stage: resolved
Components: Interpreter Core
Versions: Python 3.6

process
Status: closed
Resolution: not a bug
Dependencies:
Superseder:
Assigned To:
Nosy List: benjamin.peterson, brett.cannon, eric.smith, georg.brandl, myronww, ncoghlan, r.david.murray, vstinner
Priority: normal
Keywords:

Created on 2015-12-16 21:09 by myronww, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Messages (14)
msg256529 - Author: Myron Walker (myronww) Date: 2015-12-16 21:09
The 'ast' module does not indicate which type of string literal (', ", ''' or """) it encountered when it parses a Python document.

This prevents accurate reproduction of the original parsed document by a writer walking over an instance of an abstract syntax tree.
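
The behavior reported above can be checked with a minimal sketch. It assumes only the standard `ast` module; the variable names `sources` and `dumps` are illustrative.

```python
import ast

# Four spellings of the same string value, differing only in quote style.
sources = ["x = 'a'", 'x = "a"', "x = '''a'''", 'x = """a"""']

# ast.dump() renders each tree as text. If every dump is identical, the
# tree holds nothing that distinguishes the quoting used in the source.
dumps = {ast.dump(ast.parse(src)) for src in sources}

print(len(dumps))  # a single distinct dump means the quote style was not kept
```

By default `ast.dump()` omits position attributes such as `lineno` and `col_offset`, so the comparison covers only the node types and values.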
msg256530 - Author: Brett Cannon (brett.cannon) (Python committer) Date: 2015-12-16 21:18
Two things. One, this won't change in Python 2.7 due to compatibility, so this only applies to Python 3.6. Two, the AST has not been designed to support round-trip syntax transpiling, e.g. there is no way to get back comments in the source code.

Because the AST is not meant to round-trip on source code I'm going to close this as "not a bug" since I don't see any benefit in supporting this even if we gain comment nodes in the AST.
msg256531 - Author: R. David Murray (r.david.murray) (Python committer) Date: 2015-12-16 21:20
Unlike the tokenizer, I don't think the AST module makes any guarantee about being able to reproduce the original source.  This might be a reasonable enhancement request, though.
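
For comparison, here is a rough sketch of what the tokenizer keeps, assuming the standard `tokenize` module. The sample `source` line is made up for illustration. The `tokenize` documentation notes that spacing between tokens may change in some cases, so the sketch checks the token strings rather than byte-for-byte equality.

```python
import io
import tokenize

source = "x = '''a''' + \"b\"  # keep me\n"

# generate_tokens() takes a readline callable that returns str lines.
tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))

# STRING tokens carry the literal as written, quotes included,
# and comments show up as COMMENT tokens.
strings = [t.string for t in tokens if t.type == tokenize.STRING]
comments = [t.string for t in tokens if t.type == tokenize.COMMENT]

# untokenize() rebuilds source text from the token stream.
rebuilt = tokenize.untokenize(tokens)
print(strings, comments)
```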
msg256532 - Author: R. David Murray (r.david.murray) (Python committer) Date: 2015-12-16 21:22
Oh, that was weird.  I got a weird error message from roundup.  Must have been because I was posting at the same time as Brett.  I'll defer to his decision on this.
msg256538 - Author: Myron Walker (myronww) Date: 2015-12-16 21:50
The purpose of a syntax tree is to represent the syntax, not the final result of processing that syntax.  The information currently stored for strings loses syntax information, which seems to defeat the purpose of offering a syntax tree.  I filed a separate bug because it also combines strings and loses the operators between string literals.

   "Hello" + " World"

From the looks of the code, the above would result in one string type with "Hello World" and syntax information associated with the operator would be lost.

And as indicated, string type information is being lost as well.  The user of the AST then has no way of getting the lost syntax information back once it is lost.
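
The claim about the operator can be tested directly. This is a minimal sketch, assuming CPython's `ast.parse` with its default arguments; the names `explicit` and `implicit` are illustrative. As far as this check shows, an explicit `+` stays in the tree as a `BinOp`, while two adjacent literals with no operator are merged into a single string node.

```python
import ast

# Explicit concatenation with an operator.
explicit = ast.parse('"Hello" + " World"', mode="eval").body

# Implicit concatenation: two adjacent literals with no operator.
implicit = ast.parse('"Hello" " World"', mode="eval").body

print(type(explicit).__name__, type(explicit.op).__name__)
print(type(implicit).__name__, ast.literal_eval(implicit))
```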
msg256544 - Author: Myron Walker (myronww) Date: 2015-12-16 22:16
I am re-opening this as I believe this is an important issue for work I would like to eventually push into the Python core: Python code that can re-encode itself as a declaration or as an instance representation.

"I don't see any benefit in supporting this even if we gain comment nodes in the AST."

There would be a huge benefit to having accurate original syntax information of the document and being able to re-write the documents, as it would enable self-modifying code or software-written code paradigms.

    TestCases that can update themselves in various modes: full auto, interactive, etc.

    TestData Pattern Generation objects or Data Pattern Generation objects that can be easily or efficiently modified and then asked to write themselves back out to files.

The fact that there are other projects being worked on to provide round trip document to syntax tree and syntax tree to document transformations also indicates that the lack of accurate syntax storage in 'ast' is a problem that needs to be addressed.

I would prefer to work with the core python modules in order to provide dynamic code modification functionality rather than relying on a third party module as eventually I would like to push this into core python.

If a Python object has script behind it, I would like to eventually be able to tell the object to write itself to a file as an object declaration or as an object instance.

As Declaration:

class SomeObject(SomeBase):
   def some_method(self):
       print("blah blah")

As Instance:

   SomeObject()
msg256545 - Author: R. David Murray (r.david.murray) (Python committer) Date: 2015-12-16 22:22
Please respect the decisions of the python core developers as to issue status (you can argue with us until we ask you to stop, but let us make the status change if you convince us :)

We aren't rejecting your ideas, just this bug report (as not being a bug).  To pursue your ideas you will want to subscribe to the python-ideas mailing list and begin a discussion there.  This is far too big a topic for the bug tracker.
msg256546 - Author: STINNER Victor (vstinner) (Python committer) Date: 2015-12-16 22:25
"There would be a huge benefit to having accurate original syntax information of the document and being able to re-write the documents, as it would enable self-modifying code or software-written code paradigms."

You misunderstood the purpose of the AST. "Unparsing" an AST to retrieve the original source code is out of the scope of the Python AST structure, and it's not going to happen. For that, you must use another tool. See for example RedBaron, as I said at:

http://bugs.python.org/issue25886#msg256543

Being able to "unparse" an AST to regenerate the original Python source code requires complex operations. You have to store all formatting information (spaces, new lines, etc.). You also have to store comments. These are just a few examples.

If the AST contained formatting information, it would be more complex to handle.
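
A small sketch of what gets lost, assuming Python 3.9 or later, where the standard library has `ast.unparse` (it did not exist when this thread was written). It generates equivalent code from the tree, not the original text. The sample `source` line is made up for illustration.

```python
import ast

source = 'greeting = "hi"   # original comment\n'

# Parse, then regenerate code from the tree alone.
regenerated = ast.unparse(ast.parse(source))

# The comment, the extra spaces and the double quotes are not in the tree,
# so they cannot appear in the regenerated text.
print(regenerated)
```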
msg256557 - Author: Myron Walker (myronww) Date: 2015-12-16 22:53
Why is unparsing out of the scope of the base syntax tree of a dynamic language such as python?

Is that just a personal opinion? Would adding proper storage of syntax information in the AST cause performance issues?

Is there some documentation that defines the scope of the AST module and indicates this is out of scope?

The syntax for numerical constants is properly stored by the AST module.

Example:
    TEST_INT = 1 + 1

Why would the syntax of integer constants be more important than the syntax associated with comments, two string literals, or multi-line strings?
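
The numeric case can be checked the same way. This is a minimal sketch, assuming the standard `ast` module; the names `same` and `expr` are illustrative. It suggests that the operator in `1 + 1` is kept, while the spelling of an integer literal (for example hexadecimal versus decimal) is not.

```python
import ast

# Two spellings of the same integer value produce the same dump.
same = ast.dump(ast.parse("n = 0x10")) == ast.dump(ast.parse("n = 16"))

# The addition stays in the tree as a BinOp node instead of being
# replaced by the value 2.
expr = ast.parse("TEST_INT = 1 + 1").body[0].value

print(same, type(expr).__name__)
```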
msg256558 - Author: STINNER Victor (vstinner) (Python committer) Date: 2015-12-16 22:59
"Would adding proper storage of syntax information in the AST cause performance issues?"

Exactly. And storing syntax information is useless for 90% of usages of AST. So Python is optimized for the most common cases.

Again, use a different project (like RedBaron) if you need *all* information in the AST.
msg256562 - Author: Myron Walker (myronww) Date: 2015-12-16 23:05
Would it be prudent to add a parse flag to the AST module that could provide one of two types of AST: an optimized AST or a complete AST?
msg256592 - Author: STINNER Victor (vstinner) (Python committer) Date: 2015-12-17 09:45
> Would it be prudent to add a parse flag to the AST module that could provide one of two types of AST: an optimized AST or a complete AST?

Adding syntax ("formatting", call it what you want) info to the AST
requires adding new attributes to existing AST nodes and probably
adding new AST nodes. It will make the code more complex, not only the
code that produces the AST, but also code using the AST (static code
analyzers, my AST optimizer project, etc.).

I don't think that it's worth it.
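
As a side note, a limited form of this is possible without new node types when the original source text is still at hand. The sketch below assumes Python 3.8 or later, where nodes carry end positions and `ast.get_source_segment` exists. The sample `source` and the name `segments` are illustrative.

```python
import ast

source = 'a = """doc"""\nb = \'x\'\n'
tree = ast.parse(source)

# For each string constant, use the node's recorded position to slice the
# exact literal, quotes included, out of the original source text.
segments = [
    ast.get_source_segment(source, node)
    for node in ast.walk(tree)
    if isinstance(node, ast.Constant) and isinstance(node.value, str)
]

print(segments)
```

This only recovers text for nodes that exist in the tree. Comments and spacing between tokens still have no node to attach to.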
msg256593 - Author: STINNER Victor (vstinner) (Python committer) Date: 2015-12-17 09:47
If you consider that we are wrong, please follow the advice of
starting a discussion on python-ideas. This is how Python is
developed, we have a workflow. Proposing ideas on the bug tracker
works in some cases, but AST is really a *core* feature of Python. You
need deep discussions to change the core. To be crystal clear: *if*
anyone is going to change AST, IMHO a PEP is needed. Writing a PEP
requires a specific workflow starting at python-ideas.
msg256607 - Author: Eric V. Smith (eric.smith) (Python committer) Date: 2015-12-17 15:17
I agree that the proposed change would require a PEP, and this should be discussed on python-ideas.

I also think there's very little chance such a change would be accepted, but that doesn't mean it's impossible.

I think using an external library is your best bet here. If you want to pursue this on python-ideas, you should discuss why being in the stdlib's ast module would be an improvement over using an external library.
History
Date                 User            Action  Args
2022-04-11 14:58:25  admin           set     github: 70073
2015-12-17 15:17:40  eric.smith      set     status: open -> closed; nosy: + eric.smith; messages: + msg256607
2015-12-17 09:47:48  vstinner        set     messages: + msg256593
2015-12-17 09:45:50  vstinner        set     messages: + msg256592
2015-12-16 23:05:27  myronww         set     messages: + msg256562
2015-12-16 22:59:14  vstinner        set     messages: + msg256558
2015-12-16 22:53:56  myronww         set     status: closed -> open; messages: + msg256557
2015-12-16 22:25:54  vstinner        set     nosy: + vstinner; messages: + msg256546
2015-12-16 22:22:43  r.david.murray  set     status: open -> closed; messages: + msg256545
2015-12-16 22:16:34  myronww         set     status: closed -> open; messages: + msg256544
2015-12-16 21:50:51  myronww         set     messages: + msg256538; components: + Interpreter Core, - Extension Modules, Library (Lib)
2015-12-16 21:50:50  SilentGhost     link    issue25886 superseder
2015-12-16 21:22:31  r.david.murray  set     messages: + msg256532; stage: test needed -> resolved
2015-12-16 21:20:09  r.david.murray  set     nosy: + r.david.murray; messages: + msg256531
2015-12-16 21:18:53  brett.cannon    set     status: open -> closed; resolution: not a bug; messages: + msg256530; versions: + Python 3.6, - Python 2.7
2015-12-16 21:12:06  SilentGhost     set     nosy: + brett.cannon, georg.brandl, ncoghlan, benjamin.peterson; components: + Library (Lib); stage: test needed
2015-12-16 21:09:07  myronww         create