Author lukasz.langa
Recipients benjamin.peterson, gregory.p.smith, gvanrossum, lukasz.langa, serhiy.storchaka
Date 2018-04-23.08:01:20
Message-id <1524470481.51.0.682650639539.issue33337@psf.upfronthosting.co.za>
Content
> These modifications are applied only before bytecode generation. The AST presented to the user is not modified.

This bit me when implementing PEP 563 but I was then on the compile path, right.  Still, the latest docstring folding would qualify as an example here, too, no?
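
The quoted claim can be checked with a small experiment.  This is only a sketch, assuming a CPython interpreter; the helper names are made up for illustration, and what ends up in `co_consts` is an implementation detail that differs between versions.

```python
import ast

SOURCE = "x = 1000 * 1000\n"


def user_visible_node(source=SOURCE):
    """Return the right-hand side of the assignment as seen through ast.parse()."""
    tree = ast.parse(source)
    return tree.body[0].value


def run(source=SOURCE):
    """Compile and execute the source, returning the value bound to x."""
    namespace = {}
    exec(compile(source, "<example>", "exec"), namespace)
    return namespace["x"]


def compiled_constants(source=SOURCE):
    """Return the constants of the compiled code object (version dependent)."""
    return compile(source, "<example>", "exec").co_consts


if __name__ == "__main__":
    # The tree handed to the user still contains the multiplication.
    print(type(user_visible_node()).__name__)
    # Inspect what the compile path did with it; the output varies by version.
    print(compiled_constants())
```

The tree returned by `ast.parse()` keeps the `BinOp` node, so any folding happens later, on the compile path.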


> Is this a problem? 2.7 is a dead end, its support will be ended in less than 2 years. Even 3.6 will be moved to a security only fixes stage short time after releasing 3.8.

Yes, it is a problem.  We will support Python 2 until 2020 but people will be running Python 2 code for a decade *at least*.  We need to provide those people a way to move their code forward.  Static analysis tools like formatters, linters, type checkers, and 2to3-style translators are all soon going to run on Python 3.  It would be a shame if those programs were barred from helping users who are still struggling on Python 2.

A closer example is async/await.  It would be a shame if running on Python 3.7 meant you can't write a tool that renames (or even just *detects*) invalid uses of async/await.  I firmly believe that the version of the runtime should be independent of the version of the code it's able to analyze.
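
To make the async/await case concrete, here is a rough sketch, assuming Python 3.7 or newer.  The function names and the "next token" heuristic are hypothetical and far cruder than what a real tool would need.  It shows that `ast.parse()` rejects code using `async` or `await` as identifiers, while a token-level scan can still locate them.

```python
import ast
import io
import tokenize

RESERVED = {"async", "await"}
# Tokens that can directly follow a plain identifier; a crude stand-in
# for real parsing, chosen only for this illustration.
IDENTIFIER_FOLLOWERS = {"=", ",", ")", "."}

OLD_STYLE_SOURCE = "async = 1\nprint(await)\n"


def parses_with_ast(source):
    """Return True if the running interpreter's own grammar accepts the source."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


def find_reserved_name_uses(source):
    """Return (line, column, name) for async/await tokens used like identifiers."""
    tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
    hits = []
    for tok, nxt in zip(tokens, tokens[1:]):
        if tok.string in RESERVED and nxt.string in IDENTIFIER_FOLLOWERS:
            hits.append((tok.start[0], tok.start[1], tok.string))
    return hits


if __name__ == "__main__":
    print(parses_with_ast(OLD_STYLE_SOURCE))
    print(find_reserved_name_uses(OLD_STYLE_SOURCE))
```

A tool built only on `ast` would stop at the `SyntaxError`; the token scan keeps enough information to report or rename the offending names.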


> I'm in favor of updating Lib/lib2to3/pgen2/tokenize.py, but I don't understand why Lib/tokenize.py should parse 2.7.

Hopefully I sufficiently explained that above.


> I'm in favor of reimplementing pgen in Python if this will simplify the code and the building process. Python code is simpler than C code, this code is not performance critical, and in any case we need an external Python when modify grammar of bytecode.

Well, I didn't think about abandoning pgen.  I admit that's mostly because my knee-jerk reaction was that it would be too slow.  But you're right that this is not performance critical: every `pip install` runs `compileall`, so source is parsed at install time and the cached bytecode is reused afterwards.

I guess we could parse in "strict" mode for Python itself but allow for multiple grammars for standard library use (as I explained in the reply to Guido).  And this would most likely give us the opportunity to iterate on grammar improvements in the future.

And yet, I'm cautious here.  Even ignoring performance, that sounds like a more ambitious task than what I'm attempting.  Unless I find partners in crime for this, I wouldn't attempt that.  And I would need a thumbs up from the BDFL and performance-wary contributors.


> For what purposes the CST is needed besides 2to3?

Anywhere you need the full view of the code, including its non-semantic pieces.  Those include:
- whitespace;
- comments;
- parentheses;
- commas;
- string prefixes.

The main use case is linters and refactoring tools.  For example, mypy uses a modified AST to support type comments.  YAPF and Black are based on lib2to3 because, as formatters, they can't lose comments, string prefixes, or organizational parentheses.  JEDI uses Parso, a lib2to3 fork, for similar reasons.
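
As an illustration of what is lost, the sketch below compares the `ast` view of a snippet with its token stream.  It uses the standard `tokenize` module as a stand-in for a real CST library such as lib2to3 or Parso, and the helper names are invented for this example.

```python
import ast
import io
import tokenize

SOURCE = 'x = (1)  # keep me\ns = R"raw"\n'


def tokens_of_type(source, token_type):
    """Return the exact source text of every token of the given type."""
    readline = io.StringIO(source).readline
    return [
        tok.string
        for tok in tokenize.generate_tokens(readline)
        if tok.type == token_type
    ]


def comments_in(source):
    """Comments survive at the token level."""
    return tokens_of_type(source, tokenize.COMMENT)


def string_literals_in(source):
    """String tokens keep their prefix and quote style."""
    return tokens_of_type(source, tokenize.STRING)


if __name__ == "__main__":
    print(comments_in(SOURCE))
    print(string_literals_in(SOURCE))
    # The AST dump carries neither the comment nor the R prefix.
    print(ast.dump(ast.parse(SOURCE)))
```

The token stream still has the comment, the redundant parentheses, and the `R` prefix; the AST only records that `x` is `1` and `s` is `'raw'`.  A formatter working from the AST alone could not write the file back faithfully.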