classification
Title: Put back the ability to parse files where async/await aren't keywords
Type: behavior Stage: resolved
Components: Interpreter Core Versions: Python 3.8
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: gvanrossum Nosy List: ethan smith, gvanrossum, levkivskyi, masthana, miss-islington, msullivan, petr.viktorin, scoder, serhiy.storchaka, yselivanov
Priority: normal Keywords: patch

Created on 2019-02-12 00:23 by gvanrossum, last changed 2020-06-29 16:40 by gvanrossum. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 12086 merged gvanrossum, 2019-02-28 00:44
PR 21021 merged gvanrossum, 2020-06-20 23:58
PR 21022 merged gvanrossum, 2020-06-21 00:19
PR 21023 merged gvanrossum, 2020-06-21 00:25
Messages (26)
msg335277 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2019-02-12 00:23
Now that the ast module can be used to parse type comments, there’s one more feature I’d like to lobby for – a flag that modifies the grammar slightly so it resembles an older version of Python (going back to 3.4).

This is used in mypy to decouple the Python version you’re running from the Python version for which you’re checking compatibility (useful when checking code that will be deployed on a system with a different Python version installed). I imagine this would be useful to other linters as well, and the implementation is mostly manipulating whether `async` and `await` are keywords. But if there’s pushback to this part I can live without it – the rest of the work is still useful.

The implementation in typed_ast takes the form of a feature_version flag which is set to the minor Python version to be parsed. Setting it to 4 or lower would make async/await never keywords, setting it to 5 or 6 would make them conditional per PEP 492, and setting it to 7 or higher would make these unconditional keywords. A few minor uses of the same flag also reject f-strings, annotated assignments (PEP 526), and matrix multiply for versions where those don't exist.

But I don't need all of those -- I really just need to be able to select the 3.5/3.6 rules for conditional async/await keywords, since all the other features are purely backwards compatible, whereas with async/await there is legal (and plentiful!) code that uses these as variable or function names that should be supported in 3.5/3.6 mode.

Of course having it be a (minor) version number would still be more future-proof -- if say in 3.10 we add a match expression using some kind of conditional keyword hack, we might have to dust it off.

Note that much of what I'm asking for would effectively roll back https://bugs.python.org/issue30406 -- sorry Jelle! (Though there's more to it -- Serhiy's introduction of Grammar/Token followed, and I would still need to thread some kind of flag all the way from ast.parse() to tokenizer.c.
msg336785 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2019-02-28 00:24
Well, it's not just rolling back async/await from being keywords. Since 3.7 it's possible to create async generator expressions in non-async functions. This wasn't possible to do with old hacks on the lexer. So if you want to revert the change you risk losing some functionality we enabled in 3.7
msg336786 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2019-02-28 00:37
That should still work. The strategy is as follows:

- For Python 3.7 or higher, 'async' and 'await' are *always* recognized.

- For Python 3.5 and 3.6, the hacks from the PEP are used.
msg336788 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2019-02-28 01:00
Actually, I think we could revert to old lexer hacks in 3.8 if we modify them to treat 'async for' tocken sequence as one meta-tocken.
msg336789 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2019-02-28 01:10
But why would we? I already have a working solution. (Literally reverting Jelle's PR won't work anyway, because Serhiy changed how regen-token works.)
msg336790 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2019-02-28 01:25
> But why would we? I already have a working solution. 

I've heard some complaints that it's hard to migrate to Python 3.7 because async and await are keywords (although I think by now all popular libraries have migrated already), so in case you ever considered to revert that decision I think we could actually make it work.  But for the current PR you're working on we don't need that, you're right.
msg336795 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2019-02-28 03:06
And the same tokenizer trick that detects 'async def' could detect 'async for' easily. See https://github.com/python/cpython/pull/12086/files#diff-30b8266a4285de981f8b1b82a8cc6231R1418
msg336800 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2019-02-28 05:11
> And the same tokenizer trick that detects 'async def' could detect 'async for' easily.

Exactly.  I just never thought that we could support async generator expressions with the kind of hacks in tokenizer we had in 3.5-3.6.  FWIW I still believe that it's better for async & await tokens to be proper keywords, but in case we ever realize we want them to become "soft" keywords we apparently have options.
msg336831 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2019-02-28 12:32
Would not be simpler to just drop the support of Python versions <3.7 in new MyPy versions?
msg336879 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2019-02-28 23:25
> Would not be simpler to just drop the support of Python versions <3.7 in new MyPy versions?

Not really -- mypy has a lot of users who run it over (very) large code bases that can't easily be upgraded. Basically mypy has to support all Python versions that haven't reached their end of life yet. (And it's also important that mypy not be constrained to the same Python version as the target code.)

I suppose we could just stick with the existing typed_ast that supports 3.4 through 3.7, but we actually have a use case for the end_lineno and end_col_offset fields that were just added to ast in 3.8, and in general we'd like to support any future grammar elements and ast features.
msg337243 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2019-03-05 23:21
Everyone watching, the PR is now ready for review!
msg337432 - (view) Author: miss-islington (miss-islington) Date: 2019-03-07 20:38
New changeset 495da292255b92dd73758fdd0e4c7d27d82b1e57 by Miss Islington (bot) (Guido van Rossum) in branch 'master':
bpo-35975: Support parsing earlier minor versions of Python 3 (GH-12086)
https://github.com/python/cpython/commit/495da292255b92dd73758fdd0e4c7d27d82b1e57
msg343749 - (view) Author: Petr Viktorin (petr.viktorin) * (Python committer) Date: 2019-05-28 08:38
It looks like this caused bpo-37072: PyAST_FromNodeObject doesn't ignore flags any more, and when PyNode_Compile passes NULL flags, it crashes.
msg343786 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2019-05-28 14:42
Sorry for the oversight. Do you need help fixing that?
msg371938 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2020-06-20 14:40
I was made aware [1] that the addition of the "cf_feature_version" field broke the backwards compatibility of the "PyRun_StringFlags()" function [2]. Unlike what the docs of "PyCompilerFlags" [3] suggest, the new field is not ignored there but must now be set in order to enable the syntax features of the running Python version.

Background: A Cython user was trying "exec()" on code with f-strings in Py3.8 and got a SyntaxError, because "cf_feature_version" had not been initialised.

I can see two types of existing code being affected:

1) applications or modules that execute user provided Python code (as in the bug report) for which users expect the syntax of the currently running Python runtime to be available.

2) code that used to work in Py3.6/7 using the available Python syntax features and now runs into newly introduced syntax restrictions in Py3.8, such as f-strings no longer being available.

Since Py3.8 has been out for a while with this change, and there were apparently little complaints in addition to issue 37072, should we just update the documentation of "PyCompilerFlags" and "PyRun_StringFlags()" to mention the change there?

[1] https://github.com/cython/cython/issues/3695
[2] https://docs.python.org/3/c-api/veryhigh.html#c.PyRun_StringFlags
[3] https://docs.python.org/3/c-api/veryhigh.html#c.PyCompilerFlags
msg371943 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2020-06-20 16:38
But it's a bug, right? The intention for sure is that the cf_feature_version field is only used when the PyCF_ONLY_AST flags is set in cf_flags, per the docs you cite as [3]. Why wouldn't it be possible to fix that?
msg371949 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2020-06-20 17:47
I wasn't sure which is better – solve it or leave it. But it seems a) easy to adapt to on user side, and b) solvable in CPython in a way that allows code that wants to adapt to work across 3.8.x versions. Only code that does not get adapted would risk failure in existing 3.8.x versions, and benefit from a fix in future 3.8.x releases.

So, yeah, I think we should fix it then.
msg371974 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2020-06-21 00:02
I'm fixing it -- can you verify? I have no easy way to test this. DO you think we should also add a note to the docs that this was broken in 3.8.0-3?
msg371975 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2020-06-21 00:05
Hm, the backports have to be done manually, since the 3.8 parser and the 3.10 parser don't overlap, and 3.9 has both versions...
msg371983 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2020-06-21 06:36
PRs look good to me. Here is a test that fails for me with the master branch and works with your fixes. It has a slightly odd place in the subinterpreter tests, but I would argue that it's not entirely misplaced there. You would want to know when subinterpreters start rejecting valid code, wouldn't you? :)


diff --git a/Lib/test/test_capi.py b/Lib/test/test_capi.py
index 73e167a0b0..9bed8255e1 100644
--- a/Lib/test/test_capi.py
+++ b/Lib/test/test_capi.py
@@ -627,6 +627,24 @@ class SubinterpreterTest(unittest.TestCase):
             self.assertNotEqual(pickle.load(f), id(sys.modules))
             self.assertNotEqual(pickle.load(f), id(builtins))
 
+    def test_subinterps_recent_language_features(self):
+        r, w = os.pipe()
+        code = """if 1:
+            import pickle
+            with open({:d}, "wb") as f:
+
+                @(lambda x:x)  # Py 3.9
+                def noop(x): return x
+
+                a = (b := f'1{{2}}3') + noop('x')  # Py3.8 (:=) / 3.6 (f'')
+                pickle.dump(dict(a=a, b=b), f)
+            """.format(w)
+
+        with open(r, "rb") as f:
+            ret = support.run_in_subinterp(code)
+            self.assertEqual(ret, 0)
+            self.assertEqual(pickle.load(f), {'a': '123x', 'b': '123'})
+
     def test_mutate_exception(self):
         """
         Exceptions saved in global module state get shared between
diff --git a/Modules/_testcapimodule.c b/Modules/_testcapimodule.c
index 808483ebd7..468e25ff9e 100644
--- a/Modules/_testcapimodule.c
+++ b/Modules/_testcapimodule.c
@@ -3468,6 +3468,8 @@ run_in_subinterp(PyObject *self, PyObject *args)
     const char *code;
     int r;
     PyThreadState *substate, *mainstate;
+    /* only initialise 'cflags.cf_flags' to test backwards compatibility */
+    PyCompilerFlags cflags = {0};
 
     if (!PyArg_ParseTuple(args, "s:run_in_subinterp",
                           &code))
@@ -3486,7 +3488,7 @@ run_in_subinterp(PyObject *self, PyObject *args)
         PyErr_SetString(PyExc_RuntimeError, "sub-interpreter creation failed");
         return NULL;
     }
-    r = PyRun_SimpleString(code);
+    r = PyRun_SimpleStringFlags(code, &cflags);
     Py_EndInterpreter(substate);
 
     PyThreadState_Swap(mainstate);
msg372006 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2020-06-21 17:08
Okay, I will add that diff (I'll probably add async def to the set of features tested). Do you mind if I just incorporate it in my diff? I could add an acknowledgment in the commit message. Otherwise you could send a separate PR.
msg372019 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2020-06-21 20:23
Feel free to use the patch in any way you like.
msg372484 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2020-06-28 00:33
New changeset 9d197c7d48147a9ea2f7f7be917f35514a16524b by Guido van Rossum in branch 'master':
bpo-35975: Only use cf_feature_version if PyCF_ONLY_AST in cf_flags (#21021)
https://github.com/python/cpython/commit/9d197c7d48147a9ea2f7f7be917f35514a16524b
msg372485 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2020-06-28 00:34
New changeset 2a1ee1d970460047bd88da9638f8c1789431d9ab by Guido van Rossum in branch '3.9':
[3.9] bpo-35975: Only use cf_feature_version if PyCF_ONLY_AST in cf_flags (#21022)
https://github.com/python/cpython/commit/2a1ee1d970460047bd88da9638f8c1789431d9ab
msg372486 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2020-06-28 00:35
New changeset e653369e76d7da6bcbcf1f09a141f47fb77df6c3 by Guido van Rossum in branch '3.8':
[3.8] bpo-35975: Only use cf_feature_version if PyCF_ONLY_AST in cf_flags (#21023)
https://github.com/python/cpython/commit/e653369e76d7da6bcbcf1f09a141f47fb77df6c3
msg372602 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2020-06-29 16:40
Fixed again. Thanks Stefan for finding this! I apologize for the bug.
History
Date User Action Args
2020-06-29 16:40:15gvanrossumsetstatus: open -> closed

messages: + msg372602
stage: patch review -> resolved
2020-06-29 10:39:53vstinnersetnosy: - vstinner
2020-06-28 00:35:08gvanrossumsetmessages: + msg372486
2020-06-28 00:34:33gvanrossumsetmessages: + msg372485
2020-06-28 00:33:55gvanrossumsetmessages: + msg372484
2020-06-21 20:23:38scodersetmessages: + msg372019
2020-06-21 17:08:19gvanrossumsetmessages: + msg372006
2020-06-21 06:36:56scodersetmessages: + msg371983
2020-06-21 00:25:56gvanrossumsetpull_requests: + pull_request20195
2020-06-21 00:19:32gvanrossumsetstage: resolved -> patch review
pull_requests: + pull_request20194
2020-06-21 00:05:40gvanrossumsetmessages: + msg371975
2020-06-21 00:02:23gvanrossumsetmessages: + msg371974
2020-06-21 00:01:24gvanrossumsetstatus: closed -> open
2020-06-20 23:58:31gvanrossumsetpull_requests: + pull_request20193
2020-06-20 17:47:55scodersetstatus: open -> closed

messages: + msg371949
2020-06-20 16:38:19gvanrossumsetstatus: closed -> open

messages: + msg371943
2020-06-20 14:40:43scodersetnosy: + scoder
messages: + msg371938
2019-05-28 14:42:47gvanrossumsetmessages: + msg343786
2019-05-28 08:38:52petr.viktorinsetnosy: + petr.viktorin
messages: + msg343749
2019-03-07 22:58:25gvanrossumsetstatus: open -> closed
resolution: fixed
stage: patch review -> resolved
2019-03-07 20:38:15miss-islingtonsetnosy: + miss-islington
messages: + msg337432
2019-03-05 23:21:09gvanrossumsetmessages: + msg337243
2019-03-01 15:33:06vstinnersetnosy: + vstinner
2019-02-28 23:25:04gvanrossumsetmessages: + msg336879
2019-02-28 12:32:17serhiy.storchakasetnosy: + serhiy.storchaka
messages: + msg336831
2019-02-28 05:11:37yselivanovsetmessages: + msg336800
2019-02-28 03:06:04gvanrossumsetmessages: + msg336795
2019-02-28 01:25:45yselivanovsetmessages: + msg336790
2019-02-28 01:10:22gvanrossumsetmessages: + msg336789
2019-02-28 01:00:29yselivanovsetmessages: + msg336788
2019-02-28 00:44:52gvanrossumsetkeywords: + patch
stage: needs patch -> patch review
pull_requests: + pull_request12096
2019-02-28 00:37:38gvanrossumsetmessages: + msg336786
2019-02-28 00:24:20yselivanovsetnosy: + yselivanov
messages: + msg336785
2019-02-26 20:59:18msullivansetnosy: + msullivan
2019-02-16 16:30:59masthanasetnosy: + masthana
2019-02-13 02:59:14gvanrossumsetassignee: gvanrossum
stage: needs patch
type: behavior
components: + Interpreter Core
versions: + Python 3.8
2019-02-12 01:24:40ethan smithsetnosy: + ethan smith
2019-02-12 00:35:49levkivskyisetnosy: + levkivskyi
2019-02-12 00:23:38gvanrossumcreate