Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

async/await parser issues #68807

Closed
skrah mannequin opened this issue Jul 12, 2015 · 13 comments
Closed

async/await parser issues #68807

skrah mannequin opened this issue Jul 12, 2015 · 13 comments
Assignees

Comments

@skrah
Copy link
Mannequin

skrah mannequin commented Jul 12, 2015

BPO 24619
Nosy @gvanrossum, @ncoghlan, @vstinner, @benjaminp, @skrah, @vadmium, @1st1
Files
  • issue24619.1.patch
  • issue24619.2.patch
  • issue24619.3.patch
  • issue24619.4.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/1st1'
    closed_at = <Date 2015-07-25.21:07:11.600>
    created_at = <Date 2015-07-12.18:49:13.238>
    labels = ['release-blocker']
    title = 'async/await parser issues'
    updated_at = <Date 2015-07-26.09:37:11.259>
    user = 'https://github.com/skrah'

    bugs.python.org fields:

    activity = <Date 2015-07-26.09:37:11.259>
    actor = 'yselivanov'
    assignee = 'yselivanov'
    closed = True
    closed_date = <Date 2015-07-25.21:07:11.600>
    closer = 'larry'
    components = []
    creation = <Date 2015-07-12.18:49:13.238>
    creator = 'skrah'
    dependencies = []
    files = ['39964', '39965', '39984', '39991']
    hgrepos = []
    issue_num = 24619
    keywords = ['patch']
    message_count = 13.0
    messages = ['246660', '247034', '247035', '247037', '247038', '247078', '247106', '247107', '247110', '247123', '247152', '247175', '247188']
    nosy_count = 8.0
    nosy_names = ['gvanrossum', 'ncoghlan', 'vstinner', 'benjamin.peterson', 'skrah', 'python-dev', 'martin.panter', 'yselivanov']
    pr_nums = []
    priority = 'release blocker'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue24619'
    versions = ['Python 3.5', 'Python 3.6']

    @skrah
    Copy link
    Mannequin Author

    skrah mannequin commented Jul 12, 2015

    If I understand the reference manual correctly, these should probably
    be rejected by the compiler:

    >>> async def f():
    ...     def g(): pass
    ...     async = 10
    ... 
    >>> async def f():
    ...     def async():
    ...         pass
    ... 
    >>> async def f(): async = 10
    ... 
    >>> async def f():
    ...     def await(): pass
    ... 
    >>> 

    And this should perhaps be accepted:

    >>> async def f():
    ...     return lambda await: await
      File "<stdin>", line 2
        return lambda await: await
                          ^
    SyntaxError: invalid syntax

    This, too:

    >>> async def f():
    ...     async def g(): pass
    ...     await z
      File "<stdin>", line 3
        await z
              ^
    SyntaxError: invalid syntax

    @1st1
    Copy link
    Member

    1st1 commented Jul 21, 2015

    Sorry for not responding earlier, I'm on vacation and don't check email too often.

    While investigating what can be done in tokenizer.c to fix some of the bugs that Stefan pointed out, I discovered a simpler approach: instead of checking what kind of function we have on top of the stack, I now have a counter of nested async functions. This aligns nicely with what Nick said on python-dev once: we essentially treat 'async def' as a future import, everything inside it parses differently.

    In other words, before this patch:

       async def foo():
           def bar():
               async = 1

    was legal, now it's a syntax error, as 'async' will be parsed as 'ASYNC' token.

    This simplifies the implementation of tokenizer.c hacks, fixes all of the bugs that Stefan posted here, *and* enables one-line 'async' functions:

       async def foo(): await bar() # legal with the patch

    It will also help with migrating the code to Python 3.7 when async/await will become proper keywords.

    And, with this patch we can effectively remove the shortcomings section from the PEP-492 (https://www.python.org/dev/peps/pep-0492/#transition-period-shortcomings).

    Please review the attached patch, it would be great if we can commit it before 3.5beta4.

    @1st1 1st1 self-assigned this Jul 21, 2015
    @gvanrossum
    Copy link
    Member

    Haven't reviewed the patch, but this approach sounds great (in fact I had
    assumed you were doing this already, and I was a bit surprised by some of
    the problems you encountered :-).

    @vadmium
    Copy link
    Member

    vadmium commented Jul 21, 2015

    Good news :) I guess this means we can also remove the sentence I added at <https://docs.python.org/3.5/reference/compound_stmts.html#coroutine-function-definition\>.

    @1st1
    Copy link
    Member

    1st1 commented Jul 21, 2015

    An updated patch is attached. I had to implement a little bit more sophisticated tracking of one-line functions to fix parsing of things like

        def foo():
            async def f(): pass
            async def f(): pass
            async = 1

    I hope that test_coroutine.py now covers all possible legal and illegal async/await syntax.

    Haven't reviewed the patch, but this approach sounds great (in fact I had
    assumed you were doing this already, and I was a bit surprised by some of
    the problems you encountered :-).

    Yes, I'm a bit surprised myself ;) I guess one of the reasons why I tried to do more in tokenizer is that at the time of me hacking the tokenizer, compile.c wasn't quite ready (specifically, catching async for/async with/await outside of async functions).

    Good news :) I guess this means we can also remove the sentence I added at [..]

    Right!

    @ncoghlan
    Copy link
    Contributor

    Patch & test cases look good to me.

    I'm so used to thinking of the tokenisation phase as a linear token stream
    that it never occurred to me to just count the function nesting directly to
    determine if the "async def" tokenisation rules are in effect - it's a very
    nice simplifying improvement.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Jul 22, 2015

    New changeset 9da080ecadb2 by Yury Selivanov in branch '3.5':
    Issue bpo-24619: New approach for tokenizing async/await.
    https://hg.python.org/cpython/rev/9da080ecadb2

    New changeset 987b72921a0c by Yury Selivanov in branch 'default':
    Merge 3.5 (Issue bpo-24619)
    https://hg.python.org/cpython/rev/987b72921a0c

    @1st1
    Copy link
    Member

    1st1 commented Jul 22, 2015

    Thanks Nick! I've committed the patch with a few more unittests and a couple of additional comments in tokenizer.(c|h).

    @1st1 1st1 closed this as completed Jul 22, 2015
    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Jul 22, 2015

    New changeset e4e01488afff by Yury Selivanov in branch '3.5':
    Issue bpo-24619: More tests; fix nits in compiler.c
    https://hg.python.org/cpython/rev/e4e01488afff

    New changeset 6f4e0c462daf by Yury Selivanov in branch 'default':
    Merge 3.5 (Issue bpo-24619)
    https://hg.python.org/cpython/rev/6f4e0c462daf

    @skrah
    Copy link
    Mannequin Author

    skrah mannequin commented Jul 22, 2015

    This is a very nice solution!

    I'm just curious if the 'ctx' is still needed: It looks like
    the outermost "async def" dominates all further nested scopes
    w.r.t the tokenizer mode, no matter whether they're "def" or
    "async def" scopes.

    IOW, a single indent_level variable that follows all INDENTs/DEDENTs
    once the outermost "async def" scope is entered might be sufficient.

    [This is in no way urgent, please do not feel obliged to respond
    during your holiday.]

    @1st1
    Copy link
    Member

    1st1 commented Jul 22, 2015

    I'm just curious if the 'ctx' is still needed: It looks like
    the outermost "async def" dominates all further nested scopes
    w.r.t the tokenizer mode, no matter whether they're "def" or
    "async def" scopes.

    This is a wonderful idea, that's the way it should be done in
    both tokenizer.c and tokenize.py.

    Please see the new patch. I love the simplicity of it, no
    more stacks or hard to follow code.

    @1st1 1st1 reopened this Jul 22, 2015
    @1st1
    Copy link
    Member

    1st1 commented Jul 23, 2015

    Updated patch attached (rebased; minor optimization). I'll commit this patch in a few hours to make sure it lands in 3.5beta4.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Jul 23, 2015

    New changeset d03f86e41066 by Yury Selivanov in branch '3.5':
    Issue bpo-24619: Simplify async/await tokenization.
    https://hg.python.org/cpython/rev/d03f86e41066

    New changeset 3f8048926690 by Yury Selivanov in branch 'default':
    Merge 3.5 (Issue bpo-24619)
    https://hg.python.org/cpython/rev/3f8048926690

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Projects
    None yet
    Development

    No branches or pull requests

    5 participants