Disallow ambiguous syntax f(x for x in [1],) #76193

Closed
serhiy-storchaka opened this issue Nov 12, 2017 · 24 comments
Labels
3.7 (EOL) end of life interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement

Comments

@serhiy-storchaka
Member

BPO 32012
Nosy @gvanrossum, @brettcannon, @ncoghlan, @benjaminp, @serhiy-storchaka, @cryvate, @miss-islington
PRs
  • bpo-32012: Disallow trailing comma after genexpr without parenthesis. #4382
  • bpo-32023: Disallow genexprs without parenthesis in class definitions. #4400
  • [3.7] Revert "closes bpo-27494: Fix 2to3 handling of trailing comma after a generator expression (GH-3771)" (GH-8241) #8580
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = <Date 2017-11-15.08:16:39.988>
    created_at = <Date 2017-11-12.22:29:02.776>
    labels = ['interpreter-core', 'type-feature', '3.7']
    title = 'Disallow ambiguous syntax f(x for x in [1],)'
    updated_at = <Date 2018-07-31.06:52:51.418>
    user = 'https://github.com/serhiy-storchaka'

    bugs.python.org fields:

    activity = <Date 2018-07-31.06:52:51.418>
    actor = 'miss-islington'
    assignee = 'none'
    closed = True
    closed_date = <Date 2017-11-15.08:16:39.988>
    closer = 'ncoghlan'
    components = ['Interpreter Core']
    creation = <Date 2017-11-12.22:29:02.776>
    creator = 'serhiy.storchaka'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 32012
    keywords = ['patch']
    message_count = 24.0
    messages = ['306132', '306145', '306149', '306159', '306160', '306161', '306179', '306185', '306198', '306199', '306200', '306204', '306206', '306207', '306208', '306209', '306211', '306212', '306213', '306233', '306255', '306256', '322730', '322733']
    nosy_count = 7.0
    nosy_names = ['gvanrossum', 'brett.cannon', 'ncoghlan', 'benjamin.peterson', 'serhiy.storchaka', 'cryvate', 'miss-islington']
    pr_nums = ['4382', '4400', '8580']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue32012'
    versions = ['Python 3.7']

    @serhiy-storchaka
    Member Author

    The syntax f(x for x in [1],) is ambiguous, and the reasons for allowing it (omitting the parentheses around a generator expression and using a trailing comma in a call expression) are not applicable in this case. See the rationale on Python-Dev: https://mail.python.org/pipermail/python-dev/2017-November/150481.html.
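A minimal sketch of the call forms under discussion, assuming an interpreter that already includes this change (Python 3.7 or later). The helper name `compiles` is hypothetical and introduced only for illustration; it compiles each snippet without running it, so `f` never has to exist.

```python
def compiles(source):
    """Return True if `source` parses as an expression, False on SyntaxError."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return False
    return True


# Sole argument, no trailing comma: the genexp parentheses may be omitted.
print(compiles("f(x for x in [1])"))
# A trailing comma after an ordinary argument is accepted.
print(compiles("f([1],)"))
# The form this issue disallows: bare genexp followed by a trailing comma.
print(compiles("f(x for x in [1],)"))
# Parenthesizing the genexp makes the trailing comma acceptable again.
print(compiles("f((x for x in [1]),)"))
```

On an interpreter with the fix, only the third snippet is expected to be rejected.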

    @serhiy-storchaka serhiy-storchaka added 3.7 (EOL) end of life interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement labels Nov 12, 2017
    @serhiy-storchaka
    Member Author

    PR 4382 doesn't change the grammar; it only changes checks in the CST-to-AST transformer. Maybe it would be better to change the grammar. Currently the grammar doesn't match the language specification and allows the following constructions:

        @deco(x for x in [1])
        def f(): ...

        class C(x for x in [1]): ...

    I also think that part of bpo-27494 should be reverted.
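As a rough way to see how a given interpreter treats these two constructions, the sketch below compiles each one without executing it. `compiles` is a hypothetical helper, and `deco` is never looked up because nothing is run. Going by the outcome described later in this thread, the class form is expected to be rejected once bpo-32023 is applied, while the decorator form is expected to stay accepted.

```python
def compiles(source):
    """Return True if `source` parses as a module, False on SyntaxError."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


decorator_case = "@deco(x for x in [1])\ndef f(): ...\n"
class_case = "class C(x for x in [1]): ...\n"
class_parenthesized = "class C((x for x in [1])): ...\n"

for label, source in [
    ("decorator with bare genexp", decorator_case),
    ("class with bare genexp", class_case),
    ("class with parenthesized genexp", class_parenthesized),
]:
    print(label, "->", "accepted" if compiles(source) else "SyntaxError")
```

The parenthesized class form only passes the syntax check; executing it would still fail at runtime because a generator object is not a valid base class.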

    @serhiy-storchaka
    Member Author

    It is easy to forbid the above cases, but I don't know what error message is appropriate. General "invalid syntax"?

    @Cryvate
    Mannequin

    Cryvate mannequin commented Nov 13, 2017

    Currently,

    class C(*some_classes): ...
    

    works 'as expected' and is within grammar and language specification whereas

    class C(x for x in [object]): ...
    

    does not work but does not cause a syntax error. I can see a use case for both in dynamic class factories. I was going to do this, but was thwarted by another issue (`__doc__` cannot be assigned after creation, nor can it be defined as anything but a pure string; is there any workaround, or a reason that this is the case? It is not true for e.g. functions).

    I think having one of these within the language specification and not the other is odd.

    @Cryvate
    Mannequin

    Cryvate mannequin commented Nov 13, 2017

    [As a follow-on, should I open a new issue/discuss on python-dev? Willing to help out with a solution one way or another! I know https://en.wikipedia.org/wiki/Wikipedia:Chesterton%27s_fence, "In my head" <> "Should be the case" etc. very much applies.]

    In my head

        @...
        def foo(): pass

    is equivalent to

        def _foo(): pass
        foo = ...(_foo)
        del _foo

    However the following shows this is not the case:

        @0
        def foo(): pass

    throws a syntax error, whereas

        def _foo(): pass
        foo = 0(_foo)

    throws a type error. This might seem silly, but it is still unexpected.

    https://docs.python.org/3/reference/compound_stmts.html#grammar-token-decorator has

    decorator ::= "@" dotted_name ["(" [argument_list [","]] ")"] NEWLINE
    

    which in my head is

    decorator ::= "@" atom NEWLINE
    

    Similarly for classes: https://docs.python.org/3/reference/compound_stmts.html#class-definitions

    inheritance ::= "(" [argument_list] ")"
    

    which allows for keyword arguments (does that make any sense!?). In my head it is (compare with call: https://docs.python.org/3/reference/expressions.html#calls)

    inheritance ::= "(" [positional_arguments [","] | comprehension] ")"
    

    [Tangentially related, this is how I originally got onto the mailing lists, my unhappiness with the definition of the for statement (https://docs.python.org/3/reference/compound_stmts.html#the-for-statement):

    for_stmt ::= "for" target_list "in" expression_list ":" suite ["else" ":" suite]
    

    Which I would expect to be:

    for_stmt ::= comp_for ":" suite ["else" ":" suite]
    

    so you could e.g. have if statements.
    ]

    @serhiy-storchaka
    Member Author

    I don't think this issue is the best place to answer your questions, but I will try.

    The fact that "class C(x for x in [object]): ..." does not cause a syntax error is a bug. This issue fixes it. The fact that corrected "class C((x for x in [object])): ..." doesn't work is expected, because a generator instance is not a class.

    The equivalence between a decorator expression and explicitly calling a decorator function holds only in one direction and only for valid Python syntax. Talking about the equivalence of syntactically incorrect Python code doesn't make sense.

    Yes, an inheritance list can contain keyword arguments. They are passed to a metaclass constructor as well as positional arguments.
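To illustrate the point about keyword arguments, here is a small sketch in which a metaclass records the extra keywords it receives. The names `Meta`, `Base`, `flag`, `label` and the `received` attribute are all hypothetical and exist only for this example.

```python
class Meta(type):
    def __new__(mcls, name, bases, namespace, **kwargs):
        # Extra keywords from the class statement arrive here as **kwargs.
        cls = super().__new__(mcls, name, bases, namespace)
        cls.received = kwargs
        return cls

    def __init__(cls, name, bases, namespace, **kwargs):
        # Accept the same keywords so that type.__init__ is not handed them.
        super().__init__(name, bases, namespace)


class Base:
    pass


class C(Base, metaclass=Meta, flag=True, label="demo"):
    pass


print(C.__bases__)  # positional entries become the base classes
print(C.received)   # remaining keywords were passed to the metaclass
```

The `metaclass` keyword itself is consumed when the metaclass is selected, so it does not appear among the keywords forwarded to `Meta`.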

    The syntaxes of the for statement and comprehensions are different.

    @ncoghlan
    Contributor

    In a function call, f(x for x in iterable) is roughly equivalent to f(iter(iterable)), not f(*iterable) (the genexp based equivalent of the latter would be `f(*(x for x in iterable))`).

    Thus the base class list is no different from any other argument list in this case - it's just that generator objects aren't valid base classes.
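The difference between the two call forms described above can be sketched as follows. The function `f` and the sample list are assumptions made for the example; `f` simply returns whatever positional arguments it was given.

```python
import types


def f(*args):
    """Return the positional arguments exactly as received."""
    return args


iterable = [1, 2, 3]

# A bare genexp is passed as ONE argument: the generator object itself.
single = f(x for x in iterable)
# Unpacking the genexp passes each produced item as its own argument.
unpacked = f(*(x for x in iterable))

print(len(single), isinstance(single[0], types.GeneratorType))
print(unpacked)
```

The same reasoning applies to a base class list: a bare generator expression there would supply a single generator object rather than a sequence of classes.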

    Getting back on topic for this particular bug fix though: as noted in my last PR review, I think the latest version goes too far by disallowing `@deco(x for x in iterable)` and `class C(x for x in iterable):`. While semantically questionable, there's nothing *syntactically* invalid about those - they pass a single generator expression, and that generator expression is correctly surrounded by parentheses. There's no more reason to prohibit a genexp in either of those situations at compile time than there is to prohibit a list comprehension.

    @serhiy-storchaka
    Member Author

    The problem with these constructions is that they are not allowed by the Python language specification. It would have to be explicitly changed to allow them, and this change should be accepted by Guido.

    @ncoghlan
    Contributor

    I created https://bugs.python.org/issue32023 to explicitly cover the base class list case, and after checking the language spec, I agree that case should be a syntax error.

    However, @deco(x for x in []) should not be a syntax error, as:

    @serhiy-storchaka
    Member Author

    No, it doesn't match the "@dotted_name(arg_list)" pattern.

    decorator: "@" dotted_name ["(" [argument_list [","]] ")"] NEWLINE
    call: primary "(" [argument_list [","] | comprehension] ")"
    argument_list: positional_arguments ["," starred_and_keywords]
                     ["," keywords_arguments]
                   | starred_and_keywords ["," keywords_arguments]
                   | keywords_arguments

    The call syntax contains a special case for generator expressions. The decorator expression syntax doesn't contain it. You should change the grammar rule to

    decorator: "@" dotted_name ["(" [argument_list [","] | comprehension] ")"] NEWLINE

    for supporting this syntax. Please open a separate topic on Python-Dev for discussing this language change.

    @Cryvate
    Mannequin

    Cryvate mannequin commented Nov 14, 2017

    I think this showcases how difficult it is to get this right: it requires carefully reading the EBNF in the language spec, not just the text, and the behaviour is unexpected.

    @skrah
    Mannequin

    skrah mannequin commented Nov 14, 2017

    I think "ambiguous" is not the right word. If a single argument
    can be a non-parenthesized generator and all arguments can be
    followed by a trailing comma, it's clear.

    The language spec is often behind in my experience.

    @ncoghlan
    Contributor

    OK, I've filed https://bugs.python.org/issue32024 to cover the decorator syntax discrepancy.

    So I'd still prefer to restrict the patch for *this* issue to just the genuinely ambiguous case, and leave the unambiguous-but-inconsistent-with-the-language-spec cases to their respective issues.

    @skrah
    Mannequin

    skrah mannequin commented Nov 14, 2017

    I would prefer to do nothing about the subject of this issue. I still
    don't see any ambiguity, except in a very broad colloquial sense.

    Why introduce another special case?

    @ncoghlan
    Contributor

    If limited to the original scope, this isn't a new special case, it's fixing a bug in the implementation of the existing special case (where it's ignoring the trailing comma when it shouldn't be).

    If it hadn't been for the scope creep to include a couple of other cases where the implementation is arguably more permissive than the language spec says it should, it would have already been merged.

    @skrah
    Mannequin

    skrah mannequin commented Nov 14, 2017

    On Tue, Nov 14, 2017 at 01:31:52PM +0000, Nick Coghlan wrote:

    If limited to the original scope, this isn't a new special case, it's fixing a bug in the implementation of the existing special case (where it's ignoring the trailing comma when it shouldn't be).

    This ignores the trailing comma:

        f([1,2,3],)

    And this:

        f(x for x in [1,2,3],)

    Seems logical to me.

    Do you want to allow the 1,2 to be read as a tuple?

       f(x for x in 1,2)

    @serhiy-storchaka
    Member Author

    I would prefer to fix all related cases in one issue, so that all the examples are in one place and there is only one reference. All these cases are caused by the limitations of the parser used in CPython and by the use of different grammar rules. If you want to change the language specification for decorator expressions and class definitions, it should be discussed before merging PR 4382, and I would make the corresponding changes in it. In any case, it is harder to fix bpo-32023 without fixing the original issue.

    @serhiy-storchaka
    Member Author

    Stefan, [1,2,3] is an expression, but x for x in [1,2,3] is not. If you want to change the Python language specification, please open a topic on Python-Dev and provide the rationale.

    @skrah
    Mannequin

    skrah mannequin commented Nov 14, 2017

    Yes Sir!

    @gvanrossum
    Member

    It's a (small) mistake that we didn't make the syntax for argument lists in decorators the same as argument lists everywhere else, and that should be fixed to allow exactly what's allowed in regular calls. (That syntax is weird because we don't want e.g. @foo().bar but we do want e.g. @foo.bar().)

    I am honestly not sure that we should change anything here, since the meaning is not actually ambiguous: the syntax for generator expressions doesn't allow e.g. x for x in 1, 2, 3 -- you have to write x for x in (1, 2, 3). (A regular for-loop does allow this, but there the context makes it unambiguous -- that's why genexprs are different.)
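A short sketch of the contrast drawn here between for-loops and generator expressions, using a hypothetical `compiles` helper that only parses each snippet. It assumes a Python 3 interpreter; `f` is never called because nothing is executed.

```python
def compiles(source):
    """Return True if `source` parses as a module, False on SyntaxError."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


# A regular for-loop accepts an unparenthesized tuple after "in".
print(compiles("for x in 1, 2, 3: pass"))
# A generator expression does not.
print(compiles("g = (x for x in 1, 2, 3)"))
# With explicit parentheses around the tuple it is fine.
print(compiles("g = (x for x in (1, 2, 3))"))
# So the 1, 2 in a call cannot be read as a tuple belonging to the genexp.
print(compiles("f(x for x in 1, 2)"))
```

Only the first and third snippets are expected to compile, which is why the comma after a bare genexp in a call can never be absorbed into the iterable.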

    But I'm fine with changing it, as long as we do it consistently.

    @serhiy-storchaka
    Member Author

    New changeset 9165f77 by Serhiy Storchaka in branch 'master':
    bpo-32012: Disallow trailing comma after genexpr without parenthesis. (GH-4382)
    9165f77

    @ncoghlan
    Contributor

    With bpo-32023 (base class lists) and bpo-32024 (fixing the documentation for decorator factory function calls) covering the other refinements, this particular issue is done now.

    The most recent PR is the one for bpo-32023.

    @serhiy-storchaka
    Member Author

    New changeset 4b8a7f5 by Serhiy Storchaka in branch 'master':
    Revert "closes bpo-27494: Fix 2to3 handling of trailing comma after a generator expression (GH-3771)" (GH-8241)
    4b8a7f5

    @miss-islington
    Contributor

    New changeset 9ecbe33 by Miss Islington (bot) in branch '3.7':
    Revert "closes bpo-27494: Fix 2to3 handling of trailing comma after a generator expression (GH-3771)" (GH-8241)
    9ecbe33

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022