Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kwargs regression when there are multiple entries with the same key #79815

Closed
iceb0y mannequin opened this issue Jan 2, 2019 · 8 comments
Closed

kwargs regression when there are multiple entries with the same key #79815

iceb0y mannequin opened this issue Jan 2, 2019 · 8 comments
Assignees
Labels
3.8 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error

Comments

@iceb0y
Copy link
Mannequin

iceb0y mannequin commented Jan 2, 2019

BPO 35634
Nosy @terryjreedy, @serhiy-storchaka, @iceb0y, @ammaraskar, @remilapeyre, @tirkarthi
PRs
  • bpo-35634: Raise an error when first passed kwargs contains duplicated keys. #11438
  • bpo-35634: Raise an error when first passed kwargs contains duplicated keys. #11438
  • bpo-35634: Raise an error when first passed kwargs contains duplicated keys. #11438
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/serhiy-storchaka'
    closed_at = <Date 2019-01-12.08:13:21.197>
    created_at = <Date 2019-01-02.01:04:33.132>
    labels = ['interpreter-core', 'type-bug', '3.8']
    title = 'kwargs regression when there are multiple entries with the same key'
    updated_at = <Date 2019-01-12.08:13:21.196>
    user = 'https://github.com/iceb0y'

    bugs.python.org fields:

    activity = <Date 2019-01-12.08:13:21.196>
    actor = 'serhiy.storchaka'
    assignee = 'serhiy.storchaka'
    closed = True
    closed_date = <Date 2019-01-12.08:13:21.197>
    closer = 'serhiy.storchaka'
    components = ['Interpreter Core']
    creation = <Date 2019-01-02.01:04:33.132>
    creator = 'iceboy'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 35634
    keywords = ['patch', 'patch', 'patch']
    message_count = 8.0
    messages = ['332849', '332859', '332860', '332874', '333002', '333050', '333053', '333522']
    nosy_count = 6.0
    nosy_names = ['terry.reedy', 'serhiy.storchaka', 'iceboy', 'ammar2', 'remi.lapeyre', 'xtreak']
    pr_nums = ['11438', '11438', '11438']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue35634'
    versions = ['Python 3.8']

    @iceb0y
    Copy link
    Mannequin Author

    iceb0y mannequin commented Jan 2, 2019

    Using the multidict package on pypi to illustrate the problem.

    Python 3.5.3 (default, Sep 27 2018, 17:25:39) 
    [GCC 6.3.0 20170516] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import multidict
    >>> d = multidict.CIMultiDict([('a', 1), ('a', 2)])
    >>> def foo(**kwargs): pass
    ... 
    >>> foo(**d)
    >>> foo(**{}, **d)
    
    Python 3.6.7 (default, Oct 21 2018, 08:08:16) 
    [GCC 8.2.0] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import multidict
    >>> d = multidict.CIMultiDict([('a', 1), ('a', 2)])
    >>> def foo(**kwargs): pass
    ... 
    >>> foo(**d)
    >>> foo(**{}, **d)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: foo() got multiple values for keyword argument 'a'

    (1) foo(d)
    (2) foo(
    {}, **d)

    (1) works fine in both versions but (2) only works in Python 3.5 but raises TypeError in Python 3.6. This should be a regression. We should either make both expressions work or raises error.

    @ammaraskar
    Copy link
    Member

    This change in difference is caused by e036ef8

    The old code checked for duplicate arguments by essentially running set().intersection(d) and since set().intersection(['a', 'a']) is the empty set, it doesn't register as a duplicated argument. The newer code iterates over the keys in order to merge the dictionaries.

    Note however that 3.5 is now is in security only mode: https://devguide.python.org/#branchstatus so its unlikely this behavior will be back-ported.

    @ammaraskar ammaraskar added the type-bug An unexpected behavior, bug, or error label Jan 2, 2019
    @iceb0y
    Copy link
    Mannequin Author

    iceb0y mannequin commented Jan 2, 2019

    I feel like we should not check the argument and allow overriding. If the argument checking is desired, can we also check when there is only a single kwargs? Currently foo(**d) still works in Python 3.6 with duplicated keys.

    @remilapeyre
    Copy link
    Mannequin

    remilapeyre mannequin commented Jan 2, 2019

    I prepared a patch to make the check when there is only one dict, it may be the less surprising change.

    BTW, I got lost when looking for the root cause of this change because grepping for the error message did not work as it has line breaks:

    •                        PyErr_Format(PyExc_TypeError,
      
    •                                "%.200s%.200s got multiple "
      
    •                                "values for keyword argument '%U'",
      
    •                                PyEval_GetFuncName(func),
      
    •                                PyEval_GetFuncDesc(func),
      
    •                                key);
      

    Can we add a note like https://www.kernel.org/doc/html/v4.12/process/coding-style.html#breaking-long-lines-and-strings to the PEP-7?

    However, never break user-visible strings such as printk messages, because that breaks the ability to grep for them.

    It would help in such situations.

    @terryjreedy
    Copy link
    Member

    Serhiy, nosying you because Ammar identified your commit as relevant.
    e036ef8
    ---

    3.6 is also security-fix only.

    Normally, code bug reports need a minimal, reproducible, initially-failing test case that only uses the stdlib, not 3rd party code. A test case would have to include a simplified version of a multidict.

    The problem here is that a multidict with duplicate keys is not a proper mapping, in spite of having a MutableMapping interface. (Just curious, what does d['a'] return?).

    The purpose of the package is to meet the needs of HTTP Headers and URL query strings (and other situations) where the 'keys' are value-type tags, not true mapping keys. Another example would be a bibliography entry that tags each of multiple author names with 'Author:'. (Aside: A deficiency of git (github) is allowing only 1 author key and only one github username as the value.)

    Is an object with duplicate keys legal for **expression in calls?
    https://docs.python.org/3/reference/expressions.html#index-48
    says that expression must evaluate to a 'mapping'. The glossary entry
    https://docs.python.org/3/glossary.html#term-mapping says
    "A container object that supports arbitrary key lookups and implements the methods specified in the Mapping or MutableMapping abstract base classes." followed by unique-key examples. To me, 'key lookup' implies unique keys. The Mapping functions include 'keys' and 'items'. What signature? To return set-like views, the keys should be unique.

    If we take the call to be a bug, is CPython *obligated* to immediately raise an exception? In other words, must every call with **mapping take the time to check for duplicates because someone might pass a dup-key 'mapping'.

    @terryjreedy terryjreedy added 3.7 (EOL) end of life 3.8 only security fixes labels Jan 4, 2019
    @serhiy-storchaka
    Copy link
    Member

    Ammar is right. This is a consequence of bpo-27358. bpo-27358 was merely an optimization necessary for bytecode changes in 3.6. It was not noticed that it changes also the behavior for multidict kwargs. But I think that it was correct change. Specifying multiple keyword arguments with the same name is an error, and it does not matter whether names are duplicated in different kwargs or in the same kwarg. I think it is worth to ensure that duplicated names are forbidden in the case of a single kwarg too. Since kwarg should be converted to a dict, this check should not add significant overhead.

    3.5 and 3.6 are in security-only fixes mode. And I think it is not worth to do such changes in 3.7. This could break working user code (even if can be considered incorrect), and we should avoid a breakage in bugfix releases without need.

    I take it.

    @serhiy-storchaka serhiy-storchaka added interpreter-core (Objects, Python, Grammar, and Parser dirs) and removed 3.7 (EOL) end of life labels Jan 5, 2019
    @serhiy-storchaka serhiy-storchaka self-assigned this Jan 5, 2019
    @tirkarthi
    Copy link
    Member

    (Just curious, what does d['a'] return?)

    I was curious too and some results

    $ python
    Python 3.7.1rc2 (v3.7.1rc2:6c06ef7dc3, Oct 13 2018, 05:10:29)
    [Clang 6.0 (clang-600.0.57)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import multidict
    >>> d = multidict.CIMultiDict([('a', 1), ('a', 2)])
    >>> d['a']
    1
    >>> d.keys()
    _KeysView('a', 'a')
    >>> d.values()
    _ValuesView(1, 2)
    >>> d.items()
    _ItemsView('a': 1, 'a': 2)
    >>> dict(d)
    {'a': 1}

    In the original issue where PEP-448 was implemented there were some discussions around duplicates in kwargs. msg234413 for Guido's call on duplicates and the messages below discuss some more scenarios about overriding/rejecting duplicates. PEP-448 also has a note on duplicates in https://www.python.org/dev/peps/pep-0448/#specification

    @serhiy-storchaka
    Copy link
    Member

    New changeset f1ec3ce by Serhiy Storchaka in branch 'master':
    bpo-35634: Raise an error when first passed kwargs contains duplicated keys. (GH-11438)
    f1ec3ce

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.8 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    4 participants