kwargs regression when there are multiple entries with the same key #79815

iceb0y · 2019-01-02T01:04:33Z

BPO	35634
Nosy	@terryjreedy, @serhiy-storchaka, @iceb0y, @ammaraskar, @remilapeyre, @tirkarthi
PRs	bpo-35634: Raise an error when first passed kwargs contains duplicated keys. #11438 bpo-35634: Raise an error when first passed kwargs contains duplicated keys. #11438 bpo-35634: Raise an error when first passed kwargs contains duplicated keys. #11438

^{Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.}

Show more details

GitHub fields:

assignee = 'https://github.com/serhiy-storchaka'
closed_at = <Date 2019-01-12.08:13:21.197>
created_at = <Date 2019-01-02.01:04:33.132>
labels = ['interpreter-core', 'type-bug', '3.8']
title = 'kwargs regression when there are multiple entries with the same key'
updated_at = <Date 2019-01-12.08:13:21.196>
user = 'https://github.com/iceb0y'

bugs.python.org fields:

activity = <Date 2019-01-12.08:13:21.196>
actor = 'serhiy.storchaka'
assignee = 'serhiy.storchaka'
closed = True
closed_date = <Date 2019-01-12.08:13:21.197>
closer = 'serhiy.storchaka'
components = ['Interpreter Core']
creation = <Date 2019-01-02.01:04:33.132>
creator = 'iceboy'
dependencies = []
files = []
hgrepos = []
issue_num = 35634
keywords = ['patch', 'patch', 'patch']
message_count = 8.0
messages = ['332849', '332859', '332860', '332874', '333002', '333050', '333053', '333522']
nosy_count = 6.0
nosy_names = ['terry.reedy', 'serhiy.storchaka', 'iceboy', 'ammar2', 'remi.lapeyre', 'xtreak']
pr_nums = ['11438', '11438', '11438']
priority = 'normal'
resolution = 'fixed'
stage = 'resolved'
status = 'closed'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue35634'
versions = ['Python 3.8']

iceb0y · 2019-01-02T01:04:33Z

Using the multidict package on pypi to illustrate the problem.

Python 3.5.3 (default, Sep 27 2018, 17:25:39) 
[GCC 6.3.0 20170516] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import multidict
>>> d = multidict.CIMultiDict([('a', 1), ('a', 2)])
>>> def foo(**kwargs): pass
... 
>>> foo(**d)
>>> foo(**{}, **d)

Python 3.6.7 (default, Oct 21 2018, 08:08:16) 
[GCC 8.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import multidict
>>> d = multidict.CIMultiDict([('a', 1), ('a', 2)])
>>> def foo(**kwargs): pass
... 
>>> foo(**d)
>>> foo(**{}, **d)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: foo() got multiple values for keyword argument 'a'

(1) foo(d)
(2) foo({}, **d)

(1) works fine in both versions but (2) only works in Python 3.5 but raises TypeError in Python 3.6. This should be a regression. We should either make both expressions work or raises error.

ammaraskar · 2019-01-02T08:03:07Z

This change in difference is caused by e036ef8

The old code checked for duplicate arguments by essentially running set().intersection(d) and since set().intersection(['a', 'a']) is the empty set, it doesn't register as a duplicated argument. The newer code iterates over the keys in order to merge the dictionaries.

Note however that 3.5 is now is in security only mode: https://devguide.python.org/#branchstatus so its unlikely this behavior will be back-ported.

iceb0y · 2019-01-02T10:07:29Z

I feel like we should not check the argument and allow overriding. If the argument checking is desired, can we also check when there is only a single kwargs? Currently foo(**d) still works in Python 3.6 with duplicated keys.

remilapeyre · 2019-01-02T13:05:45Z

I prepared a patch to make the check when there is only one dict, it may be the less surprising change.

BTW, I got lost when looking for the root cause of this change because grepping for the error message did not work as it has line breaks:

                       PyErr_Format(PyExc_TypeError,

                               "%.200s%.200s got multiple "

                               "values for keyword argument '%U'",

                               PyEval_GetFuncName(func),

                               PyEval_GetFuncDesc(func),

```
                               key);
```

Can we add a note like https://www.kernel.org/doc/html/v4.12/process/coding-style.html#breaking-long-lines-and-strings to the PEP-7?

However, never break user-visible strings such as printk messages, because that breaks the ability to grep for them.

It would help in such situations.

terryjreedy · 2019-01-04T21:17:23Z

Serhiy, nosying you because Ammar identified your commit as relevant.
e036ef8
---

3.6 is also security-fix only.

Normally, code bug reports need a minimal, reproducible, initially-failing test case that only uses the stdlib, not 3rd party code. A test case would have to include a simplified version of a multidict.

The problem here is that a multidict with duplicate keys is not a proper mapping, in spite of having a MutableMapping interface. (Just curious, what does d['a'] return?).

The purpose of the package is to meet the needs of HTTP Headers and URL query strings (and other situations) where the 'keys' are value-type tags, not true mapping keys. Another example would be a bibliography entry that tags each of multiple author names with 'Author:'. (Aside: A deficiency of git (github) is allowing only 1 author key and only one github username as the value.)

Is an object with duplicate keys legal for **expression in calls?
https://docs.python.org/3/reference/expressions.html#index-48
says that expression must evaluate to a 'mapping'. The glossary entry
https://docs.python.org/3/glossary.html#term-mapping says
"A container object that supports arbitrary key lookups and implements the methods specified in the Mapping or MutableMapping abstract base classes." followed by unique-key examples. To me, 'key lookup' implies unique keys. The Mapping functions include 'keys' and 'items'. What signature? To return set-like views, the keys should be unique.

If we take the call to be a bug, is CPython *obligated* to immediately raise an exception? In other words, must every call with **mapping take the time to check for duplicates because someone might pass a dup-key 'mapping'.

serhiy-storchaka · 2019-01-05T08:54:06Z

Ammar is right. This is a consequence of bpo-27358. bpo-27358 was merely an optimization necessary for bytecode changes in 3.6. It was not noticed that it changes also the behavior for multidict kwargs. But I think that it was correct change. Specifying multiple keyword arguments with the same name is an error, and it does not matter whether names are duplicated in different kwargs or in the same kwarg. I think it is worth to ensure that duplicated names are forbidden in the case of a single kwarg too. Since kwarg should be converted to a dict, this check should not add significant overhead.

3.5 and 3.6 are in security-only fixes mode. And I think it is not worth to do such changes in 3.7. This could break working user code (even if can be considered incorrect), and we should avoid a breakage in bugfix releases without need.

I take it.

tirkarthi · 2019-01-05T12:15:27Z

(Just curious, what does d['a'] return?)

I was curious too and some results

$ python
Python 3.7.1rc2 (v3.7.1rc2:6c06ef7dc3, Oct 13 2018, 05:10:29)
[Clang 6.0 (clang-600.0.57)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import multidict
>>> d = multidict.CIMultiDict([('a', 1), ('a', 2)])
>>> d['a']
1
>>> d.keys()
_KeysView('a', 'a')
>>> d.values()
_ValuesView(1, 2)
>>> d.items()
_ItemsView('a': 1, 'a': 2)
>>> dict(d)
{'a': 1}

In the original issue where PEP-448 was implemented there were some discussions around duplicates in kwargs. msg234413 for Guido's call on duplicates and the messages below discuss some more scenarios about overriding/rejecting duplicates. PEP-448 also has a note on duplicates in https://www.python.org/dev/peps/pep-0448/#specification

serhiy-storchaka · 2019-01-12T08:12:26Z

New changeset f1ec3ce by Serhiy Storchaka in branch 'master':
bpo-35634: Raise an error when first passed kwargs contains duplicated keys. (GH-11438)
f1ec3ce

ammaraskar added the type-bug An unexpected behavior, bug, or error label Jan 2, 2019

terryjreedy added 3.7 (EOL) end of life 3.8 only security fixes labels Jan 4, 2019

serhiy-storchaka added interpreter-core (Objects, Python, Grammar, and Parser dirs) and removed 3.7 (EOL) end of life labels Jan 5, 2019

serhiy-storchaka self-assigned this Jan 5, 2019

serhiy-storchaka closed this as completed Jan 12, 2019

ezio-melotti transferred this issue from another repository Apr 10, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

kwargs regression when there are multiple entries with the same key #79815

kwargs regression when there are multiple entries with the same key #79815

iceb0y mannequin commented Jan 2, 2019

iceb0y mannequin commented Jan 2, 2019

ammaraskar commented Jan 2, 2019

iceb0y mannequin commented Jan 2, 2019

remilapeyre mannequin commented Jan 2, 2019

terryjreedy commented Jan 4, 2019

serhiy-storchaka commented Jan 5, 2019

tirkarthi commented Jan 5, 2019

serhiy-storchaka commented Jan 12, 2019

kwargs regression when there are multiple entries with the same key #79815

kwargs regression when there are multiple entries with the same key #79815

Comments

iceb0y mannequin commented Jan 2, 2019

iceb0y mannequin commented Jan 2, 2019

ammaraskar commented Jan 2, 2019

iceb0y mannequin commented Jan 2, 2019

remilapeyre mannequin commented Jan 2, 2019

terryjreedy commented Jan 4, 2019

serhiy-storchaka commented Jan 5, 2019

tirkarthi commented Jan 5, 2019

serhiy-storchaka commented Jan 12, 2019