Issue 35634: kwargs regression when there are multiple entries with the same key

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/79815

classification

Title:	kwargs regression when there are multiple entries with the same key
Type:	behavior	Stage:	resolved
Components:	Interpreter Core	Versions:	Python 3.8

process

Status:	closed	Resolution:	fixed
Dependencies:		Superseder:
Assigned To:	serhiy.storchaka	Nosy List:	ammar2, iceboy, remi.lapeyre, serhiy.storchaka, terry.reedy, xtreak
Priority:	normal	Keywords:	patch, patch, patch

Created on 2019-01-02 01:04 by iceboy, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Pull Requests
URL	Status	Linked	Edit
PR 11438	merged	serhiy.storchaka, 2019-01-05 16:42
PR 11438	merged	serhiy.storchaka, 2019-01-05 16:42
PR 11438	merged	serhiy.storchaka, 2019-01-05 16:42

Messages (8)
msg332849 - (view)	Author: iceboy (iceboy) *	Date: 2019-01-02 01:04
Using the multidict package on pypi to illustrate the problem. Python 3.5.3 (default, Sep 27 2018, 17:25:39) [GCC 6.3.0 20170516] on linux Type "help", "copyright", "credits" or "license" for more information. >>> import multidict >>> d = multidict.CIMultiDict([('a', 1), ('a', 2)]) >>> def foo(kwargs): pass ... >>> foo(d) >>> foo({}, d) Python 3.6.7 (default, Oct 21 2018, 08:08:16) [GCC 8.2.0] on linux Type "help", "copyright", "credits" or "license" for more information. >>> import multidict >>> d = multidict.CIMultiDict([('a', 1), ('a', 2)]) >>> def foo(kwargs): pass ... >>> foo(d) >>> foo({}, d) Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: foo() got multiple values for keyword argument 'a' (1) foo(d) (2) foo({}, **d) (1) works fine in both versions but (2) only works in Python 3.5 but raises TypeError in Python 3.6. This should be a regression. We should either make both expressions work or raises error.
msg332859 - (view)	Author: Ammar Askar (ammar2) *	Date: 2019-01-02 08:03
This change in difference is caused by https://github.com/python/cpython/commit/e036ef8fa29f27d57fe9f8cef8d931d4122d8223 The old code checked for duplicate arguments by essentially running `set().intersection(d)` and since `set().intersection(['a', 'a'])` is the empty set, it doesn't register as a duplicated argument. The newer code iterates over the keys in order to merge the dictionaries. Note however that 3.5 is now is in security only mode: https://devguide.python.org/#branchstatus so its unlikely this behavior will be back-ported.
msg332860 - (view)	Author: iceboy (iceboy) *	Date: 2019-01-02 10:07
I feel like we should not check the argument and allow overriding. If the argument checking is desired, can we also check when there is only a single kwargs? Currently `foo(**d)` still works in Python 3.6 with duplicated keys.
msg332874 - (view)	Author: Rémi Lapeyre (remi.lapeyre) *	Date: 2019-01-02 13:05
I prepared a patch to make the check when there is only one dict, it may be the less surprising change. BTW, I got lost when looking for the root cause of this change because grepping for the error message did not work as it has line breaks: - PyErr_Format(PyExc_TypeError, - "%.200s%.200s got multiple " - "values for keyword argument '%U'", - PyEval_GetFuncName(func), - PyEval_GetFuncDesc(func), - key); Can we add a note like https://www.kernel.org/doc/html/v4.12/process/coding-style.html#breaking-long-lines-and-strings to the PEP7? > However, never break user-visible strings such as printk messages, because that breaks the ability to grep for them. It would help in such situations.
msg333002 - (view)	Author: Terry J. Reedy (terry.reedy) *	Date: 2019-01-04 21:17
Serhiy, nosying you because Ammar identified your commit as relevant. https://github.com/python/cpython/commit/e036ef8fa29f27d57fe9f8cef8d931d4122d8223 --- 3.6 is also security-fix only. Normally, code bug reports need a minimal, reproducible, initially-failing test case that only uses the stdlib, not 3rd party code. A test case would have to include a simplified version of a multidict. The problem here is that a multidict with duplicate keys is not a proper mapping, in spite of having a MutableMapping interface. (Just curious, what does d['a'] return?). The purpose of the package is to meet the needs of HTTP Headers and URL query strings (and other situations) where the 'keys' are value-type tags, not true mapping keys. Another example would be a bibliography entry that tags each of multiple author names with 'Author:'. (Aside: A deficiency of git (github) is allowing only 1 author key and only one github username as the value.) Is an object with duplicate keys legal for *expression in calls? https://docs.python.org/3/reference/expressions.html#index-48 says that expression must evaluate to a 'mapping'. The glossary entry https://docs.python.org/3/glossary.html#term-mapping says "A container object that supports arbitrary key lookups and implements the methods specified in the Mapping or MutableMapping abstract base classes." followed by unique-key examples. To me, 'key lookup' implies unique keys. The Mapping functions include 'keys' and 'items'. What signature? To return set-like views, the keys should be unique. If we take the call to be a bug, is CPython obligated* to immediately raise an exception? In other words, must every call with **mapping take the time to check for duplicates because someone might pass a dup-key 'mapping'.
msg333050 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2019-01-05 08:54
Ammar is right. This is a consequence of issue27358. Issue27358 was merely an optimization necessary for bytecode changes in 3.6. It was not noticed that it changes also the behavior for multidict kwargs. But I think that it was correct change. Specifying multiple keyword arguments with the same name is an error, and it does not matter whether names are duplicated in different kwargs or in the same kwarg. I think it is worth to ensure that duplicated names are forbidden in the case of a single kwarg too. Since kwarg should be converted to a dict, this check should not add significant overhead. 3.5 and 3.6 are in security-only fixes mode. And I think it is not worth to do such changes in 3.7. This could break working user code (even if can be considered incorrect), and we should avoid a breakage in bugfix releases without need. I take it.
msg333053 - (view)	Author: Karthikeyan Singaravelan (xtreak) *	Date: 2019-01-05 12:15
> (Just curious, what does d['a'] return?) I was curious too and some results $ python Python 3.7.1rc2 (v3.7.1rc2:6c06ef7dc3, Oct 13 2018, 05:10:29) [Clang 6.0 (clang-600.0.57)] on darwin Type "help", "copyright", "credits" or "license" for more information. >>> import multidict >>> d = multidict.CIMultiDict([('a', 1), ('a', 2)]) >>> d['a'] 1 >>> d.keys() _KeysView('a', 'a') >>> d.values() _ValuesView(1, 2) >>> d.items() _ItemsView('a': 1, 'a': 2) >>> dict(d) {'a': 1} In the original issue where PEP 448 was implemented there were some discussions around duplicates in kwargs. msg234413 for Guido's call on duplicates and the messages below discuss some more scenarios about overriding/rejecting duplicates. PEP 448 also has a note on duplicates in https://www.python.org/dev/peps/pep-0448/#specification
msg333522 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2019-01-12 08:12
New changeset f1ec3cefad4639797c37eaa8c074830188fa0a44 by Serhiy Storchaka in branch 'master': bpo-35634: Raise an error when first passed kwargs contains duplicated keys. (GH-11438) https://github.com/python/cpython/commit/f1ec3cefad4639797c37eaa8c074830188fa0a44

History
Date	User	Action	Args
2022-04-11 14:59:09	admin	set	github: 79815
2019-01-12 08:13:21	serhiy.storchaka	set	keywords: patch, patch, patch status: open -> closed resolution: fixed stage: patch review -> resolved
2019-01-12 08:12:26	serhiy.storchaka	set	messages: + msg333522
2019-01-05 16:42:58	serhiy.storchaka	set	keywords: + patch stage: test needed -> patch review pull_requests: + pull_request10877
2019-01-05 16:42:46	serhiy.storchaka	set	keywords: + patch stage: test needed -> test needed pull_requests: + pull_request10876
2019-01-05 16:42:35	serhiy.storchaka	set	keywords: + patch stage: test needed -> test needed pull_requests: + pull_request10875
2019-01-05 12:15:27	xtreak	set	nosy: + xtreak messages: + msg333053
2019-01-05 08:54:06	serhiy.storchaka	set	assignee: serhiy.storchaka messages: + msg333050 components: + Interpreter Core versions: - Python 3.7
2019-01-04 21:17:23	terry.reedy	set	versions: + Python 3.7, Python 3.8, - Python 3.6 nosy: + serhiy.storchaka, terry.reedy messages: + msg333002 stage: test needed
2019-01-02 13:05:44	remi.lapeyre	set	nosy: + remi.lapeyre messages: + msg332874
2019-01-02 10:07:28	iceboy	set	messages: + msg332860
2019-01-02 08:03:07	ammar2	set	type: behavior messages: + msg332859 nosy: + ammar2
2019-01-02 01:04:33	iceboy	create