This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: dict() built-in fails on iterators with a "keys" attribute
Type: behavior Stage: resolved
Components: Documentation, Interpreter Core Versions: Python 3.6, Python 3.2, Python 3.3, Python 3.4, Python 3.5, Python 2.7
process
Status: closed Resolution: out of date
Dependencies: Superseder:
Assigned To: rhettinger Nosy List: christian.barcenas, docs@python, martin.panter, r.david.murray, rhettinger
Priority: low Keywords:

Created on 2015-07-18 08:54 by christian.barcenas, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Messages (7)
msg246890 - (view) Author: Christian Barcenas (christian.barcenas) * Date: 2015-07-18 08:54
I noticed an inconsistency today between the dict() documentation vs. implementation.

The documentation for the dict() built-in [1] states that the function accepts an optional positional argument that is either a mapping object [2] or an iterable object [3]. 

Consider the following:

    import collections.abc
    
    class MyIterable(object):
        def __init__(self):
            self._data = [('one', 1), ('two', 2)]
            
        def __iter__(self):
            return iter(self._data)
    
    class MyIterableWithKeysMethod(MyIterable):
        def keys(self):
            return "And now for something completely different"
    
    class MyIterableWithKeysAttribute(MyIterable):
        keys = "It's just a flesh wound!"
    
    assert issubclass(MyIterable, collections.abc.Iterable)
    assert issubclass(MyIterableWithKeysMethod, collections.abc.Iterable)
    assert issubclass(MyIterableWithKeysAttribute, collections.abc.Iterable)
    assert not issubclass(MyIterable, collections.abc.Mapping)
    assert not issubclass(MyIterableWithKeysMethod, collections.abc.Mapping)
    assert not issubclass(MyIterableWithKeysAttribute, collections.abc.Mapping)
    
    # OK
    assert dict(MyIterable()) == {'one': 1, 'two': 2}

    # Traceback (most recent call last):
    #   File "<stdin>", line 1, in <module>
    # TypeError: 'MyIterableWithKeysMethod' object is not subscriptable
    assert dict(MyIterableWithKeysMethod()) == {'one': 1, 'two': 2}

    # Traceback (most recent call last):
    # File "<stdin>", line 1, in <module>
    # TypeError: attribute of type 'str' is not callable
    assert dict(MyIterableWithKeysAttribute()) == {'one': 1, 'two': 2}

The last two assertions should not fail, and it appears that the offending code can be found in Objects/dictobject.c's dict_update_common:

    else if (arg != NULL) {
        _Py_IDENTIFIER(keys);
        if (_PyObject_HasAttrId(arg, &PyId_keys))
            result = PyDict_Merge(self, arg, 1);
        else
            result = PyDict_MergeFromSeq2(self, arg, 1);
    }

PyDict_Merge is used to merge key-value pairs if the optional parameter is a mapping, and PyDict_MergeFromSeq2 is used if the parameter is an iterable.

My immediate thought was to substitute the _PyObject_HasAttrId call with PyMapping_Check which I believe would work in 2.7, but due to #5945 I don't think this fix would work in 3.x.

Thoughts?

[1] https://docs.python.org/3.6/library/stdtypes.html#dict
[2] https://docs.python.org/3.6/glossary.html#term-mapping
[3] https://docs.python.org/3.6/glossary.html#term-iterable
msg246892 - (view) Author: Christian Barcenas (christian.barcenas) * Date: 2015-07-18 09:07
Should have clarified that the specific issue that is outlined in #5945 is that PyMapping_Check returns 1 on sequences (e.g. lists), which would mean something like x = [('one', 1), ('two', 2)]; dict(x) would fail in 3.x because x would be incorrectly evaluated as a mapping rather than as an iterable.
msg246893 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2015-07-18 11:45
Well, this is an example of duck typing, something we commonly do in Python.  I'm not convinced this is a bug.
msg246899 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-07-18 13:06
It is at least an omission from the documentation. The glossary <https://docs.python.org/dev/glossary.html#term-mapping> refers to the Mapping ABC. From Christian’s point of view, the quack of an iterator with just a “keys” attribute sounds more like an iterator than a mapping.

I think the documentation for the dict() constructor should say how to ensure the iterable and mapping modes are triggered. Perhaps dict.update() should also, because it appears to also treat non-dict() “mappings” differently to plain iterators.
msg246907 - (view) Author: Christian Barcenas (christian.barcenas) * Date: 2015-07-18 17:29
I'm aware of duck typing but I don't think this is the right place for it. (Although ABCs are ostensibly a kind of duck typing, as they do not require implementing classes to inherit from the ABC.)

As Martin noticed, the glossary directly defines a "mapping" as a class that implements the Mapping ABC, and likewise the definition of an "iterable" under the glossary would satisfy the Iterable ABC.

I think this is not just a documentation issue: the "quack" of a mapping has been well-defined and consistent since Python 2.7. Same for iterables.

(It is worth noting that 2.6's definition of mapping was indeed just any object with a __getitem__ method <https://docs.python.org/2.7/glossary.html#term-mapping>)

> I think the documentation for the dict() constructor should say how to ensure the iterable and mapping modes are triggered.

Doesn't it do this already by referencing the definitions of "iterable" and "mapping"? These ABCs are used in other built-ins such as any() and eval().
msg246927 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2015-07-19 00:23
> I'm aware of duck typing but I don't think this 
> is the right place for it. 

The code for dicts is very old, stable, and unlikely to change.  Also, the logic of checking for .keys() is immortalized in the collections.abc.MutableMapping update() method.

For the most part, consumers of iterables, sequences, and mappings are allowed to use duct-typing (this is a feature of the language, not a bug).

The docs can be improved in a number of places.  For example the docstring on the dict constructor is out of sync with the dict.update() method:

    >>> print(dict.__doc__)
    dict() -> new empty dictionary
    dict(mapping) -> new dictionary initialized from a mapping object's
        (key, value) pairs
    dict(iterable) -> new dictionary initialized as if via:
        d = {}
        for k, v in iterable:
            d[k] = v
    dict(**kwargs) -> new dictionary initialized with the name=value pairs
        in the keyword argument list.  For example:  dict(one=1, two=2)
    >>> print(dict.update.__doc__)
    D.update([E, ]**F) -> None.  Update D from dict/iterable E and F.
    If E is present and has a .keys() method, then does:  for k in E: D[k] = E[k]
    If E is present and lacks a .keys() method, then does:  for k, v in E: D[k] = v
    In either case, this is followed by: for k in F:  D[k] = F[k]

In addition, the glossary entries for iterable, sequence, and mapping need to be improved to distinguish between their somewhat vague use in a general sense versus the specific meaning of isinstance(obj, Mapping).  Unless the docs specify a check for the latter, they almost always do some form of duck-typing or a check for concrete built-in class or subclass.

Terms like "mapping" and "sequence" are often used somewhat generally both inside and outside the Python world.  Sometimes mapping is used in the mathematic sense (pairing each member of the domain with each member of the range), http://www.icoachmath.com/math_dictionary/mapping.html, and sometimes in the sense of a subset of dict capabilities (i.e. has __getitem__ and keys).  

The docs for PyMapping_Check() need to be updated to indicate the known limitations of the check and to disambiguate it from isinstance(obj, Mapping).
msg350171 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2019-08-22 07:23
This doesn't appear to have been an ongoing source of confusion.  Over time, the existence of the glossary entries and the collections ABCs seems to have provided the needed clarity.
History
Date User Action Args
2022-04-11 14:58:19adminsetgithub: 68847
2019-08-22 07:23:05rhettingersetstatus: open -> closed
resolution: out of date
messages: + msg350171

stage: needs patch -> resolved
2015-07-24 05:35:16rhettingersetpriority: normal -> low
stage: needs patch
2015-07-19 00:23:41rhettingersetmessages: + msg246927
2015-07-18 23:52:33rhettingersetassignee: docs@python -> rhettinger

nosy: + rhettinger
2015-07-18 17:29:22christian.barcenassetmessages: + msg246907
2015-07-18 13:06:24martin.pantersetnosy: + martin.panter
messages: + msg246899
2015-07-18 11:45:44r.david.murraysetnosy: + r.david.murray
messages: + msg246893
2015-07-18 09:07:29christian.barcenassetmessages: + msg246892
2015-07-18 08:55:00christian.barcenascreate