classification
Title: Documentation for comparing dictionaries is out of date
Type: behavior Stage: resolved
Components: Documentation Versions: Python 3.1, Python 3.2
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: docs@python Nosy List: belopolsky, docs@python, eli.bendersky, eric.araujo, orsenthil, stutzbach, terry.reedy
Priority: normal Keywords: easy, patch

Created on 2010-06-30 19:59 by stutzbach, last changed 2010-07-14 20:34 by orsenthil. This issue is now closed.

Files
File name Uploaded Description Edit
issue9132.2.patch eli.bendersky, 2010-07-12 18:06
Messages (10)
msg109009 - (view) Author: Daniel Stutzbach (stutzbach) (Python committer) Date: 2010-06-30 19:59
reference/expressions.html#notin reads:

"Mappings (dictionaries) compare equal if and only if their sorted (key, value) lists compare equal. [4] Outcomes other than equality are resolved consistently, but are not otherwise defined. [5]"

However, the last sentence is no longer true.  I suggest changing the last sentence to:

"Comparisons other than equality testing raise a TypeError."

I think the footnote [5] is sufficiently historical that it could be removed as well.  Alternately, it could be updated to read, "In older versions of Python, outcomes other than equality were resolved consistently but were not otherwise defined."
msg109029 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2010-07-01 02:34
Reference to "sorted (key, value) lists" is a bit misleading as well.  Dicts' equality is defined even if key or values are not orderable.
msg109206 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2010-07-04 04:20
The RefMan section is 5.9. Comparisons. The 3.x docs are by design pretty much free of 2.x references. Which is to say, they are a fresh start with 3.0 as the base. So I would also remove footnote 5. Footnote 4 is currently needed because the text suggests something that is not true. Instead fix the text and remove 4.


 I verified both behavior claims with 3.1:
>>> d1 = {1+1j: 2, 1+2j: 3}
>>> d2 = {1+1j: 2, 1+2j: 3}
>>> d1 == d2
True
>>> d1 < d2
Traceback (most recent call last):
  File "<pyshell#14>", line 1, in <module>
    d1 < d2
TypeError: unorderable types: dict() < dict()

PATCH suggestion: Replace the current entry with

"Mappings (dictionaries) compare equal if and only if they have the same key, value) pairs. Order comparision raise TypeError."

Remove footnotes 4 and 5.

"Comparisons other than equality testing raise a TypeError."
is not quite correct because 'comparisons' include 'is' and 'in' which do work as expected.
msg109208 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2010-07-04 05:07
Thanks for tackling this Terry. Did you include a patch, i.e. a diff
file? If not, the “patch” keyword does not apply, IIUC. Plain English
suggestions are helpful but they’re reviewed in a different way than a diff.

“The 3.x docs are by design pretty much free of 2.x references. Which is
to say, they are a fresh start with 3.0 as the base.“
True. That said, I would leave footnote 4, since it provides a useful
information for people caring about performance, and may interest them
in digging into the details of the implementation. It may need an
explicit mention “in CPython”, though.

“So I would also remove footnote 5.”
I don’t know. It’s an historical note about an implementation detail; it
does no harm (apart maybe taking space for no gain) and may be
interesting to some people. Does a core dev have an opinion on that
(Georg?).

“Footnote 4 is currently needed because the text suggests something that
is not true.”
I can’t say if the text is inaccurate or just difficultly readable.

“"Mappings (dictionaries) compare equal if and only if they have the
same key, value) pairs. Order comparision raise TypeError."”
I’ll even say “the same (key, value) pairs, irregardless of their
order”. Is the term “order comparisons” used in the doc? If not, its
meaning is non-obvious, and I’d like to find something clearer.

“"Comparisons other than equality testing raise a TypeError." is not
quite correct because 'comparisons' include 'is' and 'in' which do work
as expected.”
I thought “is” was clearly identity and “in” membership or containement
testing in the doc. Can you support your claim?

Cheers
msg109257 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2010-07-04 19:07
On the 'patch' keyword: I thought the same as you (only for diff) until yesterday morning when Brett Cannon specifically told me that I should also set it for in-message text patches. This was in the context of him having not noticed for months (until I pinged him) that I had included a text patch in a message for an issue assigned to him, because he skims messages and looks for the keyword and other header changes. So until a 'higher' tracker authority tell me differently, I am following his direction. I am also highlighting the actual patch in the message to make it easy to find.


The current "compare equal if and only if their sorted (key, value) lists compare equal. [4]" was written when all objects were comparable and when dict == dict *was* written that way (according to [5]). The footnote was added to assure people that the new internal implementation was faster (ie, set comparison via hash lookup) than it once had been.

For a long time, that definition of dict equality has been invalid because it cannot be implemented that way; the definition is even more meaningless in Python3. So is footnote [4] saying that the meaningless definition is not the implementation. Both [4] and [5] are ancient fossils referring back to Python1. 

Without looking, I am confident that the actual implementation is O(1) if the dicts are unequal in size and an O(n) (key,value) set comparison if they are equal in size. My proposed revision is intended to imply that. Note that Python did not have sets when the sorted list definition was written. Also note that there is no footnote attached to set comparison (which once could have also been defined in terms of sorted lists).

Dicts are sets (of pairs), not unsorted lists (of pairs) and the current docs should reflect that.

The language def makes no guarantees about implementation efficiency.  In 13 years of watching c.l.p/python-list, I have not seen any particular concern about dict comparison that would warrant special notice. Rather, people worry about the big-O behavior of building large lists and dicts of millions of items. The sometimes need persuading that construction is more efficient than they might imagine. That is about the only thing I can think of for which an efficiency note might be a good idea.


'Comparisons other than equality testing' include 'is' and 'in'. They work for dictionaries as expected, which is to say, as identify and containment testing, as for everything else. They do not raise TypeError. So the statement that implies that they do is not correct.


PATCH REVISION: The suggested "Order comparision raise TypeError." has spelling and agreement errors in 'comparision' and should be "Order comparisons raise TypeError."
msg109260 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2010-07-04 20:07
Thanks for the explanation of the patch keyword. Its description (click on “Keywords”) is indeed generic (“contains patch”), http://www.python.org/dev/patches/ should probably say it too.

I don’t have enough knowledge to make a useful comment on the rest of your reply. Other people most certainly will.

I think two of my remarks are not addressed by your reply:
- Is the term “order comparisons” used in the doc? If not, its
meaning is non-obvious, and I’d like to find something clearer.
- Testing with “is” is called identity testing, “in” is membership or containement testing. Where are they called “comparisons”?
msg109275 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2010-07-04 23:55
I intentionally began my first post with "The RefMan section is 5.9. Comparisons." Look there, particularly the definition of comp_operator.

In this context, I think 'order comparison' is pretty clearly a comparison with an order operator, '<' and '>'. But fine with me if the editor want to list them explicitly.

POSSIBLE PATCH ADDITION: "('<', '<=', '>', '>=')"
msg110117 - (view) Author: Eli Bendersky (eli.bendersky) * (Python committer) Date: 2010-07-12 18:06
I agree with Terry's proposal. Here's a patch file for Doc/reference/expressions.rst that implements the change.
msg110127 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2010-07-12 19:51
Looks good to me. Thanks Eli.
msg110326 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2010-07-14 20:34
The patch was pretty good. Committed it in r82899 and r82900.
Thanks Eli for the patch & Terry for the review.
History
Date User Action Args
2010-07-14 20:34:50orsenthilsetstatus: open -> closed

nosy: + orsenthil
messages: + msg110326

resolution: fixed
stage: commit review -> resolved
2010-07-12 19:51:58terry.reedysetmessages: + msg110127
stage: needs patch -> commit review
2010-07-12 18:06:41eli.benderskysetfiles: + issue9132.2.patch

messages: + msg110117
2010-07-10 05:03:08eli.benderskysetnosy: + eli.bendersky
2010-07-04 23:55:35terry.reedysetmessages: + msg109275
2010-07-04 20:07:12eric.araujosetmessages: + msg109260
2010-07-04 19:07:20terry.reedysetmessages: + msg109257
2010-07-04 05:07:08eric.araujosetnosy: + eric.araujo
messages: + msg109208
2010-07-04 04:20:07terry.reedysetnosy: + terry.reedy
messages: + msg109206

keywords: + patch
type: behavior
2010-07-01 02:34:11belopolskysetnosy: + belopolsky
messages: + msg109029
2010-06-30 19:59:27stutzbachsetkeywords: + easy
2010-06-30 19:59:10stutzbachcreate