classification
Title: heapq change breaking compatibility
Type: Stage:
Components: Library (Lib) Versions: Python 2.6
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: rhettinger Nosy List: exarkun, fijal, giampaolo.rodola, merwok, python-dev, r.david.murray, rhettinger, therve
Priority: low Keywords: patch

Created on 2008-06-06 12:37 by therve, last changed 2011-04-13 18:51 by python-dev. This issue is now closed.

Files
File name Uploaded Description Edit
test_heapq.diff exarkun, 2008-06-11 13:19
Messages (24)
msg67765 - (view) Author: Thomas Herve (therve) * Date: 2008-06-06 12:37
A recent change in heapq implements it in terms of less-than:
http://svn.python.org/view/python/trunk/Modules/_heapqmodule.c?rev=63827&r1=63675&r2=63827

Unfortunately, it breaks usage of heapq when a class only implements
__le__ and not __ge__ or __cmp__. This is done this way in Twisted:
http://twistedmatrix.com/trac/browser/trunk/twisted/internet/base.py#L159.

If not mandatory, it would be nice if this change was reverted or that a
backward compatible change was done instead.
msg67779 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2008-06-06 18:46
It would be better if the Twisted code changed to define all of the 
rich comparisons instead of relying on an accidental and erroneous 
implementation detail.  The heapq change because other people's code 
that used __lt__ was breaking.  They had some basis for the complaint 
because heapq is documented to match sort() which is also based on 
__lt__ (and so is the bisect module).
msg67780 - (view) Author: Thomas Herve (therve) * Date: 2008-06-06 19:08
Okay then. At least the issue is recorded somewhere, if someone has the
same problem.
msg67781 - (view) Author: Jean-Paul Calderone (exarkun) * (Python committer) Date: 2008-06-06 19:28
The heapq documentation isn't very clear about its requirements.  It
does explicitly say that "Heaps are arrays for which heap[k] <=
heap[2*k+1] and heap[k] <= heap[2*k+2] for all k, counting elements from
zero." (this in the module reference for the heapq module, both in the
Python 2.5 version and the in-development version) which might lead one
to believe that <= (__le__) is the important operation.  I don't know
where it is documented that heapq behaves the same as sort().  I think
the documentation needs some improvement to avoid this kind of
confusion.  It's very hard, often impossible, to know what is an
"accidental and erroneous implementation detail" and what is a stable,
public API.

Also, I'm not sure why the code is being changed to accomodate newly
written applications which never could have worked, breaking existing
applications which did work, but I suppose that's just the decision
CPython developers want to make.
msg67783 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2008-06-06 20:17
I'll fix this to accommodate both cases, __lt__ and __le__. After 
trying x<y and finding the comparison isn't defined, it can try (not 
y<=x) instead.

Also, will document that either __cmp__ or all six rich comparisons 
should be defined for code that wants to run through sort, bisect, 
min/max, or heapq.  The rich comparison PEP is clear on this point, but 
I don't think the affirmative statement ever made it to main docs:

"The reflexivity rules *are* assumed by Python.  Thus, the
interpreter may swap y>x with x<y, y>=x with x<=y, and may swap
the arguments of x==y and x!=y."  -- PEP 207
msg67788 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2008-06-06 21:48
Fixed in r63998.
msg67802 - (view) Author: Thomas Herve (therve) * Date: 2008-06-07 11:48
Unfortunately, the modification didn't fix the problem.
msg67952 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2008-06-11 12:08
Thomas, please let me know if r64116 works for you.
msg67964 - (view) Author: Thomas Herve (therve) * Date: 2008-06-11 13:03
Yes, the last commit did the trick. Thanks.
msg67966 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2008-06-11 13:17
Would like to make the 3.0 code use __lt__ only.
Any objections?
msg67967 - (view) Author: Thomas Herve (therve) * Date: 2008-06-11 13:18
Sure, that's fine with me.
msg67968 - (view) Author: Jean-Paul Calderone (exarkun) * (Python committer) Date: 2008-06-11 13:19
I tried this too and then wrote a couple unit tests for this.  The one
for the Python implementation which tests the case where only __le__ is
defined fails, though.

Diff attached.
msg67969 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2008-06-11 13:24
I saw no need to complicate the pure python code for this.

Really, the client code should use __cmp__ or define all six rich 
comparisons.
msg67972 - (view) Author: Jean-Paul Calderone (exarkun) * (Python committer) Date: 2008-06-11 13:50
Thanks for the explanation.  Unfortunately, even if we change our code
to work with the new requirements, all the old code is still out there.
 Maybe this doesn't matter, since there are so many other
incompatibilities between Python 2.5 and Python 2.6.  And there aren't
many cases where the extension module isn't available, anyway.  It will
be surprising and probably hard to debug if anyone runs into this, but I
suppose it's possible that no one will.
msg67973 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2008-06-11 13:54
There should be no cases where the pure python code runs instead of the 
C code.
msg105907 - (view) Author: Maciek Fijalkowski (fijal) Date: 2010-05-17 15:05
Hello.

I would like to complain. It was decided at some point some time ago that both pure-python and C version should run against the same test suite and should not have any differencies. The reasoning behind it is that other python implementations might choose to use pure-python version and we should avoid surprises with random code crashing in obscure ways. Please don't divert deliberately those sources.

Cheers,
fijal
msg105956 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2010-05-18 07:58
All six of the rich comparisons need to be implemented or the result is undefined.  This module never made guarantees for objects defining only one of the six.

We could change the pure python code to handle both __lt__ and __le__ but that would make it much harder to read and understand.  The C version supports and that is what runs by default.
msg105989 - (view) Author: Maciek Fijalkowski (fijal) Date: 2010-05-18 17:16
I cannot honestly make much sense from what you said. My concern is whether python and C version behaves the same or not. It seems that in current version they intentionally behave differently, for simplicity and it's against policy of having the same functionality. I agree that it's an obscure corner case, but still.
msg106003 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2010-05-18 19:45
Am closing this.  It would make no sense to change simple, pure python code to support objects implementing only one of the rich comparison methods.  People implementing rich comparisons need to implement all six if they want to guarantee total ordering and to be usable by various modules that need to be able to make comparisons.

FWIW, the C code is not guaranteed to be exactly the same in terms of implementation details, only the published API should be the same.  And, for this module, a decision was made for the C code to support only lists eventhough the pure python version supports any sequence.
msg106038 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010-05-19 06:06
For what it's worth, I agree with Fijal.  I think the python version and the C version should behave the same, so that other implementations of Python can use the Python version and be compatible wtih CPython.
msg133671 - (view) Author: Éric Araujo (merwok) * (Python committer) Date: 2011-04-13 15:25
The global docs index has one entry for “comparison”, which is http://docs.python.org/dev/reference/expressions#not-in
This other page says that “in general, __lt__() and __eq__() are sufficient, if you want the conventional meanings of the comparison operators”: http://docs.python.org/dev/library/stdtypes.html#comparisons

Other useful bits:
http://docs.python.org/dev/reference/datamodel#object.__lt__
http://docs.python.org/dev/library/functions#sorted
http://docs.python.org/dev/library/functools#functools.cmp_to_key
http://docs.python.org/dev/howto/sorting#odd-and-ends

It may be useful to add more cross-links between those places (especially pointing to the first link).
msg133682 - (view) Author: Roundup Robot (python-dev) Date: 2011-04-13 18:16
New changeset 103a2eb61069 by Raymond Hettinger in branch '2.7':
Issue 3051: make pure python code pass the same tests as the C version.
http://hg.python.org/cpython/rev/103a2eb61069
msg133683 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2011-04-13 18:19
Maciek, I added the compatability code to the Python version as requested.  Now the tests pass for both versions.  There is still work to be done to automatically run both versions against the test suite.
msg133685 - (view) Author: Roundup Robot (python-dev) Date: 2011-04-13 18:51
New changeset 83e4765ec4cb by Raymond Hettinger in branch '3.2':
Issue 3051: make pure python code pass the same tests as the C version.
http://hg.python.org/cpython/rev/83e4765ec4cb
History
Date User Action Args
2011-04-13 18:51:05python-devsetmessages: + msg133685
2011-04-13 18:19:02rhettingersetmessages: + msg133683
2011-04-13 18:16:09python-devsetnosy: + python-dev
messages: + msg133682
2011-04-13 15:25:04merwoksetnosy: + merwok
messages: + msg133671
2010-05-19 06:06:16r.david.murraysetnosy: + r.david.murray
messages: + msg106038
2010-05-18 19:45:34rhettingersetstatus: open -> closed

messages: + msg106003
2010-05-18 17:16:13fijalsetmessages: + msg105989
2010-05-18 07:59:00rhettingersetpriority: normal -> low

messages: + msg105956
2010-05-17 23:12:45giampaolo.rodolasetnosy: + giampaolo.rodola
2010-05-17 15:07:28exarkunsetstatus: closed -> open
2010-05-17 15:05:41fijalsetnosy: + fijal
messages: + msg105907
2008-06-22 22:21:54rhettingersetstatus: open -> closed
2008-06-11 13:54:08rhettingersetmessages: + msg67973
2008-06-11 13:50:15exarkunsetmessages: + msg67972
2008-06-11 13:24:24rhettingersetmessages: + msg67969
2008-06-11 13:19:48exarkunsetfiles: + test_heapq.diff
keywords: + patch
messages: + msg67968
2008-06-11 13:18:00thervesetmessages: + msg67967
2008-06-11 13:17:02rhettingersetpriority: high -> normal
messages: + msg67966
2008-06-11 13:03:25thervesetmessages: + msg67964
2008-06-11 12:08:08rhettingersetmessages: + msg67952
2008-06-09 13:09:20rhettingersetstatus: closed -> open
2008-06-07 11:48:51thervesetmessages: + msg67802
2008-06-06 21:48:04rhettingersetstatus: open -> closed
resolution: fixed
messages: + msg67788
2008-06-06 20:17:49rhettingersetpriority: high
messages: + msg67783
2008-06-06 19:28:45exarkunsetnosy: + exarkun
messages: + msg67781
2008-06-06 19:08:22thervesetmessages: + msg67780
2008-06-06 18:46:08rhettingersetmessages: + msg67779
2008-06-06 14:01:46georg.brandlsetassignee: rhettinger
nosy: + rhettinger
2008-06-06 12:37:36thervecreate