msg63258 - (view) |
Author: jason kirtland (jek) |
Date: 2008-03-04 19:49 |
In 2.6a, seems like the __hash__ implementation and __eq__ must be
defined together, in the same class. See also #1549. Ensuring that a
__hash__ implementation isn't being pulled from a builtin type is
probably a sufficient check...?
>>> class Base(object):
... def __init__(self, name):
... self.name = name
... def __eq__(self, other):
... return self.name == other.name
... def __hash__(self):
... return hash(self.name)
...
>>> class Extended(Base):
... def __eq__(self, other):
... print "trace: __eq__ called"
... return Base.__eq__(self, other)
...
>>> hash(Base('b1'))
603887253
>>> hash(Extended('e1'))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'Extended'
|
msg63263 - (view) |
Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * |
Date: 2008-03-04 21:56 |
This issue is similar to issue1549. Note that only new-style classes are
concerned.
I think the change was intentional. Guido, do you confirm?
However there should be some documentation about it, a NEWS entry or an
item in the "porting to 2.6" section.
The initial modification appeared in the py3k branch (r51454):
"""
Change the way __hash__ is inherited; when __eq__ or __cmp__ is
overridden but __hash__ is not, set __hash__ explicitly to None (and
tp_hash to NULL).
"""
|
msg63264 - (view) |
Author: Raymond Hettinger (rhettinger) * |
Date: 2008-03-04 22:15 |
Wouldn't you expect this sort of thing to break code? Does it meet the
criteria for backporting to 2.6?
|
msg63265 - (view) |
Author: Guido van Rossum (gvanrossum) * |
Date: 2008-03-04 22:17 |
The py3k feature was intentional.
I'm not sure who did the backport (could've been me, long ago). I think
the backport should be replaced with a warning that is only issued when
-3 is given.
|
msg63269 - (view) |
Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * |
Date: 2008-03-05 00:00 |
The attached patch reverts r59576 and the part of r59106 about the
tp_hash slot.
It also adds the py3k warning::
type defines __eq__ but not __hash__, and will not be hashable in 3.x
printed when calling hash() on such an object.
|
msg63270 - (view) |
Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * |
Date: 2008-03-05 00:05 |
I noticed that my patch uses Py_TYPE(self)->tp_hash, whereas normal
processing of slot_tp_hash() uses lookup_method(self,"__hash__",hash_str).
I am almost sure that both expressions return the same value.
Is this correct? Py_TYPE(self)->tp_hash seems much faster.
|
msg63271 - (view) |
Author: Guido van Rossum (gvanrossum) * |
Date: 2008-03-05 00:12 |
> I noticed that my patch uses Py_TYPE(self)->tp_hash, whereas normal
> processing of slot_tp_hash() uses lookup_method(self,"__hash__",hash_str).
>
> I am almost sure that both expressions return the same value.
> Is this correct? Py_TYPE(self)->tp_hash seems much faster.
HERE BE DRAGONS
There are cases where Py_TYPE(self)->tp_hash is the function currently
executing (some wrapper(s) in typeobject.c). OTOH, lookup_method(...)
finds the Python code in the class __dict__s along the MRO.
|
msg63272 - (view) |
Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * |
Date: 2008-03-05 00:48 |
Ah, I remember that it took me some time to understand the boundaries
between a slot and the corresponding special method.
Here is another version of the patch, which does not test tp_hash while
we are currently running the tp_hash function...
I also added the warning for old-style classes.
|
msg65297 - (view) |
Author: Guido van Rossum (gvanrossum) * |
Date: 2008-04-10 16:46 |
Hi Amaury, thanks for your work on this. I applied your patch and looked
at the code but there seem to be some issues left still.
(1) I don't think you need to add a warning to classic classes that
define __eq__ without __hash__ -- these aren't hashable and never were.
The problem is purely with new classes AFAICT.
(2) I can't seen to trigger the warning for new classes (and yes, I
specified -3 on the command line :-):
>>> class C(object):
... def __eq__(self, other): return False
...
>>> hash(C())
-134976284
(3) test_Hashable in test_collections.py fails. This is because the
Hashable class believes that list, set, and dict are hashable.
Something odd is going on:
>>> list.__hash__([])
-135100116
but
>>> hash([])
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'list'
Note that in 2.5, both raise TypeError.
|
msg65298 - (view) |
Author: Guido van Rossum (gvanrossum) * |
Date: 2008-04-10 16:47 |
Let's try to have this resolved before the next release.
Also let's try not to accidentally merge the 2.6 changes related to this
issue into 3.0. :-)
|
msg65299 - (view) |
Author: Guido van Rossum (gvanrossum) * |
Date: 2008-04-10 16:51 |
Note: some more tests also fail:
test_descr, test_quopri, test_set
|
msg65329 - (view) |
Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * |
Date: 2008-04-10 23:44 |
Here is a new patch, all tests pass.
- no warning for classic classes
- added tests for -3 warnings
I had to add two TPSLOT entries to the slodefs list; can you please
check that this is correct (I wanted to reset tp_hash to NULL when
__eq__ or __cmp__ are defined).
|
msg66399 - (view) |
Author: Barry A. Warsaw (barry) * |
Date: 2008-05-08 03:43 |
Guido, can you comment on Amaury's latest patch? I'm going to bump this
back to critical so as not to hold up 2.6 alpha, but it should be marked
as a release blocker for the first beta.
|
msg68792 - (view) |
Author: Glyph Lefkowitz (glyph) |
Date: 2008-06-26 16:50 |
As barry said, this should have been a release blocker for the first
beta... ;-)
|
msg68795 - (view) |
Author: Guido van Rossum (gvanrossum) * |
Date: 2008-06-26 17:06 |
I'm going to have to ask someone else to look at this code. I am too
busy with too many things to be able to take on a detailed code review.
|
msg68828 - (view) |
Author: Alyssa Coghlan (ncoghlan) * |
Date: 2008-06-27 13:23 |
Amaury's patch doesn't currently remove the new-in-2.6 code that turns
"tp_hash = NULL" at the C level into "__hash__ = None" at the Python
level. I suspect that will prove to be a problem (and may be the cause
of Django's woes, if I remember Glyph's python-dev posts correctly). I
also don't believe the new TPSLOT entries should be necessary - we
should be able to just revert all of this tp_hash related code to match
the 2.5 implementation, and then see if we can find a way to emit the
Py3k warning when __cmp__ or __eq__ are overridden without overriding
__hash__ without breaking any existing code (if we can't, we may have to
figure out a way to get 2to3 to check for it).
P.S. you'd think I would have learnt by now not to try to make sense of
typeobject.c when I'm tired ;)
|
msg68839 - (view) |
Author: Guido van Rossum (gvanrossum) * |
Date: 2008-06-27 18:05 |
Thanks for giving this some time. I think that backwards compatibility
should be a higher priority than being able to issue -3 warnings -- if
the warnings can't be generated, too bad, we'll just have to document
it. (I don't think that checking things at class scope is an easy task
in 2to3, though I may be underestimating it.)
So, concluding, insofar as the proposal is to revert 2.6 to the 2.5
semantics where it comes to __eq__ and __hash__, I'm okay with that if
there's no other way to maintain backwards compatibility, even though my
preference would be to still flag this when -3 is specified. (There may
be similar backwards compatibility issues caused by stricted arg
checking in __new__ and __init__ -- the same remarks apply there.)
|
msg68866 - (view) |
Author: Alyssa Coghlan (ncoghlan) * |
Date: 2008-06-28 00:56 |
The _new__/__init__ stuff showed up on my diff as well (obviously) -
that seemed to be doing a pretty good job of maintaining backwards
compatibility.
I'll dig into this further today and tomorrow - I'll probably start by
reverting the relevant code back to exactly what is in 2.5, and then see
how far I can get with adding the warning code back in while still
keeping jek's test case working correctly.
|
msg68937 - (view) |
Author: Alyssa Coghlan (ncoghlan) * |
Date: 2008-06-29 08:34 |
Well, I think I figured out why the __hash__ changes were backported
from Py3k: without them, the existence of object.__hash__ makes the
Hashable ABC completely useless (every newstyle class meets it).
I see two options here. Option 1 is to revert the changes so __hash__
behave entirely as it did in 2.5 and also delete collections.Hashable
from 2.6 (which may break other backported Py3k items). Doesn't seem
like a particularly good idea.
Option 2 is what I have implemented locally (and will be uploading
soon): changing typeobject.c to allow Py_None to be stored in type
method slots, and have it convert correctly to None at the Python level.
Classes that don't want to inherit the default object.tp_hash
implementation can then write "__hash__ = None" or "(hashfunc)Py_None"
to block the inheritance.
|
msg68948 - (view) |
Author: Alyssa Coghlan (ncoghlan) * |
Date: 2008-06-29 12:25 |
Attached patch allows classes to explicitly block inheritance of
object.__hash__ by setting the tp_hash slot to Py_None or the __hash__
attribute to None.
This is similar to the approach used in Py3k, but allows __hash__ to be
inherited by default, with classes explicitly disabling it as
appropriate. I'd also suggest forward porting this approach to Py3k as
well, and then possibly look at replacing a copied tp_hash slot with
Py_None when the inherited tp_hash is object.tp_hash, but tp_cmp or
tp_richcompare are different.
|
msg68949 - (view) |
Author: Alyssa Coghlan (ncoghlan) * |
Date: 2008-06-29 13:30 |
Unassigning for the moment - I'll take a look at adding the Py3k warning
back in at some point, but the main thing I would like now is a sanity
check from Guido or Barry (or anyone else happy to dig into the guts of
typeobject.c) that the basic idea behind my fix is sound.
It would be nice to avoid the explicit check for Py_None in
PyObject_Compare by getting update_one_slot() to replace the Py_None it
encounters in __hash__ with the default slot_tp_hash implementation, but
my attempts at doing that have all led to segfaults.
(Also, glyph, I'd love to know if my patch helps to clear up Django's
issues with 2.6b1)
|
msg69096 - (view) |
Author: Alyssa Coghlan (ncoghlan) * |
Date: 2008-07-02 14:42 |
Suggestion from GvR (one I like): instead of re-using Py_None, add a new
C function that is stored in the tp_hash slot as a sentinel instead of
the Py_None value used in the posted version of the patch. This will
avoid breaking code that just checks for NULL before calling the tp_hash
slot directly, while still allowing typeobject.c to detect that the
object isn't actually hashable despite the presence of a non-NULL value
in the tp_hash slot.
(I'll keep the None at the Python level though, since that matches the
Py3k behaviour and plays nicely with collections.Hashable)
|
msg69101 - (view) |
Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * |
Date: 2008-07-02 15:15 |
I don't like the changes in listobject.c and dictobject.c: they seem to
imply that all unhashable types must be modified when upgrading to 2.6.
|
msg69129 - (view) |
Author: Alyssa Coghlan (ncoghlan) * |
Date: 2008-07-02 21:43 |
Unhashable types that implement an "I'm not hashable" __hash__ method
will indeed require modification in 2.6 in order to avoid incorrectly
passing an "isinstance(obj, collections.Hashable)" check (note that
several of the mutable standard library types such as collections.deque
are incorrectly detected as hashable in the current SVN trunk).
|
msg69137 - (view) |
Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * |
Date: 2008-07-02 22:32 |
Isn't that a serious compatibility issue? 2.5 code should work on 2.6
with minimal effort.
For deque objects, I don't get your point:
Python 2.6b1+ (trunk, Jul 1 2008, 22:35:48) [MSC v.1500 32 bit Intel)]
on win32
>>> from collections import deque
>>> hash(deque())
TypeError: deque objects are unhashable
|
msg69190 - (view) |
Author: Alyssa Coghlan (ncoghlan) * |
Date: 2008-07-03 10:50 |
As far as deque goes, the following behaviour on the trunk is the
problem I am trying to fix:
Python 2.6b1+ (trunk:64655, Jul 2 2008, 22:48:24)
[GCC 4.1.3 20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from collections import deque, Hashable
>>> isinstance(deque(), Hashable)
True
All of the container types that my patch touches were already unhashable
in 2.5 - my patch just ensures that they all correctly return false for
isinstance(obj, collections.Hashable) even after we make object.__hash__
inherited by default again. This is also the reason why simply reverting
the trunk hash implementation to the 2.5 behaviour (which is what I
first tried) is inadequate.
Since collections.Hashable is new in 2.6, I can live with it returning
the wrong answer for types which define __hash__ to always raise an
error (that's a known limitation of the ABC system, even in Py3k).
However, we should at least make sure it returns the correct answer for
all of the types in the standard library.
|
msg69324 - (view) |
Author: Alyssa Coghlan (ncoghlan) * |
Date: 2008-07-06 07:52 |
Attached updated patch which implements Guido's suggestion from above
(inheritance of __hash__ is blocked at the C level by setting the slot
to PyObject_HashNotImplemented and at the Python level by setting
__hash__ = None).
All tests which don't depend on a -u resource flag still pass (currently
running a -uall set of tests as well).
|
msg69325 - (view) |
Author: Alyssa Coghlan (ncoghlan) * |
Date: 2008-07-06 08:09 |
Two failures in the -uall run (test_codecmaps_hk, test_linuxaudiodev).
However, I see those test failures even after reverting my changes, so
they aren't related to this.
|
msg69691 - (view) |
Author: Alyssa Coghlan (ncoghlan) * |
Date: 2008-07-15 15:49 |
Fixed for 2.6 in r64962 and the underlying PyObject_HashNotImplemented
mechanic forward ported to 3.0 in r64967.
Still need to update the C API documentation and the library reference,
as well as implement the Py3k warning in 2.6, but I won't get to those
for this beta - dropping priority to deferred blocker.
|
msg70748 - (view) |
Author: Guido van Rossum (gvanrossum) * |
Date: 2008-08-05 16:06 |
How's this going?
|
msg70777 - (view) |
Author: Alyssa Coghlan (ncoghlan) * |
Date: 2008-08-06 09:44 |
The documentation and Py3k warning will be ready for the third beta next
week (the remaining part is a lot easier than the initial fix).
|
msg71021 - (view) |
Author: Alyssa Coghlan (ncoghlan) * |
Date: 2008-08-11 15:48 |
Py3k warnings for this have been checked in on the trunk as r65642.
There's an unfortunate problem with having to define a fixed stack level
for the warnings - for classes with a metaclass implemented in Python,
the warning message will point to the __new__ method of the metaclass
instead of to the first line of the offending class definition. I don't
see a way around this problem, but it definitely caused me grief in
tracking down some of the misbehaving classes in the standard library
that use ABCMeta as their metaclass in 2.6 (I actually cheated and
recompiled with the stack level on the warnings set to 2 instead of 1).
The update also eliminates the spurious and not-so-spurious Py3k
warnings that this update triggered in the test suite under the -3 flag.
Most were just test suite classes that happen to become unhashable in
Py3k, but a few were classes in the standard lib itself that defined
__eq__ or __cmp__ methods that were incompatible with the default
__hash__ implementation (collections.Mapping, collections.Set,
unittest.TestSuite, xml.dom.minidom.NamedNodeMap, numbers.Number,
UserList.UserList). Instances of all of the affected classes are now
explicitly unhashable in 2.x as well as 3.x (note that any code which
relied on any of them being hashable was already broken due to the fact
that these classes didn't provide the correct hash and equality invariants).
The numbers.Number change deserves special mention - the actual warning
occurs further down in the number stack (when numbers.Complex defines a
new __eq__ method), but it seemed appropriate to block the inheritance
of object.__hash__ in Number since an id() based hash isn't appropriate
for any numeric type. This particular aspect of the change should
probably be forward ported to Py3k.
In implementing the warnings, I'm also pretty happy with the current
Py3k approach that overriding __cmp__ or __eq__ means that __hash__
isn't inherited *at all*, rather than restricting the block solely to
object.__hash__. The current behaviour is simple to explain, and more
importantly, I think it is more correct - any time you change the
definition of equality for the class, you really do need to think
carefully about what it means for mutability and hashability.
P.S. I should never have said this part of the change was going to be
easy, because that was just begging old Murphy to come slap me upside
the head...
|
msg71022 - (view) |
Author: Alyssa Coghlan (ncoghlan) * |
Date: 2008-08-11 15:53 |
I still intend to do the necessary documentation updates for this, but
they're going to be delayed a bit since getting the warnings right took
much longer than I expected.
|
msg71223 - (view) |
Author: Benjamin Peterson (benjamin.peterson) * |
Date: 2008-08-16 16:23 |
Let's lower the priority a little then.
|
msg71325 - (view) |
Author: Alyssa Coghlan (ncoghlan) * |
Date: 2008-08-18 13:21 |
numbers.Number change forward ported to Py3k in r65808
Docs updated for 2.6 and 3.0 in r65810 and r65811 respectively.
Which means I can finally close this one :)
|
msg72114 - (view) |
Author: Alyssa Coghlan (ncoghlan) * |
Date: 2008-08-28 21:48 |
Reopening - still need to fix the Python level docs for hash() and
__hash__().
|
msg72207 - (view) |
Author: Alyssa Coghlan (ncoghlan) * |
Date: 2008-08-31 13:22 |
__hash__() documentation fixed for 2.6 in r66085 and for 3.0 in r66086.
|
|
Date |
User |
Action |
Args |
2022-04-11 14:56:31 | admin | set | github: 46488 |
2008-08-31 13:22:05 | ncoghlan | set | status: open -> closed resolution: fixed messages:
+ msg72207 |
2008-08-28 21:48:38 | ncoghlan | set | status: closed -> open resolution: fixed -> (no value) messages:
+ msg72114 |
2008-08-18 13:21:15 | ncoghlan | set | status: open -> closed resolution: fixed messages:
+ msg71325 |
2008-08-16 16:23:58 | benjamin.peterson | set | priority: release blocker -> critical nosy:
+ benjamin.peterson messages:
+ msg71223 |
2008-08-11 15:53:44 | ncoghlan | set | messages:
+ msg71022 |
2008-08-11 15:48:31 | ncoghlan | set | messages:
+ msg71021 |
2008-08-06 09:44:26 | ncoghlan | set | messages:
+ msg70777 |
2008-08-05 16:06:41 | gvanrossum | set | messages:
+ msg70748 |
2008-07-18 03:45:40 | barry | set | priority: deferred blocker -> release blocker |
2008-07-15 15:49:22 | ncoghlan | set | priority: release blocker -> deferred blocker messages:
+ msg69691 |
2008-07-06 08:09:53 | ncoghlan | set | messages:
+ msg69325 |
2008-07-06 07:52:36 | ncoghlan | set | files:
+ issue2235_fix_hash_inheritance_with_function_ptr.diff messages:
+ msg69324 |
2008-07-03 10:50:33 | ncoghlan | set | assignee: ncoghlan messages:
+ msg69190 |
2008-07-02 22:32:05 | amaury.forgeotdarc | set | messages:
+ msg69137 |
2008-07-02 21:43:23 | ncoghlan | set | messages:
+ msg69129 |
2008-07-02 15:15:24 | amaury.forgeotdarc | set | messages:
+ msg69101 |
2008-07-02 14:42:18 | ncoghlan | set | messages:
+ msg69096 |
2008-06-29 13:30:52 | ncoghlan | set | assignee: ncoghlan -> (no value) messages:
+ msg68949 |
2008-06-29 12:25:39 | ncoghlan | set | files:
+ issue2235_fix_hash_inheritance.diff messages:
+ msg68948 |
2008-06-29 08:34:33 | ncoghlan | set | messages:
+ msg68937 |
2008-06-28 01:02:45 | ncoghlan | set | assignee: ncoghlan |
2008-06-28 00:56:16 | ncoghlan | set | messages:
+ msg68866 |
2008-06-27 18:05:09 | gvanrossum | set | messages:
+ msg68839 |
2008-06-27 13:23:19 | ncoghlan | set | nosy:
+ ncoghlan messages:
+ msg68828 |
2008-06-26 17:06:12 | gvanrossum | set | assignee: gvanrossum -> (no value) messages:
+ msg68795 |
2008-06-26 16:58:44 | schmir | set | nosy:
+ schmir |
2008-06-26 16:50:23 | glyph | set | priority: critical -> release blocker nosy:
+ glyph messages:
+ msg68792 |
2008-05-08 03:43:12 | barry | set | priority: release blocker -> critical nosy:
+ barry messages:
+ msg66399 |
2008-04-15 05:56:19 | nnorwitz | set | assignee: amaury.forgeotdarc -> gvanrossum |
2008-04-10 23:44:06 | amaury.forgeotdarc | set | files:
+ inherit_hash3.patch messages:
+ msg65329 |
2008-04-10 16:51:50 | gvanrossum | set | messages:
+ msg65299 |
2008-04-10 16:47:39 | gvanrossum | set | priority: high -> release blocker messages:
+ msg65298 |
2008-04-10 16:46:57 | gvanrossum | set | assignee: amaury.forgeotdarc messages:
+ msg65297 |
2008-03-16 21:00:11 | eikeon | set | nosy:
+ eikeon |
2008-03-05 00:48:31 | amaury.forgeotdarc | set | files:
+ inherit_hash2.patch messages:
+ msg63272 |
2008-03-05 00:12:29 | gvanrossum | set | messages:
+ msg63271 |
2008-03-05 00:05:33 | amaury.forgeotdarc | set | messages:
+ msg63270 |
2008-03-05 00:00:51 | amaury.forgeotdarc | set | files:
+ inherit_hash.patch keywords:
+ patch messages:
+ msg63269 |
2008-03-04 22:20:13 | rhettinger | set | priority: high |
2008-03-04 22:17:35 | gvanrossum | set | messages:
+ msg63265 |
2008-03-04 22:15:27 | rhettinger | set | nosy:
+ rhettinger messages:
+ msg63264 |
2008-03-04 21:56:09 | amaury.forgeotdarc | set | nosy:
+ amaury.forgeotdarc, gvanrossum type: crash -> behavior messages:
+ msg63263 |
2008-03-04 19:49:23 | jek | create | |