classification
Title: Clarify hash() constancy period
Type: enhancement Stage: needs patch
Components: Documentation Versions: Python 3.3, Python 3.2, Python 2.7
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: docs@python Nosy List: alex, docs@python, jcea, lemburg, loewis, pitrou, rhettinger, terry.reedy
Priority: normal Keywords:

Created on 2012-01-04 00:40 by terry.reedy, last changed 2012-01-04 09:46 by rhettinger. This issue is now closed.

Messages (9)
msg150561 - Author: Terry J. Reedy (terry.reedy) (Python committer) Date: 2012-01-04 00:40
Current 3.2.2 docs:

id(object) Return the “identity” of an object. This is an integer which is guaranteed to be unique and constant for this object during its lifetime. [model]

hash(object) Return the hash value of the object (if it has one). Hash values are integers. They are used to quickly compare dictionary keys 

Suggestion: change "Hash values are integers. They ..." to
"This should be an integer which is constant for this object during its lifetime. Hash values ..."

Rationale: For built-in class instances, hash values are guaranteed to be constant for that long, and only that long, because the default hash(ob) for object() instances is currently id(ob) // 16 (16 being the minimum object size) on my Win7, 64-bit, CPython 3.2.2 build. User class instance hashes (with a custom __hash__) *should* have the same lifetime, but since Python cannot enforce that, I did not say 'guaranteed'.
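
The default-hash behaviour described above can be probed with a short script. This is only a sketch: it assumes CPython, and the exact relationship between id() and the default hash is an implementation detail that may differ between versions and platforms, so that part is printed rather than asserted.

```python
# Illustrative sketch; the id-to-hash relationship is a CPython detail.
ob = object()

first_hash = hash(ob)
first_id = id(ob)

# Both values stay fixed for as long as `ob` is alive.
still_constant = all(
    hash(ob) == first_hash and id(ob) == first_id for _ in range(1000)
)
print("constant during lifetime:", still_constant)

# The message above reports this relationship for one 64-bit CPython 3.2.2
# build; other builds may differ, so the result is only printed here.
print("hash(ob) == id(ob) // 16:", first_hash == first_id // 16)
```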

User code should *not* depend on a longer lifetime, just as for id() output. It seems worth implying that, as for id(), because (based on recent pydev discussion) people seem to be prone to over-generalize the current longer-term stability of number and string hashes, which itself may disappear in future releases. (see #13703)
msg150564 - Author: Martin v. Löwis (loewis) (Python committer) Date: 2012-01-04 01:17
-1. The hash has nothing to do with the lifetime, but with the value of an object.
msg150572 - Author: Terry J. Reedy (terry.reedy) (Python committer) Date: 2012-01-04 02:38
Martin, I do not understand. The default hash is based on id (as is default equality comparison), not value. Are you OK with hash values changing if the 'value' changes? My understanding is that changing hash values for objects in sets and dicts is bad, which is why mutable builtins with value-based equality do not have hash values.
msg150573 - Author: Antoine Pitrou (pitrou) (Python committer) Date: 2012-01-04 02:40
You can define a __hash__ that changes if the object changes. It is not recommended, but it's possible. So I agree with Martin that your proposed clarification is wrong.
(I also think that it wouldn't bring anything, either)

Suggest closing as invalid/rejected.
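
A minimal sketch of why a __hash__ that follows mutable state is possible but not recommended. The class name Box and its attribute are hypothetical, chosen only for illustration, and the lookup results shown rely on how CPython dicts place small integer hashes.

```python
class Box:
    """Hypothetical class whose hash follows its mutable value (not recommended)."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return isinstance(other, Box) and self.value == other.value

    def __hash__(self):
        return hash(self.value)


key = Box(1)
table = {key: "payload"}
found_before = key in table

key.value = 2  # mutate the key while it is stored in the dict

found_same_object = key in table             # probes the slot for the new hash
found_old_value = Box(1) in table            # old slot, but keys no longer equal
still_stored = any(k is key for k in table)  # the entry itself is still there

print(found_before, found_same_object, found_old_value, still_stored)
```

The entry is never removed, yet neither the mutated object nor an object carrying the old value can find it by lookup.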
msg150585 - Author: Terry J. Reedy (terry.reedy) (Python committer) Date: 2012-01-04 04:07
Given that the doc says that use of hash() is to compare dict keys, it does not seem wrong to me to suggest that hash() should be usable to do so.

I believe id() and consequently hash() are unique among builtins in being run-dependent. That is currently documented for id() but not for hash(). Given that people seriously asked whether we can randomize hash() with each run, because 'people' 'expect' it to remain rather constant, it does not seem useless to clarify that it can change with each run. I am sure my wording could be improved. An alternative would be "Hash values for built-in objects are constant for each run but not necessarily thereafter."
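
A sketch of run-dependence, using behaviour that postdates this message: later CPython releases added per-process hash randomization for strings, controlled by the PYTHONHASHSEED environment variable (available from 3.2.3, enabled by default from 3.3). The helper name hash_in_child is hypothetical, and the script assumes Python 3.7+ for subprocess.run(capture_output=...).

```python
import os
import subprocess
import sys


def hash_in_child(expression, seed):
    """Evaluate hash(<expression>) in a fresh interpreter with a fixed hash seed."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", f"print(hash({expression}))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout)


# Same seed: the string hash is reproducible across runs.
str_hash_seed1 = hash_in_child("'spam'", 1)
str_hash_seed1_again = hash_in_child("'spam'", 1)

# Different seed: the string hash changes between runs.
str_hash_seed2 = hash_in_child("'spam'", 2)

# Small integers hash to themselves and are not affected by the seed.
int_hash_seed1 = hash_in_child("12345", 1)
int_hash_seed2 = hash_in_child("12345", 2)

print(str_hash_seed1 == str_hash_seed1_again, str_hash_seed1 != str_hash_seed2)
print(int_hash_seed1, int_hash_seed2)
```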

If you take into account what people can do with special methods, some of the other entries seem more wrong than my suggestion. For instance:
"len(s) Return the length (the number of items) of an object." and
"str(obj ... When only object is given, this returns its nicely printable representation." These are true only for built-in objects, but the policy is to leave out the qualification.
msg150586 - Author: Raymond Hettinger (rhettinger) (Python committer) Date: 2012-01-04 04:38
-1 I concur with Martin.
msg150595 - Author: Martin v. Löwis (loewis) (Python committer) Date: 2012-01-04 08:17
> Martin, I do not understand. The default hash is based on id (as is
> default equality comparison), not value.

In the default implementation, the id *is* the object's value (i.e.
objects, by default, only compare equal if they are identical). So
the default implementation is just a special case of the more general
rule that hashes need to be consistent with equality.

> Are you OK with hash values changing if the 'value' changes?

An object that can change its value (i.e. a mutable object) should
fail to hash.
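
A short sketch of the rule above as it plays out in Python 3. The helper is_hashable and the classes Plain and ValueEq are hypothetical names used only for illustration. Note that in Python 2.7 a class defining __eq__ alone keeps its inherited __hash__, so the ValueEq result is specific to Python 3.

```python
def is_hashable(obj):
    """Return True if hash(obj) succeeds, False if it raises TypeError."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


class Plain:
    """Default behaviour: equality and hash are both identity-based."""


class ValueEq:
    """Value-based equality with no __hash__: Python 3 sets __hash__ to None."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return isinstance(other, ValueEq) and self.value == other.value


results = {
    "tuple": is_hashable((1, 2)),
    "frozenset": is_hashable(frozenset({1, 2})),
    "list": is_hashable([1, 2]),
    "dict": is_hashable({}),
    "set": is_hashable({1, 2}),
    "Plain": is_hashable(Plain()),
    "ValueEq": is_hashable(ValueEq(1)),
}
for name, hashable in results.items():
    print(name, hashable)
```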
msg150596 - Author: Marc-Andre Lemburg (lemburg) (Python committer) Date: 2012-01-04 09:13
Terry J. Reedy wrote:
> 
> Terry J. Reedy <tjreedy@udel.edu> added the comment:
> 
> Martin, I do not understand. The default hash is based on id (as is default equality comparison), not value. Are you OK with hash values changing if the 'value' changes? My understanding is that changing hash values for objects in sets and dicts is bad, which is why mutable builtins with value-based equality do not have hash values.

Hash values are based on the object values, not their id(). See the
various type implementations as reference. The id() is only used
as hash for objects which don't have a "value" (and thus cannot be
compared).

Given that we have the invariant "a==b => hash(a)==hash(b)" in Python,
it immediately follows that hash values for objects with comparison
method cannot have a lifetime - at least not within the same process
and, depending how you look at it, also not in multi-process
applications.
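
The invariant can be illustrated with numeric types, where objects of different types compare equal and therefore must share a hash. This is a small sketch using only standard-library types; the variable names are arbitrary.

```python
from decimal import Decimal
from fractions import Fraction

# Distinct objects of distinct types that all compare equal to 1.
values = [1, 1.0, True, Fraction(1, 1), Decimal("1.0"), complex(1, 0)]

all_equal_to_one = all(v == 1 for v in values)
distinct_hashes = {hash(v) for v in values}

# Because they are equal and hash alike, they collapse into one dict key.
as_keys = {}
for v in values:
    as_keys[v] = type(v).__name__

print(all_equal_to_one, distinct_hashes, as_keys)
```

The hash here cannot be tied to any single object's lifetime, since a newly created equal object has to produce the same value.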
msg150599 - Author: Raymond Hettinger (rhettinger) (Python committer) Date: 2012-01-04 09:46
[Antoine]
> Suggest closing as invalid/rejected.

[Martin]
> -1. The hash has nothing to do with the lifetime, 
> but with the value of an object.
History
Date                 User         Action  Args
2012-01-04 09:46:43  rhettinger   set     status: open -> closed; resolution: not a bug; messages: + msg150599
2012-01-04 09:13:55  lemburg      set     nosy: + lemburg; messages: + msg150596
2012-01-04 08:17:41  loewis       set     messages: + msg150595
2012-01-04 05:07:20  jcea         set     nosy: + jcea
2012-01-04 04:38:49  rhettinger   set     nosy: + rhettinger; messages: + msg150586
2012-01-04 04:07:21  terry.reedy  set     messages: + msg150585
2012-01-04 02:40:37  pitrou       set     nosy: + pitrou; messages: + msg150573
2012-01-04 02:38:49  terry.reedy  set     messages: + msg150572; title: Clarify hash() lifetime -> Clarify hash() constancy period
2012-01-04 01:17:13  loewis       set     nosy: + loewis; messages: + msg150564
2012-01-04 00:56:28  alex         set     nosy: + alex
2012-01-04 00:40:52  terry.reedy  create