classification
Title: decimal.py: hash of -1
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.2
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: rhettinger Nosy List: amaury.forgeotdarc, mark.dickinson, rhettinger, skrah, terry.reedy
Priority: normal Keywords: needs review, patch

Created on 2010-11-08 12:34 by skrah, last changed 2010-11-21 22:46 by rhettinger. This issue is now closed.

Files
File name Uploaded Description Edit
decimal_hash.patch skrah, 2010-11-08 12:34 review
Messages (19)
msg120738 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2010-11-08 12:34
When the __hash__ method is called directly, the hash of -1
is -1:

>>> from decimal import *
>>> Decimal(-1).__hash__()
-1
>>> hash(Decimal(-1))
-2


I've a patch, which also sneaks in a ValueError.
msg120759 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-11-08 15:42
Are there situations where this is a problem?

I don't think that there's any requirement that the __hash__ method for a user-defined Python type not return -1.  (It's also allowed to return values outside the range of hash values;  these get automatically rehashed to values within the range.)
msg120762 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2010-11-08 16:03
It's not about the hash value, but about consistency: help(Decimal.__hash__) says "x.__hash__() <==> hash(x)", but this is not true for x=Decimal(-1).
msg120769 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2010-11-08 16:39
Good catch Stefan.  This is a regression from Py2.6.  The behavior for decimal should match that for int.

IDLE 2.6.2      
>>> hash(-1)
-2
>>> (-1).__hash__()
-2
>>> from decimal import *
>>> hash(Decimal(-1))
-2
>>> Decimal(-1).__hash__()
-2
msg121122 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-11-13 09:01
Okay; go ahead and apply (preferably in two separate commits, since you're fixing two only marginally related issues here).
msg121127 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-11-13 10:30
The Fraction type has the same behaviour, so I've fixed it to match the proposed new Decimal behaviour in r86448.
msg121350 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2010-11-17 11:43
Fixed the hash in r86492, excluding the TypeError fix.


Should I fix the TypeError in both 2.7 and 3.2?
msg121375 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2010-11-17 18:38
I presume you mean this:
         if self._is_special:
             if self.is_snan():
-                raise TypeError('Cannot hash a signaling NaN value.')
+                raise ValueError('Cannot hash a signaling NaN value.')

My understanding is that while exception messages are not part of the API, the class is, even if not documented. If the decimal spec or doc says the above should be ValueError, then TypeError might be fixable in a bug release. Otherwise, I think I would change it only in 3.2 with a version-changed note in the doc. But Raymond is the decider on this.
msg121457 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-11-18 14:27
I wouldn't risk changing the exception type in 2.7.  It's fine for 3.2.
msg121461 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2010-11-18 15:24
Thanks for all the comments! I agree that a change in 2.7 might cause
trouble. 

Committed the ValueError in r86517 (py3k).
msg121485 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2010-11-18 18:48
Should there be a 'versionchanged' note in the doc, even if the error type was not documented?
msg121493 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2010-11-18 20:20
The choice between ValueError and TypeError can sometimes be ambiguous and seem arbitrary and I understand why you're gravitating towards ValueError (because it works some values and not others), but in this case the API is already fixed by what hash() does elsewhere.

It is no fair to users to have to wrap hash(x) calls with a try/except that catches both exceptions.  So, we should still to a consistent hash API:

  >>> hash([])
  Traceback (most recent call last):
    File "<pyshell#1>", line 1, in <module>
      hash([])
  TypeError: unhashable type: 'list'

In this case, practicality beats purity and released beats unreleased.
msg121497 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2010-11-18 23:02
Raymond Hettinger <report@bugs.python.org> wrote:
> The choice between ValueError and TypeError can sometimes be ambiguous and seem arbitrary and I understand why you're gravitating towards ValueError (because it works some values and not others), but in this case the API is already fixed by what hash() does elsewhere.
> 
> It is no fair to users to have to wrap hash(x) calls with a try/except that catches both exceptions.  So, we should still to a consistent hash API:
> 
>   >>> hash([])
>   Traceback (most recent call last):
>     File "<pyshell#1>", line 1, in <module>
>       hash([])
>   TypeError: unhashable type: 'list'
> 
> In this case, practicality beats purity and released beats unreleased.

Ok, this makes sense. I can revert the commit unless you prefer to handle
it yourself.
msg121504 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-11-19 09:33
Hmm.  Does anyone remember the reason for making sNaNs unhashable in the first place.  I recall there was a discussion about this, but can't remember which issue.
msg121505 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-11-19 09:41
Ah, now I remember: making sNaNs hashable has the potential to introduce seemingly random exceptions with set and dict operations.  The logic went something like:

  (1) if sNaNs are hashable, you can put them in dicts,
  (2) operations on dicts make equality comparisons at (from the
      user's POV) unpredictable times (i.e., when hashes of two
      unequal objects happen to be equal), and
  (3) equality comparisons involving sNaNs raise an exception.

I'm wondering whether we should revisit the decision to have sNaN equalities raise an exception, and just have sNaN equality comparisons behave identically to those for (Decimal or float) NaNs in 3.2.

At any rate, if the code is left as is, the above logic should be added to the __hash__ function as a comment.
msg121509 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2010-11-19 10:46
If I'm not mistaken, signaling NaNs are only created when the user
explicitly initializes a variable. I see this as direct request to
raise an exception whenever the variable is accessed in a way that
changes the outcome of the program:

This is the example I gave:

http://mail.python.org/pipermail/python-dev/2009-November/093952.html


Now, ideally one would still be allowed to store signaling NaNs in
a dictionary and have them raise at the _exact_ location where they
are used in a mathematical operation or influence control flow.

But since that's not possible, I prefer things as they are.


+1 for adding a comment to the hash function.
msg121510 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-11-19 11:12
[Stefan]
> ... a direct request to raise an exception...

Understood;  the issue is that this conflicts with the general expectation that equality (and inequality) comparisons always work (at least, for objects that are perceived as immutable).  I think there needs to be a very good reason to have an equality comparison raise an exception, and I don't find this particular reason good enough.  The expected IEEE 754 semantics are still available through the published API:  e.g., using Decimal.compare instead of '=='.

So I'd lean towards having '==' follow Python rules rather than IEEE 754 rules in this case, with Decimal.compare available for the times when the IEEE 754 rules are important.
msg121511 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-11-19 11:13
Grr.  Horrible formatting on that last comment.
Sorry about that.

Anyway, I'd be interested to hear other people's opinions.
msg122008 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2010-11-21 22:46
> Committed the ValueError in r86517 (py3k)

As discussed on IRC, I've reverted this change.
History
Date User Action Args
2010-11-21 22:46:32rhettingersetstatus: open -> closed
resolution: fixed -> not a bug
messages: + msg122008
2010-11-19 11:13:07mark.dickinsonsetmessages: + msg121511
2010-11-19 11:12:17mark.dickinsonsetmessages: + msg121510
2010-11-19 10:46:56skrahsetmessages: + msg121509
2010-11-19 09:41:21mark.dickinsonsetmessages: + msg121505
2010-11-19 09:33:06mark.dickinsonsetmessages: + msg121504
2010-11-18 23:02:31skrahsetmessages: + msg121497
2010-11-18 20:28:07rhettingersetassignee: skrah -> rhettinger
2010-11-18 20:20:14rhettingersetstatus: closed -> open

messages: + msg121493
2010-11-18 18:48:20terry.reedysetmessages: + msg121485
2010-11-18 15:24:11skrahsetstatus: open -> closed
resolution: fixed
messages: + msg121461

stage: patch review -> resolved
2010-11-18 14:27:41mark.dickinsonsetmessages: + msg121457
2010-11-17 18:38:25terry.reedysetmessages: + msg121375
2010-11-17 11:43:55skrahsetmessages: + msg121350
2010-11-13 10:30:56mark.dickinsonsetmessages: + msg121127
2010-11-13 09:01:01mark.dickinsonsetassignee: skrah
messages: + msg121122
2010-11-12 18:34:53terry.reedysetnosy: + terry.reedy
2010-11-08 16:39:27rhettingersetnosy: + rhettinger
messages: + msg120769
2010-11-08 16:03:31amaury.forgeotdarcsetnosy: + amaury.forgeotdarc
messages: + msg120762
2010-11-08 15:42:20mark.dickinsonsetmessages: + msg120759
2010-11-08 12:34:02skrahcreate