classification
Title: decimal.py: == and != comparisons involving NaNs
Type: behavior Stage: resolved
Components: Versions: Python 3.2
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: mark.dickinson Nosy List: mark.dickinson, seberg, skrah
Priority: high Keywords: patch

Created on 2009-11-07 10:43 by skrah, last changed 2013-02-13 16:29 by seberg. This issue is now closed.

Files
File name Uploaded Description Edit
issue7279.patch mark.dickinson, 2010-03-23 20:44
Messages (12)
msg95017 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2009-11-07 10:43
I'm not sure this is a bug, but I am trying to understand the rationale
for mimicking IEEE 754 for == and != comparisons involving NaNs. The
comment in decimal.py says:

"Note: The Decimal standard doesn't cover rich comparisons for Decimals.
 In particular, the specification is silent on the subject of what
should happen for a comparison involving a NaN."


First, I think rich comparisons are covered with compare_total(), but
indeed that isn't very useful for == and !=. (It might be useful for
sorting a list of decimals.)

The standard compare() function returns NaN for comparisons involving
NaNs. In addition to that it signals for sNaNs. I'm interpreting this as
"the comparison is undefined". So, in terms of decimal return values,
the standard does define NaN comparisons.

The question remains how to translate "undefined" to a Python truth
value. I'd think that the natural thing is to raise an InvalidOperation
exception in the same way it is done for <, <=, >, >=.


This ...

Decimal("NaN") == 9  ==> InvalidOperation
Decimal("sNaN") == 9  ==> InvalidOperation

... is the behavior of compare_signal(). In my opinion this would follow
the principle of least surprise for the user.
msg95039 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-11-08 09:28
There's a second issue to consider here, which is that Python uses 
equality as specified by the == operator as the basic equivalence relation 
for set and dict membership tests.  So as a general rule, an equality test 
between two objects of the same type shouldn't be raising exceptions.  If 
== raised for comparisons with nans then it would make it awkward to put 
nans into a set.

Hmm.  But now I notice that you can't put Decimal nans into sets anyway:  
you get a 'TypeError: Cannot hash a NaN value'.  I'm not sure of the 
rationale for this.

One might also question whether Decimal("NaN") < 9 should really be 
raising InvalidOperation, or whether (as an operation that doesn't return 
a Decimal instance and is in some sense outside the scope of the standard-
--similar to int(Decimal('nan')) and hash(Decimal('nan'))) it should be 
raising some general Python exception instead.

I'm closing this as invalid: the behaviour isn't a bug, at least in the 
sense that the code is working as designed.  I think there may well be a 
useful discussion here, but the bugtracker isn't the right place to have 
it:  could we move it to python-dev instead?
msg101583 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-03-23 14:57
Re-opening to address a couple of points that came out of the python-dev discussion:

(1) As Stefan pointed out on python-dev, equality and inequality comparisons involving signaling nans should signal (order comparisons already do).  IEEE 754 is fairly clear on this.  From section 6.2:

"""Signaling NaNs shall be reserved operands that, under default exception handling, signal the invalid operation exception (see 7.2) for every general-computational and signaling-computational operation except for the conversions described in 5.12."""

(Comparisons fall under 'signaling-computational operations, in section 5.6 of the standard.)

I propose to fix this for 2.7 and 3.2.

(2) Currently hash(Decimal("nan")) raises a TypeError.  I can see no good reason for this at all;  it's possible to hash float nans and to put them in sets and dictionaries.  I propose to remove this restriction for 2.7 and 3.2.  I think hash(Decimal("snan")) should also succeed:  *computational* operations on signaling nans should signal, but I don't think that putting a signaling nan into a dict, or checking for its presence in a list, counts as a computational operation for this purpose.
msg101595 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-03-23 19:33
> I think hash(Decimal("snan")) should also succeed

On second thoughts, this would be bad, since it would lead to unpredictable results for sets of dicts containing a signaling nan:

>>> from decimal import Decimal
[69536 refs]
>>> s = Decimal('snan'); h = hash(s)
[69551 refs]
>>> {s, h+1}    # can put most integers into a set with an sNaN
{Decimal('sNaN'), 373955814}
[69561 refs]
>>> {s, h}      # but not if that integer hashes equal to the sNaN...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/dickinsm/python/svn/py3k/Lib/decimal.py", line 864, in __eq__
    ans = self._check_nans(other, context)
  File "/Users/dickinsm/python/svn/py3k/Lib/decimal.py", line 746, in _check_nans
    self)
  File "/Users/dickinsm/python/svn/py3k/Lib/decimal.py", line 3842, in _raise_error
    raise error(explanation)
decimal.InvalidOperation: sNaN
[69698 refs]

So if __eq__ with an sNaN raises an exception, there's little choice but to prohibit putting sNaNs into sets and dicts, and the obvious way to do this is to make __hash__ raise too.
msg101598 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-03-23 20:44
Here's a patch (against py3k) to make all comparisons involving signaling nans raise InvalidOperation.

Stefan, does this look okay to you?
msg101858 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2010-03-28 11:27
Mark, this looks great, thanks.
msg102152 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-04-02 10:18
Thanks, Stefan.  Applied to trunk in r79588.  Still needs to be forward ported to py3k.
msg102153 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-04-02 10:36
Allowed hashing of Decimal('nan') in r79589;  Decimal('snan') continues to raise TypeError.

I've also rewritten Decimal.__hash__ a little bit, so that it won't care if float('inf') raises an exception.  This will all be much neater if the unified numeric hash is applied.
msg102240 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-04-03 11:43
r79588 and r79589 were merged to py3k in r79668.
msg182040 - (view) Author: Sebastian Berg (seberg) * Date: 2013-02-13 15:29
This is closed, and maybe I am missing something. But from a general point of view, why does hashing of NaN not raise an error as it did for decimals, i.e. why was this not resolved exactly the other way around? I am mostly just wondering about this it is not a problem for me.

Hashing NaNs seems dangerous and surprising because it might work in dicts/sets, but normally doesn't. (The only thing for it to work right would be if NaN was a singleton, but that is impossible for subclasses, etc.).

The thing is:

In [16]: s = {float('nan'): 1, float('nan'): 2, float('nan'): 3}
In [17]: s
Out[17]: {nan: 1, nan: 2, nan: 3}

In [18]: s[float('nan')]
KeyError: nan

In [19]: n = float('nan')
In [20]: s = {n: 1, n: 2, n: 3}
In [21]: s
Out[21]: {nan: 3}

This is because `n is n`, and PyObject_RichCompareBool assumes that if `a is b` then `a == b` which is simply wrong for NaNs and also makes comparisons  of iterables including NaNs an impossible business. NaNs have their unavoidable weirdness, but at least for dictionaries/sets it would seem more clear to me if they raised an error.
msg182044 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2013-02-13 15:53
Sebastian:  I think this discussion is a bit off-topic for this particular bug;  you might want to raise it on python-dev or python-ideas instead.  

Bear in mind, though, that the behaviour of NaNs and containers has been discussed to death many times in the past;  I'd suggest not bringing the issue up again unless there's something genuinely new to bring to the discussion.  The current behaviour is certainly a compromise, but it seems to be the best compromise available.

Note that with respect to this particular issue: it's only *signalling* nans that raise on hashing for the Decimal type.  Quiet nans are hashable as usual.  Since for float, all nans can be regarded as quiet, Decimal and float behave the same way on this.
msg182045 - (view) Author: Sebastian Berg (seberg) * Date: 2013-02-13 16:29
Thanks, yes, you are right, should have googled a bit more anyway. Though I did not find much on the hashable vs unhashable itself, so if I ever stumble across it again, I will write a mail...
History
Date User Action Args
2013-02-13 16:29:58sebergsetmessages: + msg182045
2013-02-13 15:53:50mark.dickinsonsetmessages: + msg182044
2013-02-13 15:29:43sebergsetnosy: + seberg
messages: + msg182040
2010-04-03 11:43:23mark.dickinsonsetstatus: open -> closed
resolution: not a bug -> fixed
messages: + msg102240

stage: resolved
2010-04-02 10:36:58mark.dickinsonsetmessages: + msg102153
2010-04-02 10:18:12mark.dickinsonsetmessages: + msg102152
versions: + Python 3.2
2010-03-28 11:27:37skrahsetmessages: + msg101858
2010-03-23 20:44:51mark.dickinsonsetfiles: + issue7279.patch
keywords: + patch
messages: + msg101598
2010-03-23 19:33:07mark.dickinsonsetmessages: + msg101595
2010-03-23 14:57:46mark.dickinsonsetstatus: closed -> open
assignee: mark.dickinson
messages: + msg101583

priority: high
2009-11-08 09:28:49mark.dickinsonsetstatus: open -> closed
resolution: not a bug
messages: + msg95039
2009-11-07 20:42:14skrahsettype: behavior
2009-11-07 10:43:51skrahcreate