Issue 7279: decimal.py: == and != comparisons involving NaNs


This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/51528

classification

Title:	decimal.py: == and != comparisons involving NaNs
Type:	behavior	Stage:	resolved
Components:		Versions:	Python 3.2

process

Status:	closed	Resolution:	fixed
Dependencies:		Superseder:
Assigned To:	mark.dickinson	Nosy List:	mark.dickinson, seberg, skrah
Priority:	high	Keywords:	patch

Created on 2009-11-07 10:43 by skrah, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name	Uploaded	Description	Edit
issue7279.patch	mark.dickinson, 2010-03-23 20:44

Messages (12)
msg95017 - (view)	Author: Stefan Krah (skrah) *	Date: 2009-11-07 10:43
I'm not sure this is a bug, but I am trying to understand the rationale for mimicking IEEE 754 for == and != comparisons involving NaNs. The comment in decimal.py says:

"Note: The Decimal standard doesn't cover rich comparisons for Decimals. In particular, the specification is silent on the subject of what should happen for a comparison involving a NaN."

First, I think rich comparisons are covered with compare_total(), but indeed that isn't very useful for == and !=. (It might be useful for sorting a list of decimals.)

The standard compare() function returns NaN for comparisons involving NaNs. In addition to that it signals for sNaNs. I'm interpreting this as "the comparison is undefined". So, in terms of decimal return values, the standard does define NaN comparisons.

The question remains how to translate "undefined" to a Python truth value. I'd think that the natural thing is to raise an InvalidOperation exception in the same way it is done for <, <=, >, >=.

This ...

Decimal("NaN") == 9 ==> InvalidOperation
Decimal("sNaN") == 9 ==> InvalidOperation

... is the behavior of compare_signal(). In my opinion this would follow the principle of least surprise for the user.
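The distinction drawn above between compare(), compare_signal(), compare_total() and the Python operators can be checked with a short script. This is only a sketch, assuming a current Python 3 `decimal` module with InvalidOperation trapped (the default); the helper `outcome` and the `results` dictionary are names made up for this illustration.

```python
from decimal import Decimal, InvalidOperation, localcontext


def outcome(func):
    """Return func()'s result, or the string 'InvalidOperation' if it signals."""
    try:
        return func()
    except InvalidOperation:
        return "InvalidOperation"


nan, snan, nine = Decimal("NaN"), Decimal("sNaN"), Decimal(9)

with localcontext() as ctx:
    # Assumption: the signal is trapped, as in the default context.
    ctx.traps[InvalidOperation] = True
    results = {
        "NaN.compare(9)": outcome(lambda: nan.compare(nine)),
        "sNaN.compare(9)": outcome(lambda: snan.compare(nine)),
        "NaN.compare_signal(9)": outcome(lambda: nan.compare_signal(nine)),
        "NaN.compare_total(9)": outcome(lambda: nan.compare_total(nine)),
        "NaN == 9": outcome(lambda: nan == nine),
        "NaN < 9": outcome(lambda: nan < nine),
    }

for expression, result in results.items():
    print(f"{expression:24} -> {result!r}")
```

Under these assumptions, compare() returns a quiet NaN for a quiet NaN operand and signals only for an sNaN, compare_signal() signals for both, compare_total() returns an ordinary Decimal, `==` returns False for a quiet NaN, and `<` signals.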
msg95039 - (view)	Author: Mark Dickinson (mark.dickinson) *	Date: 2009-11-08 09:28
There's a second issue to consider here, which is that Python uses equality as specified by the == operator as the basic equivalence relation for set and dict membership tests. So as a general rule, an equality test between two objects of the same type shouldn't be raising exceptions. If == raised for comparisons with nans then it would make it awkward to put nans into a set.

Hmm. But now I notice that you can't put Decimal nans into sets anyway: you get a 'TypeError: Cannot hash a NaN value'. I'm not sure of the rationale for this.

One might also question whether Decimal("NaN") < 9 should really be raising InvalidOperation, or whether (as an operation that doesn't return a Decimal instance and is in some sense outside the scope of the standard, similar to int(Decimal('nan')) and hash(Decimal('nan'))) it should be raising some general Python exception instead.

I'm closing this as invalid: the behaviour isn't a bug, at least in the sense that the code is working as designed. I think there may well be a useful discussion here, but the bugtracker isn't the right place to have it: could we move it to python-dev instead?
msg101583 - (view)	Author: Mark Dickinson (mark.dickinson) *	Date: 2010-03-23 14:57
Re-opening to address a couple of points that came out of the python-dev discussion:

(1) As Stefan pointed out on python-dev, equality and inequality comparisons involving signaling nans should signal (order comparisons already do). IEEE 754 is fairly clear on this. From section 6.2:

"""Signaling NaNs shall be reserved operands that, under default exception handling, signal the invalid operation exception (see 7.2) for every general-computational and signaling-computational operation except for the conversions described in 5.12."""

(Comparisons fall under 'signaling-computational operations', in section 5.6 of the standard.)

I propose to fix this for 2.7 and 3.2.

(2) Currently hash(Decimal("nan")) raises a TypeError. I can see no good reason for this at all; it's possible to hash float nans and to put them in sets and dictionaries. I propose to remove this restriction for 2.7 and 3.2. I think hash(Decimal("snan")) should also succeed: computational operations on signaling nans should signal, but I don't think that putting a signaling nan into a dict, or checking for its presence in a list, counts as a computational operation for this purpose.
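The remark in point (2) that float nans can be hashed and stored in sets and dictionaries is easy to confirm. A minimal sketch, using only built-in floats; the variable names are illustrative.

```python
nan = float("nan")

h = hash(nan)                # succeeds: float NaNs are hashable
members = {nan, 1.0}         # a NaN can be a set element
by_key = {nan: "value"}      # ... and a dict key

# Looking up the *same object* works, because containers check identity
# before falling back to ==.
print(isinstance(h, int), len(members), by_key[nan])
```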
msg101595 - (view)	Author: Mark Dickinson (mark.dickinson) *	Date: 2010-03-23 19:33
> I think hash(Decimal("snan")) should also succeed

On second thoughts, this would be bad, since it would lead to unpredictable results for sets or dicts containing a signaling nan:

>>> from decimal import Decimal
[69536 refs]
>>> s = Decimal('snan'); h = hash(s)
[69551 refs]
>>> {s, h+1}  # can put most integers into a set with an sNaN
{Decimal('sNaN'), 373955814}
[69561 refs]
>>> {s, h}  # but not if that integer hashes equal to the sNaN...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/dickinsm/python/svn/py3k/Lib/decimal.py", line 864, in __eq__
    ans = self._check_nans(other, context)
  File "/Users/dickinsm/python/svn/py3k/Lib/decimal.py", line 746, in _check_nans
    self)
  File "/Users/dickinsm/python/svn/py3k/Lib/decimal.py", line 3842, in _raise_error
    raise error(explanation)
decimal.InvalidOperation: sNaN
[69698 refs]

So if __eq__ with an sNaN raises an exception, there's little choice but to prohibit putting sNaNs into sets and dicts, and the obvious way to do this is to make __hash__ raise too.
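The same effect can be reproduced without Decimal at all. The sketch below uses a hypothetical class `Touchy` (not part of any library) that is hashable but whose __eq__ always raises, standing in for a hashable sNaN; `try_build` is likewise a helper invented for the illustration. It relies on CPython calling __eq__ only when two distinct objects have equal hashes.

```python
class Touchy:
    """Hypothetical stand-in for a hashable sNaN: == always raises."""

    def __hash__(self):
        return 42

    def __eq__(self, other):
        raise ArithmeticError("comparison not allowed")


def try_build(*items):
    """Build a set from items, reporting a raising __eq__ instead of failing."""
    try:
        return set(items)
    except ArithmeticError as exc:
        return f"raised: {exc}"


t = Touchy()
no_collision = try_build(t, 43)  # hash(43) != 42, so __eq__ is never called
collision = try_build(t, 42)     # hash(42) == 42, so the set must call __eq__

print(len(no_collision))
print(collision)
```

Whether insertion succeeds therefore depends on an accidental hash collision, which is the unpredictability the message describes.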
msg101598 - (view)	Author: Mark Dickinson (mark.dickinson) *	Date: 2010-03-23 20:44
Here's a patch (against py3k) to make all comparisons involving signaling nans raise InvalidOperation. Stefan, does this look okay to you?
msg101858 - (view)	Author: Stefan Krah (skrah) *	Date: 2010-03-28 11:27
Mark, this looks great, thanks.
msg102152 - (view)	Author: Mark Dickinson (mark.dickinson) *	Date: 2010-04-02 10:18
Thanks, Stefan. Applied to trunk in r79588. Still needs to be forward ported to py3k.
msg102153 - (view)	Author: Mark Dickinson (mark.dickinson) *	Date: 2010-04-02 10:36
Allowed hashing of Decimal('nan') in r79589; Decimal('snan') continues to raise TypeError. I've also rewritten Decimal.__hash__ a little bit, so that it won't care if float('inf') raises an exception. This will all be much neater if the unified numeric hash is applied.
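The behaviour after these two changes can be summarised with a short check. This is a sketch assuming a current Python 3 `decimal` module with InvalidOperation trapped; the variable names are illustrative only.

```python
from decimal import Decimal, InvalidOperation, localcontext

qnan, snan = Decimal("NaN"), Decimal("sNaN")

# Quiet NaNs are hashable and can go into sets.
quiet_hash_ok = isinstance(hash(qnan), int)
quiet_set = {qnan}

# Signaling NaNs refuse to be hashed.
try:
    hash(snan)
    snan_hash = "hashable"
except TypeError:
    snan_hash = "TypeError"

with localcontext() as ctx:
    # Assumption: the signal is trapped, as in the default context.
    ctx.traps[InvalidOperation] = True
    qnan_eq = qnan == Decimal(9)  # quiet NaN: plain False, no signal
    try:
        snan == Decimal(9)
        snan_eq = "no error"
    except InvalidOperation:
        snan_eq = "InvalidOperation"

print(quiet_hash_ok, len(quiet_set), snan_hash, qnan_eq, snan_eq)
```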
msg102240 - (view)	Author: Mark Dickinson (mark.dickinson) *	Date: 2010-04-03 11:43
r79588 and r79589 were merged to py3k in r79668.
msg182040 - (view)	Author: Sebastian Berg (seberg) *	Date: 2013-02-13 15:29
This is closed, and maybe I am missing something. But from a general point of view, why does hashing of NaN not raise an error as it did for decimals, i.e. why was this not resolved exactly the other way around? I am mostly just wondering about this; it is not a problem for me.

Hashing NaNs seems dangerous and surprising because it might work in dicts/sets, but normally doesn't. (The only thing for it to work right would be if NaN was a singleton, but that is impossible for subclasses, etc.). The thing is:

In [16]: s = {float('nan'): 1, float('nan'): 2, float('nan'): 3}
In [17]: s
Out[17]: {nan: 1, nan: 2, nan: 3}
In [18]: s[float('nan')]
KeyError: nan
In [19]: n = float('nan')
In [20]: s = {n: 1, n: 2, n: 3}
In [21]: s
Out[21]: {nan: 3}

This is because `n is n`, and PyObject_RichCompareBool assumes that if `a is b` then `a == b`, which is simply wrong for NaNs and also makes comparisons of iterables including NaNs an impossible business. NaNs have their unavoidable weirdness, but at least for dictionaries/sets it would seem more clear to me if they raised an error.
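The identity shortcut described above also shows up in list membership and list equality, not only in dicts. A small sketch using built-in floats only; the variable names are illustrative.

```python
n = float("nan")

self_equal = n == n                    # False: a NaN never equals itself
same_object_in = n in [n]              # True: containers test identity first
other_object_in = float("nan") in [n]  # False: different object, and == is False
lists_equal = [n] == [n]               # True: element comparison uses the same shortcut

distinct_keys = {float("nan"): 1, float("nan"): 2}  # two separate NaN objects
same_key = {n: 1, n: 2}                              # one object, the last value wins

print(self_equal, same_object_in, other_object_in, lists_equal)
print(len(distinct_keys), len(same_key), same_key[n])
```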
msg182044 - (view)	Author: Mark Dickinson (mark.dickinson) *	Date: 2013-02-13 15:53
Sebastian: I think this discussion is a bit off-topic for this particular bug; you might want to raise it on python-dev or python-ideas instead. Bear in mind, though, that the behaviour of NaNs and containers has been discussed to death many times in the past; I'd suggest not bringing the issue up again unless there's something genuinely new to bring to the discussion. The current behaviour is certainly a compromise, but it seems to be the best compromise available. Note that with respect to this particular issue: it's only signalling nans that raise on hashing for the Decimal type. Quiet nans are hashable as usual. Since for float, all nans can be regarded as quiet, Decimal and float behave the same way on this.
msg182045 - (view)	Author: Sebastian Berg (seberg) *	Date: 2013-02-13 16:29
Thanks, yes, you are right, should have googled a bit more anyway. Though I did not find much on the hashable vs unhashable itself, so if I ever stumble across it again, I will write a mail...

History
Date	User	Action	Args
2022-04-11 14:56:54	admin	set	github: 51528
2013-02-13 16:29:58	seberg	set	messages: + msg182045
2013-02-13 15:53:50	mark.dickinson	set	messages: + msg182044
2013-02-13 15:29:43	seberg	set	nosy: + seberg messages: + msg182040
2010-04-03 11:43:23	mark.dickinson	set	status: open -> closed resolution: not a bug -> fixed messages: + msg102240 stage: resolved
2010-04-02 10:36:58	mark.dickinson	set	messages: + msg102153
2010-04-02 10:18:12	mark.dickinson	set	messages: + msg102152 versions: + Python 3.2
2010-03-28 11:27:37	skrah	set	messages: + msg101858
2010-03-23 20:44:51	mark.dickinson	set	files: + issue7279.patch keywords: + patch messages: + msg101598
2010-03-23 19:33:07	mark.dickinson	set	messages: + msg101595
2010-03-23 14:57:46	mark.dickinson	set	status: closed -> open assignee: mark.dickinson messages: + msg101583 priority: high
2009-11-08 09:28:49	mark.dickinson	set	status: open -> closed resolution: not a bug messages: + msg95039
2009-11-07 20:42:14	skrah	set	type: behavior
2009-11-07 10:43:51	skrah	create