Issue 11949: Make float('nan') unorderable

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/56158

classification

Title:	Make float('nan') unorderable
Type:	enhancement	Stage:
Components:	Interpreter Core	Versions:	Python 3.3

process

Status:	closed	Resolution:	rejected
Dependencies:		Superseder:
Assigned To:	rhettinger	Nosy List:	alex, belopolsky, daniel.urban, mark.dickinson, rhettinger
Priority:	normal	Keywords:	patch

Created on 2011-04-28 19:08 by belopolsky, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name	Uploaded	Description
unorderable-nans.diff	belopolsky, 2011-04-28 23:44

Messages (24)
msg134713 - (view)	Author: Alexander Belopolsky (belopolsky) *	Date: 2011-04-28 19:08
Rationale:

"""
IEEE 754 assigns values to all relational expressions involving NaN. In the syntax of C, the predicate x != y is True but all others, x < y, x <= y, x == y, x >= y and x > y, are False whenever x or y or both are NaN, and then all but x != y and x == y are INVALID operations too and must so signal.
"""
-- Lecture Notes on the Status of IEEE Standard 754 for Binary Floating-Point Arithmetic by Prof. W. Kahan
http://www.cs.berkeley.edu/~wkahan/ieee754status/ieee754.ps

The problem with faithfully implementing IEEE 754 in Python is that exceptions in the IEEE standard don't have the same meaning as in Python. IEEE 754 requires that a value is computed even when the operation signals an exception. The program can then decide whether to terminate the computation or propagate the value. In Python, we have to choose between raising an exception and returning the value. We cannot have both. It appears that in most cases the IEEE 754 "INVALID" exception is treated as a terminating exception by Python, and operations that signal INVALID in IEEE 754 raise an exception in Python. Therefore making <, >, etc. raise on NaN while keeping the status quo for != and == would bring Python floats closer to compliance with IEEE 754.

See http://mail.python.org/pipermail/python-ideas/2011-April/010057.html for discussion.

An instructive part of the patch is

--- a/Lib/test/test_math.py
+++ b/Lib/test/test_math.py
@@ -174,10 +174,22 @@
 flags
 )

+def is_negative_zero(x):
+    return x == 0 and math.copysign(1, x) < 0
+
+def almost_equal(value, expected):
+    if math.isfinite(expected) and math.isfinite(value):
+        return abs(value-expected) <= eps
+    if math.isnan(expected):
+        return math.isnan(value)
+    if is_negative_zero(expected):
+        return is_negative_zero(value)
+    return value == expected
+
 class MathTests(unittest.TestCase):

     def ftest(self, name, value, expected):
-        if abs(value-expected) > eps:
+        if not almost_equal(value, expected):

Although it may look like the proposed change makes it harder to compare floats for approximate equality, the change actually helped to highlight a programming mistake: the old ftest() accepts 0.0 where -0.0 is expected.

This is a typical situation when someone attempts to write clever code relying on unusual properties of NaNs. In most cases clever code does not account for all possibilities, and it is always hard to reason about such code.
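For reference, the status quo described in the rationale can be checked directly. The following is a minimal sketch (variable names are illustrative only) showing that, in current CPython, every ordering and equality predicate involving NaN quietly evaluates to False and only != evaluates to True:

```python
nan = float("nan")

# Ordering and equality predicates are all quietly False when an operand is NaN.
quiet_results = [nan < 1.0, nan <= 1.0, nan == nan, nan >= 1.0, nan > 1.0]

# Only != is True, even when NaN is compared with itself.
not_equal = nan != nan

print(quiet_results, not_equal)
```

No exception or warning is raised by any of these comparisons, which is the behaviour the proposal would change for the four ordering operators.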
msg134732 - (view)	Author: Alexander Belopolsky (belopolsky) *	Date: 2011-04-28 23:44
Actually, my first attempt to fix the test was faulty. The correct logic seems to be

+def is_negative_zero(x):
+    return x == 0 and math.copysign(1, x) < 0
+
+def almost_equal(value, expected):
+    if math.isfinite(expected) and math.isfinite(value):
+        if is_negative_zero(expected):
+            return is_negative_zero(value)
+        if is_negative_zero(value):
+            return is_negative_zero(expected)
+        return abs(value-expected) <= eps
+    if math.isnan(expected):
+        return math.isnan(value)
+    return value == expected
+
 class MathTests(unittest.TestCase):
+
+    def test_xxx(self):
+        self.assertTrue(is_negative_zero(-0.0))
+        self.assertFalse(almost_equal(0.0, -0.0))

     def ftest(self, name, value, expected):
-        if abs(value-expected) > eps:
+        if not almost_equal(value, expected):

Now, the attached patch has two failures:

AssertionError: fmod(-10,1) returned -0.0, expected 0

and

AssertionError: sqrt0002:sqrt(-0.0) returned -0.0, expected 0.0

The first seems to be a typo in the test, but I would not expect sqrt(-0.0) to return -0.0. Does anyone know what the relevant standard says?
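The corrected helper can be tried outside the test suite. This is a standalone sketch of the same logic; EPS is an assumed tolerance standing in for the module-level eps that the test file defines, and the rest follows the patch text above:

```python
import math

EPS = 1e-05  # assumed tolerance; stands in for the eps defined in the test module


def is_negative_zero(x):
    # -0.0 == 0 is True, so the sign must be recovered with copysign.
    return x == 0 and math.copysign(1, x) < 0


def almost_equal(value, expected):
    if math.isfinite(expected) and math.isfinite(value):
        # Signed zeros must match exactly before the tolerance check applies.
        if is_negative_zero(expected):
            return is_negative_zero(value)
        if is_negative_zero(value):
            return is_negative_zero(expected)
        return abs(value - expected) <= EPS
    if math.isnan(expected):
        # NaN never compares equal, so test for it explicitly.
        return math.isnan(value)
    # Remaining cases involve infinities (or an unexpected NaN value).
    return value == expected


print(almost_equal(0.0, -0.0), almost_equal(float("nan"), float("nan")))
```

Note that no branch ever evaluates an ordering comparison with a NaN operand, so this helper would keep working if such comparisons raised.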
msg134734 - (view)	Author: Alex Gaynor (alex) *	Date: 2011-04-29 01:19
The C standard (and/or the POSIX one, I forget) says sqrt(-0.0) returns -0.0.
msg134841 - (view)	Author: Mark Dickinson (mark.dickinson) *	Date: 2011-04-30 07:21
> sqrt(-0.0) to return -0.0. Does anyone know what the relevant standard says?

sqrt(-0.0) should indeed be -0.0, according to IEEE 754.
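A quick way to observe this from Python, assuming a platform whose C library follows IEEE 754 for sqrt. Since -0.0 == 0.0 is True, the sign has to be inspected with math.copysign:

```python
import math

root = math.sqrt(-0.0)

# Equality cannot distinguish the two zeros, so recover the sign explicitly.
sign = math.copysign(1.0, root)

print(root, sign)
```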
msg134842 - (view)	Author: Mark Dickinson (mark.dickinson) *	Date: 2011-04-30 07:23
Alexander, There are lots of almost-equality tests in the test-suite already, between test_math, test_float, test_cmath and test_complex. Do you need to implement another one here, or can you reuse one of the existing ones?
msg135040 - (view)	Author: Alexander Belopolsky (belopolsky) *	Date: 2011-05-03 14:52
> There are lots of almost-equality tests in the test-suite already,
> between test_math, test_float, test_cmath and test_complex.
> Do you need to implement another one here, or can you reuse one
> of the existing ones?

I can probably use acc_check() instead of abs(value-expected) <= eps, but I am not sure that will be an improvement. Most of the new logic deals with NaNs and negative zero, and the almost-equality tests that I've seen don't implement these cases correctly for my use.
msg135045 - (view)	Author: Mark Dickinson (mark.dickinson) *	Date: 2011-05-03 16:05
I was thinking of something like the rAssertAlmostEqual method in test_cmath.
msg135059 - (view)	Author: Raymond Hettinger (rhettinger) *	Date: 2011-05-03 18:58
Alexander, I urge you to take a good deal of care with this tracker item and not make any changes lightly. Take a look at how other languages have dealt with the issue.

Also, consider that "unorderable" may not be the right answer at all. The most common use of NaNs is as a placeholder for missing data. Perhaps putting them at the end of a sort is the right thing to do (cf. what databases do with NULL values).

The other major use for NaNs is as a way to let an invalid intermediate result flow through the remainder of a calculation (much as @NA does in MS Excel). The spirit of that use case would suggest that raising an exception during a sort is the wrong thing to do.

Another consideration is that it would be unusual (and likely unexpected) to have a type be orderable or not depending on a particular value. Users ask themselves whether floats are orderable, not whether some values of floats are orderable.

I strongly oppose this patch in its current form and think it is likely to break existing code that expects NaNs to be quiet.
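The "NaNs at the end of a sort" idea can already be expressed in user code with a sort key. The sketch below is one possible way to do it, not anything the tracker proposes; nan_last_key is a hypothetical helper name:

```python
import math


def nan_last_key(x):
    # Tuples compare element by element: False (not NaN) sorts before True (NaN).
    # For NaN the second slot is replaced by 0.0, so NaN itself is never compared.
    is_nan = math.isnan(x)
    return (is_nan, 0.0 if is_nan else x)


data = [float("nan"), 0.0, 1.0, -1.0, float("nan"), 2.5]
ordered = sorted(data, key=nan_last_key)
print(ordered)
```

Because the key never exposes a NaN to a comparison, the result does not depend on where the NaNs sit in the input.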
msg135060 - (view)	Author: Raymond Hettinger (rhettinger) *	Date: 2011-05-03 19:00
Also, if you're going to make a change, please consult the scipy/numpy community. They are the most knowledgeable on the subject and the most affected by any change. Given that they have not made any feature requests or bug reports about the current behavior, there is an indication that change isn't necessary or desirable.
msg135132 - (view)	Author: Alexander Belopolsky (belopolsky) *	Date: 2011-05-04 14:44
On Tue, May 3, 2011 at 12:05 PM, Mark Dickinson <report@bugs.python.org> wrote:
..
> I was thinking of something like the rAssertAlmostEqual method in test_cmath.

This one is good. I wonder if it would be appropriate to move rAssertAlmostEqual() up to unittest.case, possibly replacing assertAlmostEqual()? If replacing assertAlmostEqual() is not an option, I would call it assertFloatAlmostEqual().
msg135983 - (view)	Author: Mark Dickinson (mark.dickinson) *	Date: 2011-05-14 18:50
It seems we're getting a bit off-topic for the issue title; the discussion about cleaning up test_math (which I agree would be a good thing to do) should probably go into another issue.

On the issue itself, I'm -1 on making comparisons with float('nan') raise: I don't see that there's a real problem here that needs solving.

Note that the current behaviour does not violate IEEE 754, since there's nothing anywhere in IEEE 754 that says that Python's < operation should raise for comparisons involving NaNs: all that's said is that a conforming language should provide a number of comparison operations (without specifying how those operations should be spelt in the language in question), including both a < operation that's quiet (returning a false value for comparison with NaNs) and a < operation that signals on comparison with NaN. There's nothing to indicate definitively which of these two operations '<' should bind to in a language.

It is true that C chooses to bind '<' to the signalling version, but that doesn't automatically mean that we should do the same in Python.
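To make the quiet/signaling distinction concrete, here is a rough sketch of the two flavours of "less than" as plain Python functions. The function names echo the IEEE 754-2008 operation names but are hypothetical helpers, and modelling "signals INVALID" as raising ValueError is an assumption made for illustration only:

```python
import math


def compare_quiet_less(x, y):
    """Quiet predicate: False whenever an operand is NaN (what float '<' does today)."""
    return x < y


def compare_signaling_less(x, y):
    """Signaling predicate, modelled here by raising ValueError on NaN (an assumption)."""
    if math.isnan(x) or math.isnan(y):
        raise ValueError("comparison involving NaN")
    return x < y


nan = float("nan")
print(compare_quiet_less(nan, 1.0))  # the quiet form returns a value
try:
    compare_signaling_less(nan, 1.0)
except ValueError as exc:
    print("signaled:", exc)
```

Both functions agree on every pair of non-NaN operands; the question debated in this issue is only which of the two the '<' operator should be bound to.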
msg135985 - (view)	Author: Mark Dickinson (mark.dickinson) *	Date: 2011-05-14 19:00
> Therefore making <, >, etc. raise on NaN while keeping the
> status quo for != and == would bring Python floats closer to
> compliance with IEEE 754.

Not so. Either way, Python would be providing exactly 10 of the 22 required IEEE 754 comparison operations (see sections 5.6.1 and 5.11 of IEEE 754-2008 for details). If we wanted to move closer to compliance with IEEE 754, we should be providing all 22 comparisons.
msg135986 - (view)	Author: Raymond Hettinger (rhettinger) *	Date: 2011-05-14 19:08
> On the issue itself, I'm -1 on making comparisons
> with float('nan') raise: I don't see that there's
> a real problem here that needs solving.
>
> Note that the current behaviour does not violate IEEE 754, ...

I agree with Mark. Am closing this feature request which is both ill-conceived and likely to cause more harm than good (possibly breaking code that currently does not fail).

> the discussion about cleaning up test_math
> (which I agree would be a good thing to do)
> should probably go into another issue.

I agree.
msg136111 - (view)	Author: Alexander Belopolsky (belopolsky) *	Date: 2011-05-16 17:26
On Sat, May 14, 2011 at 2:50 PM, Mark Dickinson <report@bugs.python.org> wrote:
..
> On the issue itself, I'm -1 on making comparisons with float('nan') raise: I don't see that there's a real problem here that needs solving.
>

I probably should have changed the title of this issue after making an alternative proposal to make INVALID operations produce a warning:

http://mail.python.org/pipermail/python-ideas/2011-April/010101.html

For the case of nan ordering, this idea seemed to receive support on the mailing list:

http://mail.python.org/pipermail/python-ideas/2011-April/010102.html
http://mail.python.org/pipermail/python-ideas/2011-April/010103.html
http://mail.python.org/pipermail/python-ideas/2011-April/010104.html

> Note that the current behaviour does not violate IEEE 754, since there's nothing anywhere
> in IEEE 754 that says that Python's < operation should raise for comparisons involving NaNs:
> all that's said is that a conforming language should provide a number of comparison operations
> (without specifying how those operation should be spelt in the language in question), including
> both a < operation that's quiet (returning a false value for comparison with NaNs) and a <
> operation that signals on comparison with NaN. There's nothing to indicate definitively which of
> these two operations '<' should bind to in a language.
>

Yes, IEEE 754 provides little guidance to language designers, but why would anyone want to treat ordering of floats differently from ordering of decimals?

Traceback (most recent call last):
..
decimal.InvalidOperation: comparison involving NaN

> It is true that C chooses to bind '<' to the signalling version, but that doesn't automatically mean that we should do the same in Python.
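The float/decimal difference mentioned in this message can be reproduced with a short script. This is a sketch that assumes the default decimal context, in which the InvalidOperation trap is enabled; variable names are illustrative:

```python
from decimal import Decimal, InvalidOperation

d_nan = Decimal("NaN")

# Ordering comparisons on a decimal NaN signal InvalidOperation,
# which the default context turns into a raised exception.
try:
    d_nan < Decimal(0)
    decimal_raised = False
except InvalidOperation:
    decimal_raised = True

# Equality on a quiet decimal NaN stays quiet, as it does for floats.
decimal_equal = d_nan == d_nan

# The float counterpart of the ordering comparison quietly returns False.
float_result = float("nan") < 0.0

print(decimal_raised, decimal_equal, float_result)
```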
msg136115 - (view)	Author: Alexander Belopolsky (belopolsky) *	Date: 2011-05-16 18:20
On Sat, May 14, 2011 at 3:08 PM, Raymond Hettinger <report@bugs.python.org> wrote:
..
>> Note that the current behaviour does not violate IEEE 754, ...
>
> I agree with Mark.

Do we really need a popular vote to determine what a published standard does or does not require? Section 7.2 of IEEE Std 754-2008 states:

"""
7.2 Invalid operation

The invalid operation exception is signaled if and only if there is no usefully definable result. In these cases the operands are invalid for the operation to be performed.
…
For operations producing no result in floating-point format, the operations that signal the invalid operation exception are:
...
j) comparison by way of unordered-signaling predicates listed in Table 5.2, when the operands are unordered
"""

Python comparison operators.

We can argue, of course, about the proper mapping of the IEEE 754 'INVALID' exception to the available Python constructs. Arguably, a compliant language can ignore INVALID exceptions, issue a warning while returning a result, or raise an exception and produce no result.

In a post on python-ideas Mark argued that the ideal disposition of INVALID is a ValueError:

"""
IMO, the ideal (ignoring backwards compatibility) would be to have OverflowError / ZeroDivisionError / ValueError produced wherever IEEE754 says that overflow / divide-by-zero / invalid-operation should be signaled.
"""
http://mail.python.org/pipermail/python-ideas/2011-April/010106.html

If IEEE 754 compliance is a stated goal in Python design, it would make very little sense to treat some cases of INVALID differently from others. If, however, IEEE 754 compliance is not a goal, we should consider what is the most useful behavior.

On the mailing list, I posted a challenge: review your code that will work differently if nan ordering was disallowed and report whether that code does the right thing for all kinds of float (including nan, inf and signed 0). So far, I have not seen any responses to this.

My own experiment with the Python library itself has revealed a bug in the test suite. This matches my prior experience: naive numeric code usually produces nonsense results when nans are compared, and careful numeric code makes an effort to avoid comparing nans.

> Am closing this feature request which is both ill-conceived and likely to cause more harm than good (possibly breaking code that currently does not fail).
>

My primary goal in posting this patch was to support the discussion on python-ideas. The patch was not intended to be applied as is. At the minimum, I would need to make nan < nan issue a deprecation warning before turning it into an error. If this is not an appropriate use of the tracker, please propose an alternative. Posting a patch on the mailing list or outside of python.org seems to be a worse alternative.

>> the discussion about cleaning up test_math
>> (which I agree would be a good thing to do)
>> should probably go into another issue.
>
> I agree.

Why? The issue in test_math is small enough that it can be fixed without any discussion on the tracker. If someone would want to improve unittest based on this experience, this can indeed be handled in a separate issue. As long as the changes are limited to Lib/test, I don't see what a separate issue will bring other than extra work.
msg136117 - (view)	Author: Alexander Belopolsky (belopolsky) *	Date: 2011-05-16 18:28
A tracker bug has mangled the following paragraph following the IEEE 754 standard quote in my previous post: """ Table 5.2 referenced above lists 10 operations, four of which (>, <, >=, and <=) are given spellings that are identical to the spellings of Python comparison operators. """
msg136466 - (view)	Author: Mark Dickinson (mark.dickinson) *	Date: 2011-05-21 19:18
> Table 5.2 referenced above lists 10 operations, four of which (>, <,
> >=, and <=) are given spellings that are identical to the spellings of
> Python comparison operators.

Yep, those are included amongst the "various ad-hoc and traditional names and symbols". So what? It's still the case that IEEE 754 gives no requirement (or even recommendation) for how either of 'compareQuietLess' or 'compareSignalingLess' should be spelt in any particular language.

IOW, it's fine to argue that you personally would like Python's '<' to be bound to IEEE 754's 'compareSignalingLess' instead of the current effective binding to 'compareQuietLess', but it would be a bit disingenuous to claim that IEEE 754 recommends or requires that. It doesn't.
msg136469 - (view)	Author: Alexander Belopolsky (belopolsky) *	Date: 2011-05-21 19:36
On Sat, May 21, 2011 at 3:18 PM, Mark Dickinson <report@bugs.python.org> wrote:
>
> Mark Dickinson <dickinsm@gmail.com> added the comment:
>
>> Table 5.2 referenced above lists 10 operations, four of which (>, <,
>> >=, and <=) are given spellings that are identical to the spellings of
>> Python comparison operators.
>
> Yep, those are included amongst the "various ad-hoc and traditional names and symbols". So what?
> It's still the case that IEEE 754 gives no requirement (or even recommendation) for how either of
> 'compareQuietLess' or 'compareSignalingLess' should be spelt in any particular language.

IEEE 754 is not a standard that is directly applicable to the design of programming languages. For example, it is completely silent on the issue of which operations should be implemented as infix operators and which as functions. Still, to the extent it is appropriate for IEEE 754 to say so, I think it says that '<' is 'compareSignalingLess'.

IEEE 754 can only be a guide for language design and not a specification. However, the decimal module, which was explicitly designed for IEEE 754 compliance, makes order comparison operators signaling. What is the reason to make them quiet for floats other than backward compatibility?

Note that backward compatibility is likely not to be an issue if we make nan comparisons generate a warning (possibly even off by default) rather than an error.
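The warning variant can be approximated in user code. The following is only a sketch of the idea: less_with_warning is a hypothetical helper, and the choice of RuntimeWarning as the category is an assumption, since the thread does not name one:

```python
import math
import warnings


def less_with_warning(x, y):
    # Still returns the quiet result, but flags the NaN operand on the way.
    if math.isnan(x) or math.isnan(y):
        warnings.warn("ordering comparison involving NaN",
                      RuntimeWarning, stacklevel=2)
    return x < y


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # make sure the warning is not filtered out
    result = less_with_warning(float("nan"), 1.0)

print(result, [str(w.message) for w in caught])
```

With the standard warnings filters, callers could then escalate the warning to an error (for example with the "error" action) or silence it, without the return value of the comparison changing.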
msg136471 - (view)	Author: Mark Dickinson (mark.dickinson) *	Date: 2011-05-21 19:48
> What is the reason to make them quiet for floats other
> than backward compatibility?

For me, none. I'll happily agree that, all other things being equal, it's more natural (and more consistent with other languages) to have < correspond to the signaling operation, and in a new language that's probably what I'd go for.

But as a change to existing behaviour in a language that's been widely adopted for numerical work, the risk of breakage seems to me to outweigh any benefits.
msg136472 - (view)	Author: Mark Dickinson (mark.dickinson) *	Date: 2011-05-21 19:50
On the idea of a warning, I don't really see the point; I find it hard to imagine it's really going to catch many real errors.
msg136474 - (view)	Author: Alexander Belopolsky (belopolsky) *	Date: 2011-05-21 19:58
On Sat, May 21, 2011 at 3:50 PM, Mark Dickinson <report@bugs.python.org> wrote:
..
> On the idea of a warning, I don't really see the point; I find it hard to imagine it's really going to catch many real errors.

My experience is different. In my work, NaNs often creep into calculations that are not designed to deal with them. (More often from data files than from invalid operations.) Sorting a large list with a handful of NaNs often leads to rather mysterious errors, if not to silently wrong results. I believe there was even an issue on the tracker about this particular case.
msg136475 - (view)	Author: Mark Dickinson (mark.dickinson) *	Date: 2011-05-21 20:02
Hmm, okay. Call me +0 on the warning.
msg182314 - (view)	Author: Marc Schlaich (schlamar) *	Date: 2013-02-18 11:28
I'm +1 for a warning. The current behavior is really unexpected:

In [6]: sorted([nan, 0, 1, -1])
Out[6]: [nan, -1, 0, 1]

In [7]: sorted([0, 1, -1, nan])
Out[7]: [-1, 0, 1, nan]

In [8]: sorted([0, nan, 1, -1])
Out[8]: [0, nan, -1, 1]
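One way to sidestep the input-order dependence shown in the session above is to drop NaNs before sorting. A small sketch, with sort_without_nans being a hypothetical helper, that checks the result is the same for every arrangement of the input:

```python
import math
from itertools import permutations

nan = float("nan")
values = [nan, 0, 1, -1]


def sort_without_nans(seq):
    # math.isnan accepts ints as well as floats, so mixed lists are fine.
    return sorted(x for x in seq if not math.isnan(x))


# Collect the distinct results over all 24 orderings of the input.
results = {tuple(sort_without_nans(p)) for p in permutations(values)}
print(results)
```

Whether NaNs should be dropped, moved to one end, or reported is an application decision; the point is only that the comparison-based sort never sees a NaN.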
msg182362 - (view)	Author: Raymond Hettinger (rhettinger) *	Date: 2013-02-19 05:54
-1 for a warning. A user should really have no expectations about a NaN's sort order. For the most part, Python does not get into the warnings business for every possible weird thing you could tell it to do (especially something as harmless as this).

Also, warnings are a bit of a PITA to shut off. For something like NaN ordering, a warning is likely to inflict more harm on the users than the NaN ordering issue itself.

History
Date	User	Action	Args
2022-04-11 14:57:16	admin	set	github: 56158
2014-03-17 11:06:35	schlamar	set	nosy: - schlamar
2013-02-19 05:54:37	rhettinger	set	messages: + msg182362
2013-02-18 11:28:46	schlamar	set	nosy: + schlamar messages: + msg182314
2011-05-22 02:30:48	rhettinger	set	assignee: rhettinger
2011-05-21 20:02:07	mark.dickinson	set	messages: + msg136475
2011-05-21 19:58:02	belopolsky	set	messages: + msg136474
2011-05-21 19:50:47	mark.dickinson	set	messages: + msg136472
2011-05-21 19:48:17	mark.dickinson	set	messages: + msg136471
2011-05-21 19:36:07	belopolsky	set	messages: + msg136469
2011-05-21 19:18:07	mark.dickinson	set	messages: + msg136466
2011-05-16 18:28:57	belopolsky	set	messages: + msg136117
2011-05-16 18:20:49	belopolsky	set	messages: + msg136115
2011-05-16 17:26:38	belopolsky	set	messages: + msg136111
2011-05-14 19:08:20	rhettinger	set	status: open -> closed resolution: rejected messages: + msg135986
2011-05-14 19:00:18	mark.dickinson	set	messages: + msg135985
2011-05-14 18:50:32	mark.dickinson	set	messages: + msg135983
2011-05-04 14:44:27	belopolsky	set	messages: + msg135132
2011-05-03 19:00:03	rhettinger	set	messages: + msg135060
2011-05-03 18:58:03	rhettinger	set	nosy: + rhettinger messages: + msg135059
2011-05-03 16:05:38	mark.dickinson	set	messages: + msg135045
2011-05-03 14:52:16	belopolsky	set	messages: + msg135040
2011-04-30 07:23:42	mark.dickinson	set	messages: + msg134842
2011-04-30 07:21:09	mark.dickinson	set	messages: + msg134841
2011-04-30 07:15:02	mark.dickinson	set	nosy: + mark.dickinson
2011-04-29 17:13:38	daniel.urban	set	nosy: + daniel.urban
2011-04-29 01:19:00	alex	set	nosy: + alex messages: + msg134734
2011-04-29 00:11:51	belopolsky	set	files: - unorderable-nans.diff
2011-04-28 23:44:12	belopolsky	set	files: + unorderable-nans.diff messages: + msg134732
2011-04-28 19:15:22	belopolsky	set	files: - unorderable-nans.diff
2011-04-28 19:14:58	belopolsky	set	files: + unorderable-nans.diff
2011-04-28 19:08:57	belopolsky	create