Issue 11986: Min/max not symmetric in presence of NaN

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/56195

classification

Title:	Min/max not symmetric in presence of NaN
Type:	behavior	Stage:
Components:		Versions:	Python 3.3

process

Status:	closed	Resolution:	not a bug
Dependencies:		Superseder:
Assigned To:	rhettinger	Nosy List:	Marco Sulla, alex, belopolsky, daniel.urban, mark.dickinson, pitrou, rhettinger
Priority:	normal	Keywords:	gsoc

Created on 2011-05-03 18:31 by belopolsky, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Messages (13)
msg135055 - (view)	Author: Alexander Belopolsky (belopolsky) *	Date: 2011-05-03 18:31
>>> nan = float('nan')
>>> min(nan, 5)
nan
>>> min(5, nan)
5

Good arguments can be made in favor of either result, but different value for min(x, y) depending on the order of arguments can hardly be justified. "In the face of ambiguity, refuse the temptation to guess" suggests that min/max with NaN should be an error.

See also issue 11949.
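The order dependence reported above can be reproduced with a small model of how a two-argument minimum is commonly computed. This is a sketch only: `naive_min` is a hypothetical helper written for illustration, not the builtin's actual source, and it assumes the builtin keeps its current candidate unless a later item compares strictly less.

```python
import math

nan = float("nan")


def naive_min(a, b):
    # Hypothetical model: keep the first argument unless the second
    # compares strictly less than it.
    return b if b < a else a


# Every ordering comparison that involves NaN evaluates to False ...
nan_is_less = nan < 5
five_is_less = 5 < nan

# ... so the first argument always survives, whichever one it is.
model_results = (naive_min(nan, 5), naive_min(5, nan))
builtin_results = (min(nan, 5), min(5, nan))
```

Under that assumption, no comparison ever succeeds, so the result is simply whichever argument came first.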
msg135056 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2011-05-03 18:34
Not specific to NaNs:

>>> min({1}, {2})
{1}
>>> min({2}, {1})
{2}
msg135057 - (view)	Author: Raymond Hettinger (rhettinger) *	Date: 2011-05-03 18:41
Undefined ordering means just that.
msg135058 - (view)	Author: Alexander Belopolsky (belopolsky) *	Date: 2011-05-03 18:46
On Tue, May 3, 2011 at 2:41 PM, Raymond Hettinger <report@bugs.python.org> wrote:
..
> Undefined ordering means just that.

Means what? Compare float behavior to Decimal('1')

>>> Decimal(1).max(Decimal('nan'))
Decimal('1')
>>> max(Decimal('1'), Decimal('nan'))
Traceback (most recent call last):
 ..
decimal.InvalidOperation: comparison involving NaN

Raymond, you don't really need to stop the debate 4 minutes after the bug has been reported.
msg135062 - (view)	Author: Raymond Hettinger (rhettinger) *	Date: 2011-05-03 19:21
The report is invalid because min/max make no guarantees about values without a total ordering. Your other tracker item correctly focused on the behavior of float('NaN') itself, rather than on the behavior of everything else in the Python world that compares two values.
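The point about total ordering can be illustrated with the set example from earlier in the thread. The snippet below is a sketch under the assumption that `min` and `max` keep their first candidate when no comparison succeeds. For sets, `<` means "is a proper subset of", which is only a partial order.

```python
a, b = {1}, {2}

# Neither set is a proper subset of the other, and they are not equal,
# so the pair is simply unordered.
unordered = not (a < b) and not (b < a) and a != b

# With no comparison succeeding, the first argument is returned.
min_results = (min(a, b), min(b, a))
max_results = (max(a, b), max(b, a))
```

A NaN behaves the same way with respect to floats: it is unordered against every other value, so it falls outside the cases where the result is well defined.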
msg136121 - (view)	Author: Alexander Belopolsky (belopolsky) *	Date: 2011-05-16 18:59
rhettinger> Your other tracker item correctly focused on the behavior of
rhettinger> float('NaN') itself,

You closed issue11949 as well, so it won't help. I disagree that this issue would be resolved by resolving issue11949. Defining max(nan, x) and nan < x are two different issues.

Quoting Kahan,

"""
Some familiar functions have yet to be defined for NaN. For instance max{x, y} should deliver the same result as max{y, x} but almost no implementations do that when x is NaN. There are good reasons to define max{NaN, 5} := max{5, NaN} := 5 though many would disagree.
"""
-- Lecture Notes on the Status of IEEE Standard 754 for Binary Floating-Point Arithmetic by Prof. W. Kahan
http://www.cs.berkeley.edu/~wkahan/ieee754status/ieee754.ps

In the same lecture, Prof. Kahan states that nan < x must signal.

My ideal solution would be to make nan < x signal, keep naive implementation of builtin max() and provide symmetric float.max such that nan.max(x) = x.max(nan) = x (nan result would be a valid but less useful alternative.) This will make float behavior closer to that of Decimal.
msg136467 - (view)	Author: Mark Dickinson (mark.dickinson) *	Date: 2011-05-21 19:22
> Prof. Kahan states that nan < x must signal.

Would that be the sentence that starts "In the syntax of C ..."?
msg136468 - (view)	Author: Mark Dickinson (mark.dickinson) *	Date: 2011-05-21 19:32
> keep naive implementation of builtin max()

Agreed.

> provide symmetric float.max such that nan.max(x) = x.max(nan) = x (nan
> result would be a valid but less useful alternative.)

That might be viable (a math module function might also make sense here), though it feels a bit YAGNI to me.

If we were going to add such a method, it should follow IEEE 754: nan.max(x) == x.max(nan) == x, but also nan.min(x) == x.min(nan) == x, for finite x. (See section 5.3.1.)
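The symmetric behavior described here could be sketched as two free functions. The names `max_num` and `min_num` are hypothetical, chosen to echo the IEEE 754-2008 operations, and are not part of any Python API. The sketch assumes plain float arguments and makes no promise about how signed zeros are ordered.

```python
import math


def max_num(x: float, y: float) -> float:
    """Return the larger operand, ignoring a single NaN operand."""
    if math.isnan(x):
        return y  # also covers the case where both operands are NaN
    if math.isnan(y):
        return x
    return x if x > y else y


def min_num(x: float, y: float) -> float:
    """Return the smaller operand, ignoring a single NaN operand."""
    if math.isnan(x):
        return y
    if math.isnan(y):
        return x
    return x if x < y else y


nan = float("nan")
symmetric_max = (max_num(nan, 5.0), max_num(5.0, nan))  # (5.0, 5.0)
symmetric_min = (min_num(nan, 5.0), min_num(5.0, nan))  # (5.0, 5.0)
```

NaN is returned only when both operands are NaN, so the result no longer depends on argument order.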
msg136470 - (view)	Author: Alexander Belopolsky (belopolsky) *	Date: 2011-05-21 19:44
On Sat, May 21, 2011 at 3:22 PM, Mark Dickinson <report@bugs.python.org> wrote:
>
> Mark Dickinson <dickinsm@gmail.com> added the comment:
>
>> Prof. Kahan states that nan < x must signal.
>
> Would that be the sentence that starts "In the syntax of C ..." ?

This is just sophistry. If Python was more popular than C at the time Prof. Kahan wrote this, he would write "in the syntax of Python."

(Not directly on-topic, but Python 3 seems to be moving towards C spelling of operators. I, for one, miss the removal of easy to type '<>' in favor of finger-twisting '!='.)
msg136473 - (view)	Author: Mark Dickinson (mark.dickinson) *	Date: 2011-05-21 19:54
> This is just sophistry. If Python was more popular than C at the
> time Prof. Kahan wrote this, he would write "in the syntax of Python."

I doubt it. C has a standard that explicitly states that < must signal on comparison with NaNs. Python doesn't.

Alexander, I've read both these documents (Kahan's lecture notes and IEEE 754-2008) many many times. I've looked hard in the past for language that would give this exact connection, about < signalling. It just isn't there in either document, and it's dishonest to claim it is.
msg136476 - (view)	Author: Mark Dickinson (mark.dickinson) *	Date: 2011-05-21 20:06
> and it's dishonest to claim it is.

This language was going too far, and I apologise for it. I think I need one of those 'wait 5 minutes before allowing you to post' controls.
msg136478 - (view)	Author: Alexander Belopolsky (belopolsky) *	Date: 2011-05-21 20:26
On Sat, May 21, 2011 at 3:32 PM, Mark Dickinson <report@bugs.python.org> wrote:
..
> That might be viable (a math module function might also make sense here), though it feels a bit YAGNI to me.

I have to admit that it would be YAGNI for most of my code because it uses numpy for numeric calculations. Still, for consistency with decimal, it may be a good addition.

Going a bit off-topic, I would like to mention the feature that may actually be quite useful: float.sorting_key() that will return an integer for each float in such a way that keys are ordered in IEEE 754 total ordering. Note that decimal has compare_total() that can be used for sorting, but a cmp-style method is less useful than a key since in py3k sort does not take cmp function anymore. Nice thing about IEEE 754 is that float.sorting_key() can be implemented very efficiently because one can simply use float's binary representation interpreted as an integer for the key.

> If we were going to add such a method, it should follow IEEE 754: nan.max(x) == x.max(nan) == x,
> but also nan.min(x) == x.min(nan) == x, for finite x. (See section 5.3.1.)

Agree. Unfortunately, numpy does not do it that way:

nan
>>> maximum(1.0, nan)
nan

I am not sure whether this is an argument for or against float.max/min: if numpy had properly defined maximum, I would just recommend to use that.
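The sorting-key idea can be sketched as a free function. `sorting_key` below is hypothetical (no such float method exists) and assumes IEEE 754 binary64 floats. One step goes beyond the message's description: because floats are stored as sign and magnitude, the raw bit pattern has to be adjusted for negative values before the integers sort in the intended order.

```python
import math
import struct

_ALL_BITS = (1 << 64) - 1
_SIGN_BIT = 1 << 63


def sorting_key(x: float) -> int:
    """Map a float to an int whose order follows the float's bit-level order."""
    # Reinterpret the 8-byte binary64 encoding as an unsigned integer.
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    if bits & _SIGN_BIT:
        # Negative values: flip every bit so larger magnitudes sort lower.
        return bits ^ _ALL_BITS
    # Non-negative values: set the top bit so they sort above all negatives.
    return bits | _SIGN_BIT


values = [1.5, math.inf, 0.0, -0.0, -math.inf, -2.0]
ordered = sorted(values, key=sorting_key)
# ordered is [-inf, -2.0, -0.0, 0.0, 1.5, inf]
```

With this key, -0.0 sorts before 0.0 and a NaN with a clear sign bit sorts after positive infinity, so sorting never depends on unordered comparisons.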
msg358837 - (view)	Author: Marco Sulla (Marco Sulla) *	Date: 2019-12-23 21:35
marco@buzz:~$ python3.9
Python 3.9.0a0 (heads/master-dirty:d8ca2354ed, Oct 30 2019, 20:25:01)
[GCC 9.2.1 20190909] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from decimal import Decimal as Dec, BasicContext as Bctx
>>> a = Dec("1981", Bctx)
>>> b = Dec("nan", Bctx)
>>> a.max(b)
Decimal('1981')
>>> b.max(a)
Decimal('1981')
>>> Bctx.max(a, b)
Decimal('1981')
>>> Bctx.max(b, a)
Decimal('1981')

`Decimal` completely adheres to IEEE 754 standard.

There's a very, very simple and generic solution for builtin min and max:

_sentinel = object()

def max(*args, key=None, default=_sentinel):
    args_len = len(args)

    if args_len == 0:
        if default is _sentinel:
            fname = max.__name__
            raise ValueError(f"{fname}() expected 1 argument, got 0")

        return default
    elif args_len == 1:
        seq = args[0]
    else:
        seq = args

    it = iter(seq)

    vmax = next(it, _sentinel)

    if vmax is _sentinel:
        if default is _sentinel:
            fname = max.__name__
            raise ValueError(f"{fname}() arg is an empty sequence")

        return default

    first_comparable = False

    if key is None:
        for val in it:
            if vmax < val:
                vmax = val
                first_comparable = True
            elif not first_comparable and not val < vmax:
                # equal, or not comparable object, like NaN
                vmax = val
    else:
        fmax = key(vmax)

        for val in it:
            fval = key(val)

            if fmax < fval:
                fmax = fval
                vmax = val
                first_comparable = True
            elif not first_comparable and not fval < fmax:
                fmax = fval
                vmax = val

    return vmax

This function continues to give undefined behavior with sets... but who calculates the "maximum" or "minimum" of sets?
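Tracing the proposed function by hand suggests it is still order-dependent: for the arguments (5, nan), `first_comparable` is still False when NaN arrives, so the `elif` branch replaces 5 with NaN, while (nan, 5) yields 5. A different technique, shown here only as a sketch, is to filter out unordered values before comparing. `nan_ignoring_max` is a hypothetical name, and the sketch assumes the items are floats, for which NaN is the only value not equal to itself.

```python
import math


def nan_ignoring_max(values):
    """Return the largest non-NaN item, or NaN if every item is NaN."""
    items = list(values)
    if not items:
        raise ValueError("nan_ignoring_max() arg is an empty sequence")
    # Keep only values that are equal to themselves, which drops NaN.
    ordered = [v for v in items if v == v]
    # Fall back to the first item (a NaN) when nothing comparable remains.
    return max(ordered) if ordered else items[0]


nan = float("nan")
results = (
    nan_ignoring_max([5.0, nan]),
    nan_ignoring_max([nan, 5.0]),
    nan_ignoring_max([5.0, nan, 3.0]),
)
# results is (5.0, 5.0, 5.0)
```

Because filtering happens before any comparison, the position of NaN in the input cannot change the result.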

History
Date	User	Action	Args
2022-04-11 14:57:16	admin	set	github: 56195
2019-12-23 21:35:06	Marco Sulla	set	nosy: + Marco Sulla messages: + msg358837
2011-05-21 20:26:53	belopolsky	set	messages: + msg136478
2011-05-21 20:06:35	mark.dickinson	set	keywords: + gsoc messages: + msg136476
2011-05-21 19:54:46	mark.dickinson	set	messages: + msg136473
2011-05-21 19:44:28	belopolsky	set	messages: + msg136470
2011-05-21 19:32:27	mark.dickinson	set	messages: + msg136468
2011-05-21 19:22:15	mark.dickinson	set	messages: + msg136467
2011-05-17 02:13:53	rhettinger	set	assignee: rhettinger
2011-05-16 18:59:48	belopolsky	set	messages: + msg136121
2011-05-03 19:21:13	rhettinger	set	messages: + msg135062
2011-05-03 18:46:41	belopolsky	set	messages: + msg135058
2011-05-03 18:41:02	rhettinger	set	status: open -> closed nosy: + rhettinger messages: + msg135057 resolution: not a bug
2011-05-03 18:34:21	pitrou	set	nosy: + pitrou messages: + msg135056
2011-05-03 18:31:28	belopolsky	create