classification
Title: repr of -nan value should contain the sign
Type: enhancement
Stage: resolved
Components: Interpreter Core
Versions: Python 3.6
process
Status: closed
Resolution: rejected
Dependencies:
Superseder:
Assigned To:
Nosy List: ahrvoje, eric.smith, mark.dickinson
Priority: normal
Keywords:

Created on 2016-04-16 18:44 by ahrvoje, last changed 2016-04-19 06:20 by mark.dickinson. This issue is now closed.

Messages (14)
msg263576 - Author: Hrvoje Abraham (ahrvoje) Date: 2016-04-16 18:44
The repr of a -nan value should contain the sign so that the round trip can be assured. The sign (bit) of a NaN value could be seen as irrelevant or even uninterpretable information, but it is actually used in real-life situations, a fact substantiated by section 6.3 of the IEEE 754-2008 standard.

>>> from math import copysign
>>> x = float("-nan")
>>> copysign(1.0, x)
-1.0

This is correct. Also proves the value contains the sign information.

>>> repr(x)
'nan'

Not correct. Should be '-nan'.
msg263583 - Author: Hrvoje Abraham (ahrvoje) Date: 2016-04-16 19:45
The reported issue was observed in 64-bit Python 3.5.1 (v3.5.1:37a07cee5969, Dec  6 2015, 01:54:25) [MSC v.1900 64 bit (AMD64)] on win32.

I have now noticed that in Python 2.7 even the copysign part does not work as expected.

Python 2.7.11 (v2.7.11:6d1b6a68f775, Dec  5 2015, 20:40:30) [MSC v.1500 64 bit (AMD64)] on win32:

>>> from math import copysign
>>> x = float("-nan")
>>> copysign(1.0, x)
1.0

Not correct.
msg263593 - Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2016-04-16 23:55
Changing versions.

I left in 2.7, but I doubt we'd make any changes to 2.7 with regard to this.
msg263628 - Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2016-04-17 21:03
The current behaviour is deliberate, so any change would be an enhancement rather than a bugfix. I'm modifying the versions accordingly.

Unlike the sign of a zero, the sign of a NaN has no useful meaning: IEEE 754 explicitly says "this standard does not interpret the sign of a NaN". Yes, that sign is copied by copysign, but I don't think that in itself means that the sign should be included in the repr, and I'm not aware of any applications where the sign matters in that context.

A NaN also has 51 payload bits (or 52 if you're not distinguishing between quiet and signalling NaNs), but like the sign, those bits are rarely important in applications.
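
For illustration, the bit layout described here can be inspected from Python. This is a minimal sketch, assuming the platform represents `float` as IEEE 754 binary64 (true of mainstream CPython builds); `nan_fields` is a hypothetical helper name, not a standard-library function.

```python
import math
import struct


def nan_fields(x):
    """Split a binary64 NaN into (sign, quiet_bit, payload).

    Hypothetical helper for illustration; assumes IEEE 754 binary64.
    """
    if not math.isnan(x):
        raise ValueError("expected a NaN")
    # Reinterpret the 8 bytes of the double as an unsigned 64-bit integer.
    (bits,) = struct.unpack("<Q", struct.pack("<d", x))
    sign = bits >> 63                  # 1 sign bit
    quiet = (bits >> 51) & 1           # top significand bit: quiet vs signalling
    payload = bits & ((1 << 51) - 1)   # remaining 51 payload bits
    return sign, quiet, payload


# Build a negative quiet NaN directly from its bit pattern.
(neg_nan,) = struct.unpack("<d", struct.pack("<Q", 0xFFF8000000000000))
print(nan_fields(neg_nan))
```

Building the NaN from an explicit bit pattern avoids relying on how any particular platform parses the string "-nan".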

I'm not really seeing a case for representing either the sign or the payload bits in the repr. Do you know of any applications that make use of the sign of a NaN?
msg263629 - Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2016-04-17 21:06
> it is actually used in real-life situations

Do you have any examples available?
msg263631 - Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2016-04-17 21:25
About the Python 2.7 behaviour:

>>> from math import copysign
>>> x = float("-nan")
>>> copysign(1.0, x)
1.0

I'd be interested to know what `struct.pack('<d', x)` shows in this case. I'd expect it to be '\x00\x00\x00\x00\x00\x00\xf8\xff', meaning that the `float` conversion has produced a value with its sign bit set, as expected, but `copysign` has failed to transfer that sign bit. That failure is somewhat expected: older versions of MSVC don't provide copysign, so it has to be emulated, and the emulation doesn't take the sign of NaNs into account. (Getting the sign of a NaN is awkward to do without a native copysign function.) It works as expected on OS X and Linux.

So that's a separate issue: copysign on Windows / Python 2.7 doesn't correctly handle the sign bit of a NaN. I agree that that's less than ideal, but I'm not sure whether it's worth fixing for Python 2.7.
msg263632 - Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2016-04-17 21:31
Gah, sorry. I misdiagnosed the Python 2.7 issue (I was looking at the code for the wrong branch). See issue 22590 for the correct diagnosis.
msg263633 - Author: Hrvoje Abraham (ahrvoje) Date: 2016-04-17 21:31
Python 2.7.11 (v2.7.11:6d1b6a68f775, Dec  5 2015, 20:40:30) [MSC v.1500 64 bit (AMD64)] on win32:

>>> import struct
>>> x=float("-nan")
>>> struct.pack('<d', x)
'\x00\x00\x00\x00\x00\x00\xf8\x7f'

Looks like the sign bit is off (0) in 2.7.
msg263635 - Author: Hrvoje Abraham (ahrvoje) Date: 2016-04-17 22:27
Regarding the NaN sign bit, IEEE 754 states:

"Note, however, that operations on bit strings—copy, negate, abs, copySign—specify the sign bit of a NaN result, sometimes based upon the sign bit of a NaN operand. The logical predicate totalOrder is also affected by the sign bit of a NaN operand."

So NaN sign bit information is used in the standard itself (section 5.10.d):

1) totalOrder(−NaN, y) is true where −NaN represents a NaN with negative sign bit and y is a
   floating-point number.
2) totalOrder(x, +NaN) is true where +NaN represents a NaN with positive sign bit and x is a
   floating-point number.

This fact alone implies the importance of its round-trip safety. I believe the quote you picked states that this information doesn't have a universal (standardized) meaning, not that it is unimportant or unused. It is reasonable to assume some will need to preserve it.
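
As a rough illustration of how totalOrder places signed NaNs, here is a minimal sketch for binary64 values. It uses a common bit-reinterpretation trick rather than anything Python provides; the names `total_order_key` and `total_order` are hypothetical, and the code assumes `float` is IEEE 754 binary64.

```python
import struct


def total_order_key(x):
    """Map a binary64 float to an integer that sorts in totalOrder order."""
    # Reinterpret the double as a signed 64-bit integer.
    (i,) = struct.unpack("<q", struct.pack("<d", x))
    if i < 0:
        # Sign bit set: flip the other 63 bits so larger magnitudes sort lower.
        i ^= 0x7FFFFFFFFFFFFFFF
    return i


def total_order(x, y):
    """Sketch of the totalOrder(x, y) predicate for binary64 values."""
    return total_order_key(x) <= total_order_key(y)


# NaNs built from explicit bit patterns, so the sign bit is known.
(neg_nan,) = struct.unpack("<d", struct.pack("<Q", 0xFFF8000000000000))
(pos_nan,) = struct.unpack("<d", struct.pack("<Q", 0x7FF8000000000000))

values = [pos_nan, 1.0, float("-inf"), 0.0, neg_nan, -0.0, float("inf")]
ordered = sorted(values, key=total_order_key)
print([struct.pack(">d", v).hex() for v in ordered])
```

Under this ordering a NaN with the sign bit set sorts below -inf and a NaN with the sign bit clear sorts above +inf, matching items 1) and 2) quoted above.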

There are also some real-life usages, similar to those of signed zero, in determining the correct complex-plane branch cut:
http://stackoverflow.com/questions/8781072/sign-check-for-nan-value
http://docstore.mik.ua/manuals/hp-ux/en/B2355-60130/catan.3M.html
catan(inf + iNAN) => π/2 + i0; catan(inf - iNAN) => π/2 - i0;

MSVC 2015 and MinGW-W64 v4.9.2 output (source below) for '-nan' values are:

MSVC:      -nan(ind)
MinGW-W64: -1.#IND00


#include <stdio.h>

int main()
{
    double x = 0.0;
    x = - x / x;
    printf("%lf\n", x);

    return 0;
}
msg263637 - Author: Hrvoje Abraham (ahrvoje) Date: 2016-04-17 23:10
Sage:
http://doc.sagemath.org/html/en/reference/rings_numerical/sage/rings/complex_number.html

>>> log(ComplexNumber(NaN,1))
NaN - NaN*I
msg263677 - Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2016-04-18 18:33
> Looks like the sign bit is off (0) in 2.7.

Yep, that looks like issue 22590. It's "fixed" in Python 3, and I don't think it's worth changing in Python 2.

About this issue (sign in repr of NaN): sorry, but I'm unconvinced. :-)

From the standpoint of IEEE 754, the only part that's really relevant is section 5.12 ("Details of conversion between floating-point data and external character sequences"), and there's nothing in that section indicating that a sign is either recommended or required. The most relevant part is this one:

"""
Conversion of a quiet NaN in a supported format to an external character sequence shall produce a language-defined one of “nan” or a sequence that is equivalent except for case (e.g., “NaN”), with an optional preceding sign. (This standard does not interpret the sign of a NaN.)
"""

Yes, it's true that copysign copies the sign bit, but I don't see the leap from there to saying that the *repr* of a NaN should include it. total_order doesn't really seem relevant here, since Python doesn't implement it, and if it were implemented that implementation would almost certainly not be via the repr.

To your catan example: Python follows Annex G of the C99 specification for its cmath functions. catan is specified in terms of catanh, and the relevant line there is:
"""
catanh(NaN + i∞) returns ±0 + iπ /2 (where the sign of the real part of the result is unspecified).
"""
Note that the sign of the zero is unspecified here. There's no part of Annex G that introduces any dependence on the sign of a NaN. So no, unlike the sign of a zero, the sign of a NaN does *not* play a role in selecting which side of a branch cut an input to a complex function lies on. (The branch cuts for cmath.atan are both on the imaginary axis, with real part 0; while it's not really clear what kind of topology applies to a C99-style extended complex plane, it would be a stretch to regard inf +/- iNaN as being anywhere on or near that branch cut.)
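
For contrast, the role that the sign of a zero plays on a branch cut can be seen directly. This is a small sketch, assuming a CPython build whose cmath follows the C99 Annex G conventions described above; the variable names are illustrative only.

```python
import cmath

# Two points on the upper branch cut of atan (imaginary axis above 1j),
# differing only in the sign of the zero real part.
right = cmath.atan(complex(0.0, 2.0))    # approached from the right
left = cmath.atan(complex(-0.0, 2.0))    # approached from the left

# The real parts come out with opposite signs; the imaginary parts agree.
print(right.real, left.real)
print(right.imag == left.imag)
```

The sign of the zero selects the side of the cut here; nothing comparable is specified for the sign of a NaN.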

The roundtrip argument doesn't really hold water either: a quiet NaN has 52 extra bits of information - a sign bit and a 51-bit payload; if we were attempting to roundtrip the bits, we'd need to include the payload information too. I don't see that it's any more important to distinguish NaNs with different signs than to distinguish NaNs with different payloads.
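
To make the round-trip point concrete, here is a minimal sketch, assuming IEEE 754 binary64 floats; `to_bits` and `from_bits` are hypothetical helper names. It builds three quiet NaNs that differ in sign or payload and shows that repr gives the same text for all of them.

```python
import struct


def to_bits(x):
    """Return the 64-bit pattern of a float as an unsigned integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def from_bits(b):
    """Build a float from an unsigned 64-bit pattern."""
    return struct.unpack("<d", struct.pack("<Q", b))[0]


patterns = [
    0x7FF8000000000000,  # quiet NaN, sign clear, zero payload
    0xFFF8000000000000,  # quiet NaN, sign set, zero payload
    0x7FF8000000000123,  # quiet NaN, sign clear, non-zero payload
]
nans = [from_bits(p) for p in patterns]

reprs = {repr(x) for x in nans}           # text forms produced by repr
distinct = {to_bits(x) for x in nans}     # underlying bit patterns

print(reprs, len(distinct))
```

Printing a sign would separate the first two values but still not the first and third, so the repr would remain lossy either way.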

> MSVC 2015 and MinGW-W64 v4.9.2 output (source below) for '-nan' values are:
> MSVC:      -nan(ind)
> MinGW-W64: -1.#IND00

Sure, and clang on OS X produces "nan" for the same source.

> It is reasonable to assume some will need to preserve it.

Maybe, but I've read and worked with a lot of numerical code over the last two and a half decades, and I have yet to see any whose correctness depends on interpreting the sign of a NaN. Access to the sign bit of a NaN is a rare need, and not something that needs to be included in the repr. For those who really do need it for some kind of custom use-case, it's not hard to use copysign to extract it.
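
A minimal sketch of such a custom use-case follows. The function `signed_nan_repr` is a hypothetical helper written for illustration, not an existing API; it relies only on `math.isnan` and `math.copysign`.

```python
import math


def signed_nan_repr(x):
    """Hypothetical helper: like repr(), but show the sign of a NaN."""
    # copysign transfers the sign bit of x onto 1.0, even when x is a NaN.
    if math.isnan(x) and math.copysign(1.0, x) < 0:
        return "-nan"
    return repr(x)


# Construct NaNs with a known sign bit via copysign.
neg_nan = math.copysign(float("nan"), -1.0)
pos_nan = math.copysign(float("nan"), 1.0)

print(signed_nan_repr(neg_nan), signed_nan_repr(pos_nan), signed_nan_repr(1.5))
```

Code that needs the sign can opt in this way, while the built-in repr and existing doctests stay unchanged.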

In sum, I don't see any benefit to adding the sign, and there's an obvious drawback in the form of code breakage (in doctests, for example).
msg263686 - Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2016-04-18 19:22
This StackOverflow answer from one of the IEEE 754 committee members is highly relevant here:

http://stackoverflow.com/a/21350299/270986
msg263688 - Author: Hrvoje Abraham (ahrvoje) Date: 2016-04-18 21:08
IEEE & C/C++ standards allow and explicitly mention it, some people and projects are using it, many compilers preserve it...

I believe it's reasonable to support it despite the fact it does not have standardized semantic meaning. Maybe one day...
msg263711 - Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2016-04-19 06:20
The sign of a NaN *is* fully supported, and easily accessible to those who need it. It just isn't part of the repr. Given that the sign is meaningless in the vast majority of applications, I think Python does the right thing here by leaving it out of the repr, rather than cluttering up the repr with meaningless and potentially confusing information.

I'll leave the last word to Stephen Canon, from the SO answer I linked to above:

"it is almost always a bug to attach meaning to the "sign bit" of a NaN datum."
History
Date                 User              Action  Args
2016-04-19 06:20:12  mark.dickinson    set     status: open -> closed; resolution: rejected; messages: + msg263711; stage: resolved
2016-04-18 21:08:27  ahrvoje           set     messages: + msg263688
2016-04-18 19:22:43  mark.dickinson    set     messages: + msg263686
2016-04-18 18:35:08  mark.dickinson    set     type: behavior -> enhancement
2016-04-18 18:33:10  mark.dickinson    set     messages: + msg263677
2016-04-17 23:10:30  ahrvoje           set     messages: + msg263637
2016-04-17 22:27:49  ahrvoje           set     messages: + msg263635
2016-04-17 21:31:26  ahrvoje           set     messages: + msg263633
2016-04-17 21:31:09  mark.dickinson    set     messages: + msg263632
2016-04-17 21:25:49  mark.dickinson    set     messages: + msg263631
2016-04-17 21:06:04  mark.dickinson    set     messages: + msg263629
2016-04-17 21:03:04  mark.dickinson    set     messages: + msg263628; versions: - Python 2.7, Python 3.4, Python 3.5
2016-04-16 23:55:29  eric.smith        set     messages: + msg263593; versions: - Python 3.2, Python 3.3
2016-04-16 23:54:11  eric.smith        set     nosy: + eric.smith
2016-04-16 19:45:06  ahrvoje           set     messages: + msg263583
2016-04-16 18:50:50  serhiy.storchaka  set     nosy: + mark.dickinson
2016-04-16 18:44:53  ahrvoje           create