Message 263677 - Python tracker



Author	mark.dickinson
Recipients	ahrvoje, eric.smith, mark.dickinson
Date	2016-04-18.18:33:10
Message-id	<1461004390.47.0.729784211263.issue26785@psf.upfronthosting.co.za>
In-reply-to

Content

> Looks like the sign bit is off (0) in 2.7.

Yep, that looks like issue 22590. It's "fixed" in Python 3, and I don't think it's worth changing in Python 2.

About this issue (sign in repr of NaN): sorry, but I'm unconvinced. :-)

From the standpoint of IEEE 754, the only part that's really relevant is section 5.12 ("Details of conversion between floating-point data and external character sequences"), and there's nothing in that section indicating that a sign is either recommended or required. The most relevant part is this one:

"""
Conversion of a quiet NaN in a supported format to an external character sequence shall produce a language-defined one of “nan” or a sequence that is equivalent except for case (e.g., “NaN”), with an optional preceding sign. (This standard does not interpret the sign of a NaN.)
"""

Yes, it's true that copysign copies the sign bit, but I don't see the leap from there to saying that the *repr* of a NaN should include it. total_order doesn't really seem relevant here, since Python doesn't implement it, and if it were implemented that implementation would almost certainly not be via the repr.

To your catan example: Python follows Annex G of the C99 specification for its cmath functions. catan is specified in terms of catanh, and the relevant line there is:
"""
catanh(NaN + i∞) returns ±0 + iπ/2 (where the sign of the real part of the result is unspecified).
"""
Note that the sign of the zero is unspecified here. There's no part of Annex G that introduces any dependence on the sign of a NaN. So no, unlike the sign of a zero, the sign of a NaN does *not* play a role in selecting which side of a branch cut an input to a complex function lies on. (The branch cuts for cmath.atan are both on the imaginary axis, with real part 0; while it's not really clear what kind of topology applies to a C99-style extended complex plane, it would be a stretch to regard inf +/- iNaN as being anywhere on or near that branch cut.)
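
One way to check this empirically is to call cmath.atan on the same input with both NaN signs and compare the results. The following is only a sketch: the helper name `nan_with_sign` is made up for illustration, and it assumes IEEE 754 binary64 floats on a platform where `struct` preserves the NaN bit pattern.

```python
import cmath
import math
import struct


def nan_with_sign(negative):
    """Build a quiet NaN with an explicit sign bit (hypothetical helper)."""
    bits = 0x7FF8000000000000
    if negative:
        bits |= 0x8000000000000000
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


pos_nan = nan_with_sign(False)
neg_nan = nan_with_sign(True)

# The two inputs differ only in the sign bit of the NaN imaginary part.
r_pos = cmath.atan(complex(math.inf, pos_nan))
r_neg = cmath.atan(complex(math.inf, neg_nan))

print(r_pos, r_neg)
```

If the sign of the NaN mattered for branch selection, the two results would differ; on a CPython build that follows Annex G they should be the same.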

The roundtrip argument doesn't really hold water either: a quiet NaN has 52 extra bits of information - a sign bit and a 51-bit payload; if we were attempting to roundtrip the bits, we'd need to include the payload information too. I don't see that it's any more important to distinguish NaNs with different signs than to distinguish NaNs with different payloads.
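
To illustrate the payload point, here is a small sketch that builds a quiet NaN with a non-zero payload and pushes it through repr and float. The helper names `float_to_bits` and `bits_to_float` and the payload value 0x123 are arbitrary choices for the example, and it assumes IEEE 754 binary64 floats whose bit patterns survive a trip through `struct`.

```python
import struct


def float_to_bits(x):
    """Return the 64-bit pattern of a float as an int (hypothetical helper)."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def bits_to_float(bits):
    """Build a float from a 64-bit pattern (hypothetical helper)."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


# A quiet NaN carrying an arbitrary payload (0x123) in its low fraction bits.
original_bits = 0x7FF8000000000123
x = bits_to_float(original_bits)

# repr(x) carries neither sign nor payload, so the payload cannot survive.
roundtripped = float(repr(x))
roundtripped_bits = float_to_bits(roundtripped)

print(repr(x))
print(hex(original_bits), hex(roundtripped_bits))
```

Adding a sign to the repr would leave the payload just as lost as it is here, so the repr still would not roundtrip the bits.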

> MSVC 2015 and MinGW-W64 v4.9.2 output (source below) for '-nan' values are:
> MSVC:      -nan(ind)
> MinGW-W64: -1.#IND00

Sure, and clang on OS X produces "nan" for the same source.

> It is reasonable to assume some will need to preserve it.

Maybe, but I've read and worked with a lot of numerical code over the last two and a half decades, and I have yet to see any whose correctness depends on interpreting the sign of a NaN. Access to the sign bit of a NaN is a rare need, and not something that needs to be included in the repr. For those who really do need it for some kind of custom use-case, it's not hard to use copysign to extract it.
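
For anyone who does need the sign, a minimal sketch of the copysign approach might look like this. The helper name `nan_sign_bit` is hypothetical, and the example assumes a platform where unary minus flips the sign bit of a NaN, as it does on typical IEEE 754 hardware.

```python
import math


def nan_sign_bit(x):
    """Return True if x is a NaN whose sign bit is set (hypothetical helper)."""
    if not math.isnan(x):
        raise ValueError("expected a NaN")
    # copysign transfers the sign bit of x onto 1.0, even when x is a NaN.
    return math.copysign(1.0, x) < 0


nan = float("nan")
flipped = -nan  # assumed to flip the sign bit on this platform

# The reprs are identical, but copysign still tells the two NaNs apart.
print(repr(nan), repr(flipped))
print(nan_sign_bit(nan), nan_sign_bit(flipped))
```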

In sum, I don't see any benefit to adding the sign, and there's an obvious drawback in the form of code breakage (in doctests, for example).

History
Date	User	Action	Args
2016-04-18 18:33:10	mark.dickinson	set	recipients: + mark.dickinson, eric.smith, ahrvoje
2016-04-18 18:33:10	mark.dickinson	set	messageid: <1461004390.47.0.729784211263.issue26785@psf.upfronthosting.co.za>
2016-04-18 18:33:10	mark.dickinson	link	issue26785 messages
2016-04-18 18:33:10	mark.dickinson	create