Author rhettinger
Recipients congma, mark.dickinson, miss-islington, realead, rhettinger, serhiy.storchaka, tim.peters
Date 2021-06-14.04:40:25
Message-id <1623645626.06.0.202953890235.issue43475@roundup.psfhosted.org>
In-reply-to
Content
> If one wants to have all NaNs in one equivalency class
> (e.g. if used as a key-value for example in pandas) it
> is almost impossible to do so in a consistent way 
> without taking a performance hit.

ISTM the performance of the equivalence-class case is far less important than that of the case we were trying to solve.  Given a choice, we should prefer helping normal unadorned instances rather than giving preference to a subclass that redefines the usual behaviors.

In CPython, it is a fact of life that overriding builtin behaviors with pure Python code always incurs a performance hit.  Also, in your example, the subclass isn't technically correct because it relies on non-guaranteed implementation details.  It likely isn't even the fastest approach.

The only guaranteed behaviors are that math.isnan(x) reliably detects a NaN and that x!=x when x is a NaN.  Those are the only assured tools in the uphill battle to fight the weird intrinsic nature of NaNs.
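
As a minimal sketch of those two guarantees (the variable names here are illustrative only and not part of the original discussion):

```python
import math

nan = float("nan")
ordinary = 1.5

# Guaranteed check 1: math.isnan() reports True only for NaN values.
detected_by_isnan = math.isnan(nan)

# Guaranteed check 2: a NaN never compares equal to itself.
detected_by_self_compare = nan != nan

# Ordinary floats fail both checks.
ordinary_is_nan = math.isnan(ordinary) or (ordinary != ordinary)
```

Nothing beyond these two checks is assumed here, such as the hash value of a NaN or how containers treat identical objects.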

So one possible solution is to replace all the NaNs with a canonical placeholder value that doesn't have undesired properties:

    from math import isnan
    {None if isnan(x) else x for x in arr}

That relies on guaranteed behaviors and is reasonably fast.  IMO that beats trying to reprogram float('NaN') to behave the opposite of how it was designed.
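
A slightly fuller sketch of the placeholder idea, assuming the input is an iterable of floats; the helper name `canonicalize` and the sample data are hypothetical and only for illustration:

```python
from math import isnan

def canonicalize(values, placeholder=None):
    """Return a set in which every NaN is replaced by `placeholder`."""
    return {placeholder if isnan(x) else x for x in values}

# Three separately created NaN objects plus two ordinary values.
arr = [1.0, float("nan"), 2.0, float("nan"), float("inf") - float("inf")]

# All NaNs collapse into the single placeholder entry.
unique = canonicalize(arr)
```

The choice of None as the placeholder follows the one-liner above; any hashable sentinel that cannot collide with real data would presumably work as well.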