Author: mark.dickinson
Recipients: congma, cwg, mark.dickinson, miss-islington, realead, rhettinger, serhiy.storchaka, tim.peters
Date: 2021-11-24.12:58:32
Message-id: <1637758713.08.0.254565968952.issue43475@roundup.psfhosted.org>
Content:
@cwg: Yep, we're aware of this. There are no good solutions here - only a mass of constraints, compromises and trade-offs. I think we're already somewhere on the Pareto boundary of the "best we can do" given the constraints. Moving to another point on the boundary doesn't seem worth the code churn.

What concrete action would you propose that the Python core devs take at this point?

> it was possible to convert a tuple of floats into a numpy array and back into a tuple, and the hash values of both tuples would be equal.  This is no longer the case.

Sure, but the problem isn't really with hash; that's just a detail. It lies deeper than that - it's with containment itself:

>>> import numpy as np
>>> import math
>>> x = math.nan
>>> some_list = [1.5, 2.3, x]
>>> x in some_list  # containment checks identity before equality, so this finds x despite nan != nan
True
>>> x in list(np.array(some_list))  # expect True, get False
False
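
The same round trip shows through in the hash, which is what the quoted complaint is about. Continuing the session above, a sketch assuming CPython 3.10 or later (where this change landed); the last result is "False with near certainty" rather than guaranteed, since NaN hashes are now derived from object identity:

>>> t = (1.5, 2.3, math.nan)
>>> t2 = tuple(np.array(t))   # round trip creates a fresh NaN object
>>> hash(t) == hash(t)        # same objects, same hash
True
>>> hash(t) == hash(t2)       # distinct NaN objects, distinct hashes
False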

The result of the change linked to this PR is that the hash now also reflects what was already true of containment: it depends on object identity, not just object value. Reverting the change would solve the superficial hash problem, but not the underlying containment problem, and it would re-introduce the performance issue that was fixed here.
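
Concretely, and again only as a sketch on CPython 3.10 or later: two distinct NaN objects already fail containment against each other, and the hash now fails in exactly the same way (the final inequality holds with near certainty, not by guarantee):

>>> x = float("nan")
>>> y = float("nan")
>>> x in [x], y in [x]                      # identity rescues the first check, not the second
(True, False)
>>> hash(x) == hash(x), hash(x) == hash(y)  # hash now mirrors the same identity rule
(True, False)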