Message 370839 - Python tracker


Author	rhettinger
Recipients	rhettinger, serhiy.storchaka
Date	2020-06-06.17:09:59
Message-id	<1591463399.57.0.727941047538.issue40889@roundup.psfhosted.org>

Content
Running "d1.items() ^ d2.items()" will rehash every key and value in both dictionaries regardless of how much they overlap.

By taking advantage of the hashes already stored in the dictionaries, the analysis step could avoid making any calls to __hash__().  Only the result tuples would need to be hashed.
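To make the shape of that analysis step concrete, here is a rough pure-Python sketch. It is only an illustration under stated assumptions: the function name items_symmetric_difference is hypothetical, and pure Python cannot reach the hashes stored inside a dict, so the key lookups below still hash keys. What it does show is that values are only compared, never hashed, and that tuples are built and hashed only for pairs that end up in the result.

```python
def items_symmetric_difference(d1, d2):
    """Hypothetical sketch of d1.items() ^ d2.items().

    Pairs are found by key lookup plus value comparison; a (key, value)
    tuple is created and hashed only when it belongs in the result.
    """
    missing = object()  # sentinel distinguishing "absent" from a None value
    result = set()
    for this, that in ((d1, d2), (d2, d1)):
        for key, value in this.items():
            other = that.get(key, missing)
            # Mirror tuple comparison: identical objects count as equal.
            if other is missing or not (other is value or other == value):
                result.add((key, value))
    return result


left = {1: -1, 2: -2, 3: -3, 4: -4, 5: -5, 6: -6, 7: -7}
right = {1: -1, 2: -2, 3: -3, 4: -4, 5: -5, 8: -8, 9: -9}
print(sorted(items_symmetric_difference(left, right)))
```

A C implementation could go further than this sketch by reusing each entry's stored hash for the lookups as well.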

Currently the code below calls hash for every key and value on the left and for every key and value on the right:

  >>> left = {1: -1, 2: -2, 3: -3, 4: -4, 5: -5, 6: -6, 7: -7}
  >>> right = {1: -1, 2: -2, 3: -3, 4: -4, 5: -5, 8: -8, 9: -9}
  >>> left.items() ^ right.items()        # Total work: 28 __hash__() calls
  {(6, -6), (7, -7), (8, -8), (9, -9)}

Compare that with the workload for set symmetric difference, which makes zero calls to __hash__() because the sets reuse the hashes already stored by the dictionaries:

  >>> set(left) ^ set(right)
  {6, 7, 8, 9}
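One way to check these counts on a given interpreter is to wrap the keys and values in a class that counts its own __hash__() calls. This is a minimal sketch: the Counted class and the helper functions are hypothetical names introduced for the measurement, not part of any library. The numbers printed depend on the Python version, so none are asserted here; the figures quoted above (28 and zero) describe the behavior at the time of this report.

```python
class Counted:
    """Hypothetical helper: wraps an int and counts __hash__() calls."""

    calls = 0  # class-wide counter, reset before each measurement

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        Counted.calls += 1
        return hash(self.value)

    def __eq__(self, other):
        return isinstance(other, Counted) and self.value == other.value

    def __repr__(self):
        return f"Counted({self.value})"


def make(mapping):
    """Build a dict whose keys and values are both Counted wrappers."""
    return {Counted(k): Counted(v) for k, v in mapping.items()}


def count_hash_calls(func):
    """Run func() and return (result, number of __hash__() calls it made)."""
    Counted.calls = 0
    result = func()
    return result, Counted.calls


left = make({1: -1, 2: -2, 3: -3, 4: -4, 5: -5, 6: -6, 7: -7})
right = make({1: -1, 2: -2, 3: -3, 4: -4, 5: -5, 8: -8, 9: -9})

items_result, items_calls = count_hash_calls(lambda: left.items() ^ right.items())
set_result, set_calls = count_hash_calls(lambda: set(left) ^ set(right))

print("items() ^ items():", items_calls, "__hash__() calls")
print("set() ^ set():    ", set_calls, "__hash__() calls")
```

The dictionaries are built before the counter is reset, so the hashing done at construction time is excluded from both measurements.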

FWIW, I do have an important use case where this matters.
