
Author rhettinger
Recipients rhettinger, serhiy.storchaka
Date 2020-06-06.17:09:59
Message-id <1591463399.57.0.727941047538.issue40889@roundup.psfhosted.org>
Content
Running "d1.items() ^ d2.items()" will rehash every key and value in both dictionaries regardless of how much they overlap.

By taking advantage of the hashes already stored in the dictionaries, the analysis step could avoid making any calls to __hash__().  Only the result tuples would need to be hashed.

Currently, the code below calls __hash__() for every key and value on the left and for every key and value on the right:

  >>> left = {1: -1, 2: -2, 3: -3, 4: -4, 5: -5, 6: -6, 7: -7}
  >>> right = {1: -1, 2: -2, 3: -3, 4: -4, 5: -5, 8: -8, 9: -9}
  >>> left.items() ^ right.items()        # Total work: 28 __hash__() calls
  {(6, -6), (7, -7), (8, -8), (9, -9)}

Compare that with the workload for set symmetric difference, which makes zero calls to __hash__():

  >>> set(left) ^ set(right)
  {6, 7, 8, 9}
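
One way to observe the difference is to wrap the keys and values in an object that counts its own __hash__() calls. The sketch below assumes CPython; the Counted class and the count_hash_calls() helper are hypothetical names made up for this illustration, not part of the standard library.

```python
class Counted:
    """Hypothetical wrapper that counts how often its __hash__ is invoked."""

    calls = 0  # shared counter across all instances

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        Counted.calls += 1
        return hash(self.value)

    def __eq__(self, other):
        return isinstance(other, Counted) and self.value == other.value

    def __repr__(self):
        return f"Counted({self.value!r})"


def count_hash_calls(func):
    """Run func() and return (result, number of Counted.__hash__ calls it made)."""
    Counted.calls = 0
    result = func()
    return result, Counted.calls


# Same shape as the example above: five shared items, two unique per side.
left = {Counted(k): Counted(-k) for k in (1, 2, 3, 4, 5, 6, 7)}
right = {Counted(k): Counted(-k) for k in (1, 2, 3, 4, 5, 8, 9)}

# Symmetric difference of the items views.
items_result, items_calls = count_hash_calls(
    lambda: left.items() ^ right.items()
)

# Symmetric difference of sets built from the same dictionaries.
set_result, set_calls = count_hash_calls(lambda: set(left) ^ set(right))

print("items view __hash__ calls:", items_calls)
print("set __hash__ calls:", set_calls)
```

The number printed for the items view depends on the interpreter version, so it may not match the 28 calls reported above. At a minimum, the result tuples themselves have to be hashed, which in turn hashes the key and value they contain.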

FWIW, I do have an important use case where this matters.