Author hongqn
Recipients hongqn
Date 2013-10-11.04:03:40
Message-id <1381464221.0.0.181699047962.issue19224@psf.upfronthosting.co.za>
In-reply-to
Content
The hashes of integers, strings, and bools are all consistent across processes of the same interpreter. However, hash(None) differs from one process to the next.

$ python -c "print(hash(None))"
272931276
$ python -c "print(hash(None))"
277161420

This is weird, and it makes things difficult for distributed systems that partition data according to the hash of keys, if the system wants the keys to support None.

This patch makes hash(None) always return 0 to resolve that problem. It is used in DPark (a Python clone of Spark, a MapReduce-like framework, https://github.com/douban/dpark) to speed up its portable hash (see https://github.com/douban/dpark/blob/65a3ba857f11285667c61e2e134dacda44c13a2c/dpark/util.py#L47).
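To illustrate the use case, here is a minimal sketch of partitioning keys by a process-independent hash. It is not DPark's actual code: the names portable_hash and partition_for are hypothetical, and using zlib.crc32 for strings is an assumption made only to keep the example deterministic. The None branch mirrors what the patch proposes for hash(None).

```python
import zlib


def portable_hash(key):
    """Hypothetical helper: return a hash that is the same in every process."""
    if key is None:
        # Mirrors the proposal in this issue: None always hashes to 0.
        return 0
    if isinstance(key, int):
        # Covers bool as well, since bool is a subclass of int.
        return int(key)
    if isinstance(key, str):
        # Assumption: CRC32 of the UTF-8 bytes, chosen because it does not
        # depend on the interpreter's per-process hash state.
        return zlib.crc32(key.encode("utf-8"))
    if isinstance(key, tuple):
        # Combine the element hashes so tuples containing None also work.
        h = 0x345678
        for item in key:
            h = ((h * 1000003) ^ portable_hash(item)) & 0xFFFFFFFF
        return h
    raise TypeError("unsupported key type: %r" % type(key))


def partition_for(key, num_partitions):
    """Hypothetical helper: map a key to a partition index in [0, num_partitions)."""
    return portable_hash(key) % num_partitions


print(partition_for(None, 8))  # 0
```

With a hash like this, every worker process assigns a None key (or a tuple containing None) to the same partition, which is what the built-in hash(None) cannot guarantee without the patch.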

davies.liu@gmail.com is the original author of this patch.  All credit goes to him.
History
Date                 User    Action  Args
2013-10-11 04:03:41  hongqn  set     recipients: + hongqn
2013-10-11 04:03:41  hongqn  set     messageid: <1381464221.0.0.181699047962.issue19224@psf.upfronthosting.co.za>
2013-10-11 04:03:40  hongqn  link    issue19224 messages
2013-10-11 04:03:40  hongqn  create