Message 327423 - Python tracker



Author	tim.peters
Recipients	eric.smith, jdemeyer, mark.dickinson, rhettinger, sir-sigurd, tim.peters
Date	2018-10-09.16:06:31
Message-id	<1539101191.52.0.545547206417.issue34751@psf.upfronthosting.co.za>
In-reply-to

Content

>> Changes initialization to add in the length:

> What's the rationale for that change? You always asked
> me to stay as close as possible to the "official" hash function
> which adds in the length at the end. Is there an actual benefit
> from doing it in the beginning?

The heart of xxHash is the state-updating code, neither initialization nor finalization.  Likewise for SeaHash or any of the others.

Without the post-loop avalanche code, adding the length at the end has very little effect - it will typically change only the last two bits of the final result.  _With_ the avalanche code, the length can affect every bit in the result.  But adding it in at the start also achieves that - same as changing any bit in the initial accumulator value.
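To make the bit-diffusion claim concrete, here is a rough Python sketch. It is not the code from the patch: it assumes an xxHash64-style update step and finalizer, the constants are the xxHash64 primes as far as I know, and the names `update`, `avalanche`, `tuple_hash` and `mean_bits_changed` are made up for illustration. It measures, over random inputs, how many result bits differ on average between mixing the length in and leaving it out.

```python
import random

MASK = (1 << 64) - 1
# Assumed constants: the 64-bit xxHash primes (illustrative, not taken from the patch).
P1 = 11400714785074694791
P2 = 14029467366897019727
P3 = 1609587929392839161
P5 = 2870177450012600261


def rotl31(x):
    """Rotate a 64-bit value left by 31 bits."""
    return ((x << 31) | (x >> 33)) & MASK


def update(acc, lane):
    """One xxHash-style state update; `lane` stands in for an item's hash."""
    acc = (acc + lane * P2) & MASK
    acc = rotl31(acc)
    return (acc * P1) & MASK


def avalanche(acc):
    """xxHash64-style post-loop finalizer (assumed shape)."""
    acc ^= acc >> 33
    acc = (acc * P2) & MASK
    acc ^= acc >> 29
    acc = (acc * P3) & MASK
    acc ^= acc >> 32
    return acc


def tuple_hash(lanes, length_at, finalize):
    """Hypothetical tuple hash; length_at is "none", "start" or "end"."""
    n = len(lanes)
    acc = (P5 + n) & MASK if length_at == "start" else P5
    for lane in lanes:
        acc = update(acc, lane)
    if length_at == "end":
        acc = (acc + n) & MASK
    return avalanche(acc) if finalize else acc


def mean_bits_changed(length_at, finalize, trials=2000, n=3, seed=12345):
    """Average number of result bits that differ from the no-length variant."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        lanes = [rng.getrandbits(64) for _ in range(n)]
        with_len = tuple_hash(lanes, length_at, finalize)
        without = tuple_hash(lanes, "none", finalize)
        total += bin(with_len ^ without).count("1")
    return total / trials


if __name__ == "__main__":
    print("end, no avalanche:  ", mean_bits_changed("end", False))
    print("end, avalanche:     ", mean_bits_changed("end", True))
    print("start, no avalanche:", mean_bits_changed("start", False))
    print("start, avalanche:   ", mean_bits_changed("start", True))
```

Under these assumptions, adding the length at the end without the finalizer should disturb only a few low-order bits on average, while the other three variants should flip roughly half of the 64 bits.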

Adding it in at the start instead also takes the addition off the critical path, which may or may not save a cycle or two (depending on processor and compiler), but can't hurt speed.

I noted before that adding the length at the end _can_ break out of a zero fixed-point (accumulator becomes 0 and all remaining values hash to 0).  Adding it at the start loses that.

So there is a theoretical danger there ... OK, trying it both ways I don't see any significant differences in my test results or a timing difference outside the noise range.  So I'm happy either way.
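The zero fixed point mentioned above can be sketched in Python as follows. This is only an illustration under assumptions: the update step is an xxHash64-style one with what I believe are the xxHash64 primes, no avalanche code is applied, and `lane_that_zeroes`, `hash_len_at_end` and `hash_len_at_start` are hypothetical helpers, not functions from the patch. It shows that once the accumulator is 0, zero-valued lanes keep it at 0, and that only a post-loop addition moves the result off that point. It says nothing about how likely this is in practice.

```python
MASK = (1 << 64) - 1
# Assumed constants: 64-bit xxHash primes (illustrative, not taken from the patch).
P1 = 11400714785074694791
P2 = 14029467366897019727
P5 = 2870177450012600261


def rotl31(x):
    """Rotate a 64-bit value left by 31 bits."""
    return ((x << 31) | (x >> 33)) & MASK


def update(acc, lane):
    """One xxHash-style state update; `lane` stands in for an item's hash."""
    acc = (acc + lane * P2) & MASK
    acc = rotl31(acc)
    return (acc * P1) & MASK


def lane_that_zeroes(acc):
    """Solve acc + lane * P2 == 0 (mod 2**64).

    P2 is odd, so it has an inverse mod 2**64.
    pow(x, -1, m) needs Python 3.8 or later.
    """
    return (-acc * pow(P2, -1, 1 << 64)) & MASK


def hash_len_at_end(lanes):
    """Length added after the loop (no avalanche, to keep the effect visible)."""
    acc = P5
    for lane in lanes:
        acc = update(acc, lane)
    return (acc + len(lanes)) & MASK


def hash_len_at_start(lanes):
    """Length folded into the initial accumulator instead."""
    acc = (P5 + len(lanes)) & MASK
    for lane in lanes:
        acc = update(acc, lane)
    return acc


if __name__ == "__main__":
    # 0 is a fixed point of the update step for a zero lane.
    print(update(0, 0))

    # Length at the end: the loop leaves acc at 0, the final addition escapes it.
    z = lane_that_zeroes(P5)
    print([hash_len_at_end([z] + [0] * k) for k in range(4)])

    # Length at the start: nothing is added after the loop, so the result stays 0.
    n = 3
    z_n = lane_that_zeroes((P5 + n) & MASK)
    print(hash_len_at_start([z_n] + [0] * (n - 1)))
```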

History
Date	User	Action	Args
2018-10-09 16:06:31	tim.peters	set	recipients: + tim.peters, rhettinger, mark.dickinson, eric.smith, jdemeyer, sir-sigurd
2018-10-09 16:06:31	tim.peters	set	messageid: <1539101191.52.0.545547206417.issue34751@psf.upfronthosting.co.za>
2018-10-09 16:06:31	tim.peters	link	issue34751 messages
2018-10-09 16:06:31	tim.peters	create