Author rhettinger
Recipients Eric Appelt, berker.peksag, christian.heimes, martin.panter, python-dev, rhettinger, serhiy.storchaka, tim.peters, vstinner
Date 2018-01-15.18:44:53
Message-id <1516041893.07.0.467229070634.issue26163@psf.upfronthosting.co.za>
Content
I'm getting a nice improvement in dispersion statistics by shuffling in higher bits right at the end:

     /* Disperse patterns arising in nested frozensets */
  +  hash ^= (hash >> 11) ^ (~hash >> 25);
     hash = hash * 69069U + 907133923UL;
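To experiment with this step outside of C, here is a minimal Python sketch of just these two lines. It assumes a 64-bit unsigned hash type; the names MASK and disperse() are made up for illustration and do not appear in the patch, and the rest of the frozenset hash is not reproduced.

```python
MASK = (1 << 64) - 1  # assumption: the C hash type is 64 bits wide


def disperse(h):
    """Emulate only the two C lines above on an unsigned 64-bit value."""
    h &= MASK
    # Fold higher bits down; ~h must be truncated to 64 bits by hand,
    # because Python integers are unbounded and ~h would be negative.
    h ^= (h >> 11) ^ ((~h & MASK) >> 25)
    # Multiply-and-add step, wrapped to 64 bits as C unsigned arithmetic does.
    return (h * 69069 + 907133923) & MASK


if __name__ == "__main__":
    for sample in (0, 1, 2, 1 << 40):
        print(f"{sample:#x} -> {disperse(sample):#018x}")
```

The explicit masking is the only part that differs from the C source: it stands in for the unsigned wraparound that C provides implicitly.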

Results for range() check:

                     range       range
                    baseline      new
  1st percentile     35.06%      40.63%
  1st decile         48.03%      51.34%
  mean               61.47%      63.24%      
  median             63.24%      65.58% 

Results for the letter_range() check:

                     letter      letter
                    baseline      new
  1st percentile     39.59%      40.14%
  1st decile         50.90%      51.07%
  mean               63.02%      63.04%
  median             65.21%      65.23%

Test code:

    import itertools
    import string

    def letter_range(n):
        # First n ASCII letters, used as the elements of the test sets
        return string.ascii_letters[:n]

    def powerset(s):
        # Every subset of s, as a frozenset
        for i in range(len(s)+1):
            yield from map(frozenset, itertools.combinations(s, i))

    # range() check: percentage of the 2**n low-bit patterns that are hit
    for i in range(10000):
        for n in range(5, 19):
            t = 2 ** n
            mask = t - 1
            u = len({h & mask for h in map(hash, powerset(range(i, i+n)))})
            print(u/t*100)

    # letter_range() check: str hashes are randomized per process, so the
    # interpreter needs to be restarted (reseeded) for every run
    for n in range(5, 19):
        t = 2 ** n
        mask = t - 1
        u = len({h & mask for h in map(hash, powerset(letter_range(n)))})
        print(u/t*100)
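
The message does not show how the printed values were reduced to the four summary rows. One way to do it, as a sketch only, is with the statistics module (Python 3.8+). The helper names dispersion() and summarize() are made up here, and the quantile method is an assumption, so the figures may differ slightly from the tables above.

```python
import statistics


def dispersion(hashes, n):
    """Percentage of the 2**n possible low-bit patterns seen among hashes."""
    t = 2 ** n
    mask = t - 1
    return len({h & mask for h in hashes}) / t * 100


def summarize(percentages):
    """Reduce a list of dispersion percentages to the four table rows."""
    return {
        # quantiles() returns n-1 cut points; the first one is the
        # 1st percentile (n=100) or the 1st decile (n=10).
        "1st percentile": statistics.quantiles(percentages, n=100)[0],
        "1st decile": statistics.quantiles(percentages, n=10)[0],
        "mean": statistics.mean(percentages),
        "median": statistics.median(percentages),
    }


if __name__ == "__main__":
    # Small demonstration on frozensets of consecutive integers.
    samples = []
    for i in range(200):
        sets = [frozenset(range(i, i + k)) for k in range(64)]
        samples.append(dispersion(map(hash, sets), 6))
    for name, value in summarize(samples).items():
        print(f"{name:16s}{value:6.2f}%")
```

A low 1st percentile or 1st decile points at the worst-case inputs, which is where a weak final mixing step would show up first.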