Author josh.r
Recipients BreamoreBoy, flambda, josh.r, ncoghlan, pitrou, pmoody
Date 2014-06-21.01:03:56
Message-id <1403312637.5.0.335787222749.issue20446@psf.upfronthosting.co.za>
Content
Correct me if I'm wrong, but wouldn't this only become a concern if:

1. You're storing both IPv4 and IPv6 addresses side-by-side
2. You're storing well over a billion IP addresses
3. Hash codes for the hex string of an IP address were predictably sequential (they're not)

On point #3 alone, you can check for yourself. In a quick test within a single process on Python 3.4, hash(hex(0x1)) == 7060637827927985012 while hash(hex(0x2)) == -4275917786525356978 (your numbers will vary thanks to per-process string hash seeding, but they should look quite random). As such, you couldn't reliably fill more than two sequential buckets. You could guarantee that collision chaining occurs at least once (since, as you noted, you can reliably create two IP addresses with the same hash), but the chains will be evenly distributed; you can't build on that to get a second, third, ..., nth collision.
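
The check described above can be sketched roughly as follows. The helper names hex_hashes and sequential_pairs are made up for illustration and are not part of any library; the printed hash values differ between processes unless PYTHONHASHSEED is fixed.

```python
def hex_hashes(n):
    """Return hash(hex(i)) for i in 1..n (hypothetical helper)."""
    return [hash(hex(i)) for i in range(1, n + 1)]


def sequential_pairs(hashes):
    """Count neighbouring entries whose hash codes differ by exactly 1."""
    return sum(1 for a, b in zip(hashes, hashes[1:]) if abs(a - b) == 1)


# Print the hash codes for the hex strings of the first few integers.
hashes = hex_hashes(8)
for i, h in enumerate(hashes, start=1):
    print(hex(i), h)

# With seeded string hashing, finding any sequential neighbours here
# would be a surprise.
print("sequential neighbours:", sequential_pairs(hashes))
```

If the hash codes were predictably sequential, the final count would be close to the number of neighbouring pairs; in practice it should be zero or very near it.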

There wouldn't be a meaningful "imbalance" between low and high IP addresses either. Below a billion or so IP addresses, random chance dictates that hash codes will occasionally collide anyway, and you could guarantee that the sub-32-bit values collide one extra time before finding an empty bucket. But I seem to recall a typical dict insertion already involves 1-5 collisions; adding one extra to every single dict insertion/lookup costs something, but it's not that much, and the scenarios required to take advantage of it would be incredibly contrived.
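
As a small illustration of why a single shared hash code is harmless, the sketch below stores an IPv4 and an IPv6 address with the same integer value side by side in one dict. Whether the two hash codes actually match depends on the ipaddress implementation in the Python version at hand, so the script prints the comparison rather than assuming it; either way the addresses compare unequal and both keys remain retrievable.

```python
import ipaddress

# Two addresses with the same underlying integer value (1).
v4 = ipaddress.IPv4Address(1)   # 0.0.0.1
v6 = ipaddress.IPv6Address(1)   # ::1

# Store both as keys; a hash collision only costs an extra probe,
# because equality still distinguishes the two keys.
table = {v4: "v4", v6: "v6"}

print("same hash:", hash(v4) == hash(v6))
print("equal:", v4 == v6)
print(table[v4], table[v6])
```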