This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author Rhamphoryncus
Recipients Rhamphoryncus, jcea, mark.dickinson, pitrou, rhettinger
Date 2009-02-11.03:31:49
SpamBayes Score 6.49149e-10
Marked as misclassified No
Message-id <1234323113.24.0.149926248907.issue5186@psf.upfronthosting.co.za>
In-reply-to
Content
On my 64-bit linux box there's nothing in the last 4 bits:

>>> [id(o)%16 for o in [object() for i in range(128)]]
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0]

And with a bit more complicated functions I can determine how much shift
gives us the lowest collision rate:

def a(size, shift):
    return len(set((id(o) >> shift) % (size * 2) for o in [object() for
i in range(size)]))

def b(size):
    return [a(size, shift) for shift in range(11)]

def c():
    for i in range(1, 9):
        size = 2**i
        x = ', '.join('% 3s' % count for count in b(size))
        print('% 3s: %s' % (size, x))

>>> c()
  2:   1,   1,   1,   2,   2,   1,   1,   1,   2,   2,   2
  4:   1,   1,   2,   3,   4,   3,   2,   4,   4,   3,   2
  8:   1,   2,   4,   6,   6,   7,   8,   6,   4,   3,   2
 16:   2,   4,   7,   9,  12,  13,  12,   8,   5,   3,   2
 32:   4,   8,  14,  23,  30,  25,  19,  12,   7,   4,   2
 64:   8,  16,  32,  55,  64,  38,  22,  13,   8,   4,   2
128:  16,  32,  64, 114, 128,  71,  39,  22,  12,   6,   3
256:  32,  64, 128, 242, 242, 123,  71,  38,  20,  10,   5

The fifth column (ie 4 bits of shift, a divide of 16) works the best. 
Although it varies from run to run, probably more than half the results
in that column have no collisions at all.

.. although, if I replace object() with list() I get best results with a
shift of 6 bits.  Replacing it with dict() is best with 8 bits.

We may want something more complicated.
History
Date User Action Args
2009-02-11 03:31:53Rhamphoryncussetrecipients: + Rhamphoryncus, rhettinger, jcea, mark.dickinson, pitrou
2009-02-11 03:31:53Rhamphoryncussetmessageid: <1234323113.24.0.149926248907.issue5186@psf.upfronthosting.co.za>
2009-02-11 03:31:50Rhamphoryncuslinkissue5186 messages
2009-02-11 03:31:49Rhamphoryncuscreate