This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author serhiy.storchaka
Recipients BreamoreBoy, allenap, ggenellina, mwh, pitrou, r.david.murray, serhiy.storchaka, varun
Date 2015-02-24.17:51:59
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1424800320.32.0.828642766402.issue6598@psf.upfronthosting.co.za>
In-reply-to
Content
The probability of a collision when generated n numbers from 0 to m-1 is

    1 - factorial(m)/factorial(m-n)/m**n

When n << sqrt(m), it is approximately equal to n**2/(2*m). When m = 1000000, we have problems for n about 1000. When increase m to 2**32, the probability of collisions is decreased to 0.01%. When increase m to 2**64 as recommended in [1], it is decreased to 2.7e-14. I.e. when generate 1000 IDs per second, collisions happen one per a million years.

We could also take datetime with larger precision. E.g with 2 digits after the dot, as recommended in [1].

When apply both changes, we could generate 100000 IDs per second with one collision per 10000 years or 10000 IDs per second with one collision per 1000000 years.

[1] https://tools.ietf.org/html/draft-ietf-usefor-message-id-01
History
Date User Action Args
2015-02-24 17:52:00serhiy.storchakasetrecipients: + serhiy.storchaka, mwh, ggenellina, pitrou, allenap, r.david.murray, BreamoreBoy, varun
2015-02-24 17:52:00serhiy.storchakasetmessageid: <1424800320.32.0.828642766402.issue6598@psf.upfronthosting.co.za>
2015-02-24 17:52:00serhiy.storchakalinkissue6598 messages
2015-02-24 17:51:59serhiy.storchakacreate