
Author mwh
Recipients mwh
Date 2009-07-30.00:32:13
Message-id <1248913936.65.0.487457323949.issue6598@psf.upfronthosting.co.za>
In-reply-to
Content
If you call email.utils.make_msgid a number of times within the same
second, the uniqueness of the results depends on random.randrange(100000)
returning a different value each time.

A little mathematics shows that you don't have to call make_msgid
*that* often to get the same message id twice: if you call it n times
within one second, the probability of a collision is approximately
"1 - math.exp(-n*(n-1)/200000.0)" (the usual birthday-problem estimate
for 100000 possible values). For n == 100 that's about 5%; for
n == 1000 it's over 99%.
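
As a rough check of the estimate above, here is a minimal sketch that compares the approximation with the exact birthday-problem product. The function names and the `space` parameter are made up for illustration; the only assumption is that the random part is a uniform draw from 100000 values.

```python
import math


def approx_collision_probability(n, space=100000):
    """Birthday approximation: 1 - exp(-n*(n-1) / (2*space))."""
    return 1.0 - math.exp(-n * (n - 1) / (2.0 * space))


def exact_collision_probability(n, space=100000):
    """Exact chance that n uniform draws from `space` values are not all distinct."""
    p_distinct = 1.0
    for i in range(n):
        p_distinct *= (space - i) / float(space)
    return 1.0 - p_distinct


if __name__ == "__main__":
    for n in (100, 1000):
        print(n, approx_collision_probability(n), exact_collision_probability(n))
```

With space=100000 the denominator 2*space is the 200000.0 in the formula quoted above.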

These numbers are borne out by experiment:

>>> from email.utils import make_msgid
>>> def collisions(n):
...     # number of duplicate ids among n consecutive calls
...     msgids = [make_msgid() for i in range(n)]
...     return len(msgids) - len(set(msgids))
...
>>> sum((collisions(100)>0) for i in range(1000))
49
>>> sum((collisions(1000)>0) for i in range(1000))
991
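
The experiment above depends on how the installed make_msgid generates its random part, and on all calls landing in the same second. As a hedged alternative, this sketch simulates only the random component, assuming a uniform draw from 100000 values; the function names and the `seed` parameter are hypothetical.

```python
import random


def simulated_collisions(n, space=100000, rng=random):
    """Count duplicates among n uniform draws from range(space)."""
    draws = [rng.randrange(space) for _ in range(n)]
    return len(draws) - len(set(draws))


def fraction_with_collision(n, trials=1000, space=100000, seed=None):
    """Fraction of trials in which at least one duplicate draw occurred."""
    rng = random.Random(seed)
    hits = sum(simulated_collisions(n, space, rng) > 0 for _ in range(trials))
    return hits / float(trials)


if __name__ == "__main__":
    print(fraction_with_collision(100))
    print(fraction_with_collision(1000))
```

The fractions should land near the 5% and 99% figures, subject to sampling noise.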

I think adding a counter alongside the randomness would probably be a
good fix for the problem, though I guess then you have to worry about
thread safety.
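
To make the counter idea concrete, here is a minimal sketch, not a proposed patch. The function name make_msgid_with_counter, the id layout, and the default domain are all hypothetical; a lock around a process-wide counter is one simple way to handle the thread-safety concern.

```python
import itertools
import os
import random
import threading
import time

# Process-wide sequence number, guarded by a lock so that two threads
# can never be handed the same value.
_counter = itertools.count()
_counter_lock = threading.Lock()


def make_msgid_with_counter(domain="example.invalid"):
    """Hypothetical variant: timestamp, pid, random part, plus a counter."""
    with _counter_lock:
        seq = next(_counter)
    utcdate = time.strftime("%Y%m%d%H%M%S", time.gmtime())
    randpart = random.randrange(100000)
    return "<%s.%d.%d.%d@%s>" % (utcdate, os.getpid(), randpart, seq, domain)
```

Within one process the counter alone makes ids distinct, so the random part only has to help separate processes that share a pid across hosts or restarts.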
History
Date                 User  Action  Args
2009-07-30 00:32:16  mwh   set     recipients: + mwh
2009-07-30 00:32:16  mwh   set     messageid: <1248913936.65.0.487457323949.issue6598@psf.upfronthosting.co.za>
2009-07-30 00:32:14  mwh   link    issue6598 messages
2009-07-30 00:32:13  mwh   create