Issue6598
Created on 2009-07-30 00:32 by mwh, last changed 2009-11-15 10:12 by gagenellina.
| File name |
Uploaded |
Description |
Edit |
Remove |
|
make_msgid.diff
|
gagenellina,
2009-11-15 10:12
|
|
|
|
|
msg91073 - (view) |
Author: Michael Hudson (mwh) |
Date: 2009-07-30 00:32 |
|
If you call email.utils.make_msgid a number of times within the same
second, the uniqueness of the results depends on random.randint(100000)
returning different values each time.
A little mathematics proves that you don't have to call make_msgid
*that* often to get the same message id twice: if you call it 'n' times,
the probability of a collision is approximately "1 -
math.exp(-n*(n-1)/200000.0)", and for n == 100, that's about 5%. For n
== 1000, it's over 99%.
These numbers are born out by experiment:
>>> def collisions(n):
... msgids = [make_msgid() for i in range(n)]
... return len(msgids) - len(set(msgids))
...
>>> sum((collisions(100)>0) for i in range(1000))
49
>>> sum((collisions(1000)>0) for i in range(1000))
991
I think probably having a counter in addition to the randomness would be
a good fix for the problem, though I guess then you have to worry about
thread safety.
|
|
msg91074 - (view) |
Author: Michael Hudson (mwh) |
Date: 2009-07-30 00:34 |
|
A higher resolution timer would also help, of course.
(Thanks to James Knight for the prod).
|
|
msg91251 - (view) |
Author: Gabriel Genellina (gagenellina) |
Date: 2009-08-04 04:30 |
|
This patch replaces the random part with an increasing sequence (in a
thread safe way).
Also, added a test case for make_msgid (there was none previously)
|
|
msg91378 - (view) |
Author: Antoine Pitrou (pitrou) |
Date: 2009-08-06 18:18 |
|
Is it ok if the message id is predictable?
Besides, _gen_next_number() can more efficiently be written as:
_gen_next_number = itertools.cycle(xrange(N))
|
|
msg91410 - (view) |
Author: Gabriel Genellina (gagenellina) |
Date: 2009-08-07 19:34 |
|
--- El jue 6-ago-09, Antoine Pitrou <report@bugs.python.org> escribió:
> Antoine Pitrou <pitrou@free.fr>
> added the comment:
>
> Is it ok if the message id is predictable?
I don't know of any use of message ids apart from uniquely identifying the message, but we could still keep a random part followed by the sequence.
> Besides, _gen_next_number() can more efficiently be written
> as:
>
> _gen_next_number = itertools.cycle(xrange(N))
Written that way it will eventually hold a list with 100000 integers forever.
Yahoo! Cocina
Encontra las mejores recetas con Yahoo! Cocina.
http://ar.mujer.yahoo.com/cocina/
|
|
msg95281 - (view) |
Author: Gabriel Genellina (gagenellina) |
Date: 2009-11-15 10:12 |
|
An updated version of the make_msgid patch.
Includes a random part plus a sequential part. Testing takes at most 3
seconds now (we're interested in those msgids generated in a whole
second)
|
|
| Date |
User |
Action |
Args |
| 2009-11-15 10:12:49 | gagenellina | set | files:
- utils.diff |
| 2009-11-15 10:12:43 | gagenellina | set | files:
- test_email.diff |
| 2009-11-15 10:12:29 | gagenellina | set | files:
+ make_msgid.diff
messages:
+ msg95281 |
| 2009-08-07 19:59:53 | gagenellina | set | files:
+ utils.diff |
| 2009-08-07 19:59:18 | gagenellina | set | files:
- utils.diff |
| 2009-08-07 19:34:15 | gagenellina | set | messages:
+ msg91410 |
| 2009-08-06 18:18:59 | pitrou | set | nosy:
+ pitrou messages:
+ msg91378
|
| 2009-08-04 04:32:45 | gagenellina | set | versions:
- Python 2.5, Python 2.4, 3rd party, Python 3.0 |
| 2009-08-04 04:31:30 | gagenellina | set | files:
+ test_email.diff |
| 2009-08-04 04:30:54 | gagenellina | set | files:
+ utils.diff
nosy:
+ gagenellina messages:
+ msg91251
keywords:
+ patch |
| 2009-07-30 08:21:31 | allenap | set | nosy:
+ allenap
|
| 2009-07-30 00:34:46 | mwh | set | messages:
+ msg91074 |
| 2009-07-30 00:32:15 | mwh | create | |
|