classification
Title: randint is always even
Type: Stage:
Components: Library (Lib) Versions: Python 2.3
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: rhettinger Nosy List: rhettinger, ronrivest, tim.peters
Priority: normal Keywords:

Created on 2003-09-25 04:52 by ronrivest, last changed 2003-10-05 23:41 by rhettinger. This issue is now closed.

Files
File name Uploaded Description Edit
random4.diff rhettinger, 2003-09-29 16:02 Improved patch to random.py and test_random.py
rand5.diff rhettinger, 2003-10-03 02:45 Patch to Py2.4
Messages (10)
msg18340 - (view) Author: Ronald L. Rivest (ronrivest) Date: 2003-09-25 04:52
The result of 
    random.randint(2**64,2**65-1) 
is always even!

(I was trying to find some large prime numbers, and
was puzzled by the fact that none were turning
up.  Finally, I discovered that randint wasn't really
returning "random" integers as desired, but only
even ones.)

I'm not sure what the cause of the problem is, but
randint should work properly, even when the given
endpoints are large...

Thanks...

        Ron Rivest
msg18341 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2003-09-25 05:16
Logged In: YES 
user_id=31435

The cause is that randint takes a pseudo-random float (C 
double in [0, 1),), multiplies it by (hi-lo+1), then truncates 
and adds that to lo.  Since a double on your box almost 
certainly has only 53 bits of precision, and your multiplier 
effectively moves the radix point 64 bits "to the right", I 
expect the last 11 bits you see will always be 0.

Under Python 2.3, which uses the Mersenne Twister to 
generate random doubles, this does appear to be the case:

>>> from random import randint
>>> def f():
...     return randint(2**64, 2**65-1)
...
>>> hex(f())
'0x11EDD95B72CEF4000L'
>>> hex(f())
'0x1DD69BF4B39C65000L'
>>> hex(f())
'0x13328DC9C1C733800L'
>>> hex(f())
'0x1E65B47170C057800L'
>>>

Under earlier versions of Python, it may be even worse than 
that.

It takes a fundamentally different kind of algorithm to 
generate arbitrarily long random ints.  Here's a link to one 
such for Python:

http://www.faqts.com/knowledge_base/view.phtml/aid/4406

I doubt the implementation of randint() will change, since 
most people want fast-as-possible generation of mountains of 
small integers, and there's a tradeoff between catering to 
that and catering to extreme inputs.  The randint docs should 
change, though (I think randint() used to raise an exception 
when fed arguments as large as the ones you're using; I 
suspect, but don't know, that we lost the helpful exception 
as an unintended consequence of the long-term effort to 
eradicate the user-visible distinction between Python ints 
(machine C longs) and Python longs (unbounded integers)).

BTW, given what you're doing, you may want to install one of 
the Python wrappers for GNU GMP.  Python's unbounded-int 
arithmetic implementation is extremely portable and reliable, 
but buys those in part by not caring much about speed.
msg18342 - (view) Author: Ronald L. Rivest (ronrivest) Date: 2003-09-25 14:56
Logged In: YES 
user_id=863876

If the code is not going to be fixed, then the
documentation should be updated.  I am well aware
of other approaches to generating large random numbers,
but was misled by the documentation Python provides.  The
documentation should make it clear when the code violates
its "contract" (i.e. doesn't generate all integers in the range).
I prefer the code being fixed best, but also feel that the code 
should raise an exception if it can't honor the contract implied 
in its documentation.
msg18343 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2003-09-25 15:23
Logged In: YES 
user_id=31435

No disagreement here.  I'd rather fix the code than the docs, 
and a fix can take two forms:  raise an exception if the full 
range of ints can't be delivered, or deliver the full range of 
ints.

If anyone wants to take this on, I think the latter ("do a right 
thing") may be much more practical than it used to be, given 
the new-in-2.3 Twister impelementation -- we can generate 
long pseudo-random bitstrings quickly now (at least at the C 
level), but couldn't before 2.3.
msg18344 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2003-09-25 17:10
Logged In: YES 
user_id=80475

If the range is too large, instead of raising an exception,
how about  
calling a _randbigint routine that builds up the random
selection using repeated calls to the underlying generator. 
msg18345 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2003-09-26 00:07
Logged In: YES 
user_id=31435

Ya, something like that indeed.  You might enjoy writing the 
core in C <wink>.
msg18346 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2003-09-27 04:40
Logged In: YES 
user_id=80475

Attached is a first draft of a patch.

It works but I don't like the globals for 53 bits per random
float because the WichmannHill generator supplies fewer bits
than that.  I don't see a general way to find out how many
bits per random call are offered by an underlying generator.
 A more conservative, but slower approach is to assume on 32
bits and, if there are more, just throw them away.  Ideas
are welcome.
msg18347 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2003-09-28 07:42
Logged In: YES 
user_id=80475

A cleaner, faster patch is attached along with tests. 
Before it goes in, the second assert can be commented out.

Unless Tim comes-up with a better idea about obtaining the
information content of a call to random(), I think the
53-bit assumption is fine (a bar for core generators to live
up to).  
msg18348 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2003-10-03 02:45
Logged In: YES 
user_id=80475

Here is a patch for Py2.4 that:
* implements getrandbits(k) in C
* modifies randrange() to use getrandbits() when available
* modified randrange() to warn for large ranges whenever 
getrandbits() is not available
* creates a BaseRandom class for generator subclassing

A separate patch for Py2.3.3 will be loaded that extends 
randrange() for only the MersenneTwister and makes no API 
changes or assumptions about other generators.
msg18349 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2003-10-05 23:41
Logged In: YES 
user_id=80475

Fixed.  See:
  Lib/random.py 1.56 and 
  Modules/_randommodule.c 1.7

Thanks for the bug report.
History
Date User Action Args
2003-09-25 04:52:24ronrivestcreate