This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author gurnec
Recipients gurnec
Date 2015-04-16.19:09:58
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1429211399.3.0.0363933708398.issue23974@psf.upfronthosting.co.za>
In-reply-to
Content
Due to an optimization in random.randrange() only in Python 2, as the stop-start range approaches 2^53 the output becomes noticeably biased. This bug also affects random.SystemRandom.

For example, counting the number of even ints in a set of 10^6 random ints each in the range [0, 2^52 * 2/3) produces normal results; about half are even:

>>> sum(randrange(2**52 * 2//3) % 2 for i in xrange(1000000)) / 1000000.0
0.499932

Change the range to [0, 2^53 * 2/3), and you get a degenerate case where evens are two times more likely to occur than odds:

>>> sum(randrange(2**53 * 2//3) % 2 for i in xrange(1000000)) / 1000000.0
0.333339

The issue occurs in three places inside randrange(), here's one:

if istart >= _maxwidth:
    return self._randbelow(istart)
return _int(self.random() * istart)

_maxwidth is the max size of a double where every digit to the left of the decimal point can still be represented w/o loss of precision (2^53, where a double has 53 mantissa bits). With istart >= _maxwidth, _randbelow() behaves correctly. With istart < _maxwidth, the rounding error in random() * istart begins to cause problems as istart approaches 2^53.

Changing _maxwidth to be significantly less should (practically speaking anyways) fix this, although I'm open to suggestion on just how conservatively it should be set.
History
Date User Action Args
2015-04-16 19:09:59gurnecsetrecipients: + gurnec
2015-04-16 19:09:59gurnecsetmessageid: <1429211399.3.0.0363933708398.issue23974@psf.upfronthosting.co.za>
2015-04-16 19:09:59gurneclinkissue23974 messages
2015-04-16 19:09:58gurneccreate