classification
Title: Add random.binomialvariate()
Type: enhancement Stage: patch review
Components: Library (Lib) Versions: Python 3.9
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: avi, mark.dickinson, rhettinger, tim.peters
Priority: low Keywords: patch

Created on 2019-06-28 09:28 by rhettinger, last changed 2019-08-23 06:55 by rhettinger.

Pull Requests
URL Status Linked Edit
PR 14530 open avi, 2019-07-01 18:03
Messages (6)
msg346815 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2019-06-28 09:28
This a small snippet of code I written dozens of times.  It would be nice to have this in the library along with the other distributions.

def binomialvariate(n, p):
    ''' Binomial distribution for the number of successes
        in *n* Bernoulli trials each with a probability *p*
        of success.

        Returns an integer in the range:  0 <= X <= n
    '''
    return sum(random() < p for i in range(n))
msg347055 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2019-07-01 17:56
If you want to be able to switch to something more efficient for large `n`, Knuth describes a simple O(log(n)) algorithm in TAOCP volume 4 (and attributes it to J. H. Ahrens). It's essentially a bisection search on the value, using the fact that we can use the beta distribution to generate order statistics. Here's a (too) simple Python implementation. It still needs thorough testing, and could be optimised in many ways - e.g., using sensible crossover point for n and not recursing all the way to n = 0.


    def binomialvariate(n, p):
        if n == 0:
            return 0
        a, b = (1 + n)//2, 1 + n//2
        x = betavariate(a, b)
        if x < p:
            return a + binomialvariate(b - 1, (p - x)/(1 - x))
        else:
            return binomialvariate(a - 1, p/x)


>>> binomialvariate(10**10, 0.5)
4999944649
msg347057 - (view) Author: Avinash Sajjanshetty (avi) * Date: 2019-07-01 18:06
@Mark - Newb here and before I could see your reply, I sent a PR cos I thought simple implementation provided by Raymond Hettinger was good enough. Shall I update the PR? 

I will add the tests soon.
msg347061 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2019-07-01 18:47
@avi That's Raymond's call, but IMO there's absolutely nothing wrong with getting a reference implementation (with tests, documentation, and all the usual trimmings) in first and then making algorithmic improvements later. 

Thanks for the PR!
msg347066 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2019-07-01 19:10
> [...] and then making algorithmic improvements later.

Though thinking about it, there is *one* catch: for the random module, it's nice if reproducibility isn't broken more often than necessary in released versions, so it would be best to do any algorithmic tweaking *before* 3.9.0 is released. But that still gives us some time ...
msg350256 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2019-08-23 06:55
[Avi]
> Shall I update the PR? 

Yes, please.
History
Date User Action Args
2019-08-23 06:55:18rhettingersetmessages: + msg350256
2019-07-01 19:10:06mark.dickinsonsetmessages: + msg347066
2019-07-01 18:47:37mark.dickinsonsetmessages: + msg347061
2019-07-01 18:06:21avisetnosy: + avi
messages: + msg347057
2019-07-01 18:03:27avisetkeywords: + patch
stage: patch review
pull_requests: + pull_request14343
2019-07-01 17:56:20mark.dickinsonsetmessages: + msg347055
2019-06-28 15:05:20rhettingersetmessages: - msg346817
2019-06-28 10:19:21rhettingersetmessages: + msg346817
2019-06-28 09:28:09rhettingercreate