This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Add random.binomialvariate()
Type: enhancement Stage: patch review
Components: Library (Lib) Versions: Python 3.11
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: avi, gregory.p.smith, mark.dickinson, rhettinger, tim.peters, zkneupper
Priority: low Keywords: patch

Created on 2019-06-28 09:28 by rhettinger, last changed 2022-04-11 14:59 by admin.

Pull Requests
URL Status Linked Edit
PR 14530 open avi, 2019-07-01 18:03
Messages (10)
msg346815 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2019-06-28 09:28
This a small snippet of code I written dozens of times.  It would be nice to have this in the library along with the other distributions.

def binomialvariate(n, p):
    ''' Binomial distribution for the number of successes
        in *n* Bernoulli trials each with a probability *p*
        of success.

        Returns an integer in the range:  0 <= X <= n
    '''
    return sum(random() < p for i in range(n))
msg347055 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2019-07-01 17:56
If you want to be able to switch to something more efficient for large `n`, Knuth describes a simple O(log(n)) algorithm in TAOCP volume 4 (and attributes it to J. H. Ahrens). It's essentially a bisection search on the value, using the fact that we can use the beta distribution to generate order statistics. Here's a (too) simple Python implementation. It still needs thorough testing, and could be optimised in many ways - e.g., using sensible crossover point for n and not recursing all the way to n = 0.


    def binomialvariate(n, p):
        if n == 0:
            return 0
        a, b = (1 + n)//2, 1 + n//2
        x = betavariate(a, b)
        if x < p:
            return a + binomialvariate(b - 1, (p - x)/(1 - x))
        else:
            return binomialvariate(a - 1, p/x)


>>> binomialvariate(10**10, 0.5)
4999944649
msg347057 - (view) Author: Avinash Sajjanshetty (avi) * Date: 2019-07-01 18:06
@Mark - Newb here and before I could see your reply, I sent a PR cos I thought simple implementation provided by Raymond Hettinger was good enough. Shall I update the PR? 

I will add the tests soon.
msg347061 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2019-07-01 18:47
@avi That's Raymond's call, but IMO there's absolutely nothing wrong with getting a reference implementation (with tests, documentation, and all the usual trimmings) in first and then making algorithmic improvements later. 

Thanks for the PR!
msg347066 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2019-07-01 19:10
> [...] and then making algorithmic improvements later.

Though thinking about it, there is *one* catch: for the random module, it's nice if reproducibility isn't broken more often than necessary in released versions, so it would be best to do any algorithmic tweaking *before* 3.9.0 is released. But that still gives us some time ...
msg350256 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2019-08-23 06:55
[Avi]
> Shall I update the PR? 

Yes, please.
msg393903 - (view) Author: Zachary Kneupper (zkneupper) * Date: 2021-05-18 20:13
Is adding random.binomialvariate() still under consideration?

I would be willing to work on this addition to Lib/random.py, as well as the accompanying tests and docs.

Should I use the type of algorithm hinted at by Mark on 2019-07-01 17:56?
msg394384 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2021-05-25 17:51
A presumed optimal version of this is already available in numpy.

https://numpy.org/doc/stable/reference/random/generated/numpy.random.Generator.binomial.html

https://github.com/numpy/numpy/blob/2232a473f8713f532c8164c8cf616f7bd05f54a7/numpy/random/_generator.pyx#L2805
msg394391 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2021-05-25 20:19
I think the NumPy implementation may be from here: https://core.ac.uk/download/pdf/11007254.pdf

(though I'm struggling to find a clear citation in the NumPy source)
msg394396 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2021-05-25 20:32
Nope, that's the wrong paper. It looks as though this is the right one, but it's hidden behind a paywall: https://dl.acm.org/doi/abs/10.1145/42372.42381
History
Date User Action Args
2022-04-11 14:59:17adminsetgithub: 81620
2021-05-25 20:32:21mark.dickinsonsetmessages: + msg394396
2021-05-25 20:19:46mark.dickinsonsetmessages: + msg394391
2021-05-25 17:51:45gregory.p.smithsetnosy: + gregory.p.smith

messages: + msg394384
versions: + Python 3.11, - Python 3.9
2021-05-18 20:13:39zkneuppersetnosy: + zkneupper
messages: + msg393903
2019-08-23 06:55:18rhettingersetmessages: + msg350256
2019-07-01 19:10:06mark.dickinsonsetmessages: + msg347066
2019-07-01 18:47:37mark.dickinsonsetmessages: + msg347061
2019-07-01 18:06:21avisetnosy: + avi
messages: + msg347057
2019-07-01 18:03:27avisetkeywords: + patch
stage: patch review
pull_requests: + pull_request14343
2019-07-01 17:56:20mark.dickinsonsetmessages: + msg347055
2019-06-28 15:05:20rhettingersetmessages: - msg346817
2019-06-28 10:19:21rhettingersetmessages: + msg346817
2019-06-28 09:28:09rhettingercreate