Issue 37439: Add random.binomialvariate()

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/81620

classification

Title:	Add random.binomialvariate()
Type:	enhancement	Stage:	patch review
Components:	Library (Lib)	Versions:	Python 3.11

process

Status:	open	Resolution:
Dependencies:		Superseder:
Assigned To:		Nosy List:	avi, gregory.p.smith, mark.dickinson, rhettinger, tim.peters, zkneupper
Priority:	low	Keywords:	patch

Created on 2019-06-28 09:28 by rhettinger, last changed 2022-04-11 14:59 by admin.

Pull Requests
URL	Status	Linked	Edit
PR 14530	open	avi, 2019-07-01 18:03

Messages (10)
msg346815 - (view)	Author: Raymond Hettinger (rhettinger) *	Date: 2019-06-28 09:28
This a small snippet of code I written dozens of times. It would be nice to have this in the library along with the other distributions. def binomialvariate(n, p): ''' Binomial distribution for the number of successes in n Bernoulli trials each with a probability p of success. Returns an integer in the range: 0 <= X <= n ''' return sum(random() < p for i in range(n))
msg347055 - (view)	Author: Mark Dickinson (mark.dickinson) *	Date: 2019-07-01 17:56
If you want to be able to switch to something more efficient for large `n`, Knuth describes a simple O(log(n)) algorithm in TAOCP volume 4 (and attributes it to J. H. Ahrens). It's essentially a bisection search on the value, using the fact that we can use the beta distribution to generate order statistics. Here's a (too) simple Python implementation. It still needs thorough testing, and could be optimised in many ways - e.g., using sensible crossover point for n and not recursing all the way to n = 0. def binomialvariate(n, p): if n == 0: return 0 a, b = (1 + n)//2, 1 + n//2 x = betavariate(a, b) if x < p: return a + binomialvariate(b - 1, (p - x)/(1 - x)) else: return binomialvariate(a - 1, p/x) >>> binomialvariate(10**10, 0.5) 4999944649
msg347057 - (view)	Author: Avinash Sajjanshetty (avi) *	Date: 2019-07-01 18:06
@Mark - Newb here and before I could see your reply, I sent a PR cos I thought simple implementation provided by Raymond Hettinger was good enough. Shall I update the PR? I will add the tests soon.
msg347061 - (view)	Author: Mark Dickinson (mark.dickinson) *	Date: 2019-07-01 18:47
@avi That's Raymond's call, but IMO there's absolutely nothing wrong with getting a reference implementation (with tests, documentation, and all the usual trimmings) in first and then making algorithmic improvements later. Thanks for the PR!
msg347066 - (view)	Author: Mark Dickinson (mark.dickinson) *	Date: 2019-07-01 19:10
> [...] and then making algorithmic improvements later. Though thinking about it, there is one catch: for the random module, it's nice if reproducibility isn't broken more often than necessary in released versions, so it would be best to do any algorithmic tweaking before 3.9.0 is released. But that still gives us some time ...
msg350256 - (view)	Author: Raymond Hettinger (rhettinger) *	Date: 2019-08-23 06:55
[Avi] > Shall I update the PR? Yes, please.
msg393903 - (view)	Author: Zachary Kneupper (zkneupper) *	Date: 2021-05-18 20:13
Is adding random.binomialvariate() still under consideration? I would be willing to work on this addition to Lib/random.py, as well as the accompanying tests and docs. Should I use the type of algorithm hinted at by Mark on 2019-07-01 17:56?
msg394384 - (view)	Author: Gregory P. Smith (gregory.p.smith) *	Date: 2021-05-25 17:51
A presumed optimal version of this is already available in numpy. https://numpy.org/doc/stable/reference/random/generated/numpy.random.Generator.binomial.html https://github.com/numpy/numpy/blob/2232a473f8713f532c8164c8cf616f7bd05f54a7/numpy/random/_generator.pyx#L2805
msg394391 - (view)	Author: Mark Dickinson (mark.dickinson) *	Date: 2021-05-25 20:19
I think the NumPy implementation may be from here: https://core.ac.uk/download/pdf/11007254.pdf (though I'm struggling to find a clear citation in the NumPy source)
msg394396 - (view)	Author: Mark Dickinson (mark.dickinson) *	Date: 2021-05-25 20:32
Nope, that's the wrong paper. It looks as though this is the right one, but it's hidden behind a paywall: https://dl.acm.org/doi/abs/10.1145/42372.42381

History
Date	User	Action	Args
2022-04-11 14:59:17	admin	set	github: 81620
2021-05-25 20:32:21	mark.dickinson	set	messages: + msg394396
2021-05-25 20:19:46	mark.dickinson	set	messages: + msg394391
2021-05-25 17:51:45	gregory.p.smith	set	nosy: + gregory.p.smith messages: + msg394384 versions: + Python 3.11, - Python 3.9
2021-05-18 20:13:39	zkneupper	set	nosy: + zkneupper messages: + msg393903
2019-08-23 06:55:18	rhettinger	set	messages: + msg350256
2019-07-01 19:10:06	mark.dickinson	set	messages: + msg347066
2019-07-01 18:47:37	mark.dickinson	set	messages: + msg347061
2019-07-01 18:06:21	avi	set	nosy: + avi messages: + msg347057
2019-07-01 18:03:27	avi	set	keywords: + patch stage: patch review pull_requests: + pull_request14343
2019-07-01 17:56:20	mark.dickinson	set	messages: + msg347055
2019-06-28 15:05:20	rhettinger	set	messages: - msg346817
2019-06-28 10:19:21	rhettinger	set	messages: + msg346817
2019-06-28 09:28:09	rhettinger	create