Author rhettinger
Recipients mark.dickinson, oscarbenjamin, rhettinger, tim.peters
Date 2020-07-19.06:06:00
Message-id <1595138761.05.0.600179326536.issue41311@roundup.psfhosted.org>
In-reply-to
Content
> I agree that this could be out of scope for the random module
> but I wanted to make sure the reasons were considered.

I think we've done that.  Let's go ahead and close this one down.

In general, better luck can be had by starting with a common real-world problem that the library doesn't adequately solve, then creating a clean API for it, and lastly searching for the best algorithm to implement it.  It is much tougher the other way around: starting with an algorithm you like, then hoping to find a use case to justify it and an API that isn't a footgun for everyday users.

FWIW, reservoir sampling was considered at the outset when sample() was first designed.  Since then, we've also evaluated a high-quality PR for switching the internals to reservoir sampling, but it proved inferior to the current implementation in most respects (code complexity, computational overhead, speed, and entropy consumed); the only gain was some memory savings.
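
For readers unfamiliar with the technique, here is a minimal sketch of basic reservoir sampling (often called Algorithm R). It is only an illustration: it is not the PR mentioned above and not how random.sample() is implemented, and the name reservoir_sample and its rng parameter are hypothetical.

```python
import random
from itertools import islice


def reservoir_sample(iterable, k, rng=random):
    """Return k items chosen uniformly from an iterable of unknown length.

    Hypothetical helper for illustration only; not part of the random module.
    """
    it = iter(iterable)
    # Fill the reservoir with the first k items.
    reservoir = list(islice(it, k))
    if len(reservoir) < k:
        raise ValueError("sample larger than population")
    # Item number i (0-based) replaces a random slot with probability k/(i+1).
    for i, item in enumerate(it, start=k):
        j = rng.randrange(i + 1)  # one random draw per remaining input item
        if j < k:
            reservoir[j] = item
    return reservoir


if __name__ == "__main__":
    # Works on a one-pass iterator without materializing it.
    print(reservoir_sample((n * n for n in range(10_000)), 5))
```

The sketch hints at the tradeoff described above: only k items are held in memory, but the loop makes one random draw for every input item beyond the first k, whereas sampling from an in-memory sequence needs far fewer draws when k is small relative to the population.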