This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: random.choice should accept a set as input
Type: enhancement Stage:
Components: Library (Lib) Versions: Python 3.2, Python 2.7
process
Status: closed Resolution: rejected
Dependencies: Superseder:
Assigned To: rhettinger Nosy List: lleeoo, mark.dickinson, michelem, rhettinger, thomasahle
Priority: normal Keywords:

Created on 2009-12-16 07:52 by lleeoo, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Messages (9)
msg96480 - (view) Author: Leo (lleeoo) Date: 2009-12-16 07:52
The following code should just work:

import random
random.choice(set(range(5)))

instead the output is TypeError:

TypeError: 'set' object does not support indexing

The algorithm in random.choice requires a sequence, but the semantics of
choice do not, and should not, require a sequence. 

The code should be changed to convert the input to a sequence instead.

Cheers,
Leo.
msg96486 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-12-16 12:50
This would be a new feature, so would have to wait for Python 2.7 / 3.2  
(2.6 is only receiving bugfixes).
msg96492 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2009-12-16 17:57
The underlying data structure for sets doesn't lend itself to an
efficient method of random selection.  It is best for the programmer to
explictly convert to a sequence and then make the random selection (that
way the conversion cost isn't hidden). 

If you don't mind the inefficiency of an implicit conversion to a list,
you can use "random.sample(s, 1)[0]".  If only an arbitrary element is
needed, use "x=next(iter(s))" to non-destructively fetch the first element.
msg96540 - (view) Author: Leo (lleeoo) Date: 2009-12-17 22:59
Thanks for the suggestions "random.sample(s, 1)[0]" and 
"x=next(iter(s))". 

A counterpoint: isn't this a classic example of where polymorphism 
enables a more elegant, simple, and clear (dare I say Pythonic) 
approach? The complexity of allowing sets as inputs, even with a naive 
implementation, is still O(n) even in the worst case vs. O(1) for 
arrays. Isn't documenting that distinction good enough?

Thanks,
Leo.
msg105138 - (view) Author: Thomas Dybdahl Ahle (thomasahle) * Date: 2010-05-06 11:29
Why not just add support to the set container?
As far as I know, it is a binary search tree, so supporting random picking in O(logn) should be easy.
msg105139 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-05-06 11:40
> As far as I know, it is a binary search tree,

It's not:  it's based on a hash table.  It's essentially a dict with keys but no values.  An additional complication is that the hash table can be very sparsely filled, in the case of a large set that has had most of its elements deleted---there's no automatic shrinkage of the hash table in that case.  So repeated random selection until you find a filled hash table entry would be inefficient in that case.
msg105160 - (view) Author: Thomas Dybdahl Ahle (thomasahle) * Date: 2010-05-06 19:48
I'm sorry. I see the problem then.
Do you know, if there are any plans of adding a fast balanced binary search tree to pythons stdlib?
msg159699 - (view) Author: Michele Mazzucchi (michelem) Date: 2012-04-30 14:32
Folks, I really think this should be addressed.

Python has beautiful data structure semantics, and this is a stain in them.

An implementation based on the current underlying hash table is quite simple, just pick random addresses until an active key is found. Even on sparse tables this is probabilistic O(1). Even with average load factor = 50%, only 1 extra attempt is needed; 2 with LF as low as 33%.

I'm happy to provide a patch if anyone defines the desired API in Include/setobject.h .
msg159721 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2012-04-30 18:25
Michele, you might want to raise this on the python-dev or python-ideas mailing list---it'll get better exposure there than in this closed issue.

For completeness, see also the previous discussion in issue 936988.
History
Date User Action Args
2022-04-11 14:56:55adminsetgithub: 51771
2012-04-30 18:25:27mark.dickinsonsetmessages: + msg159721
2012-04-30 14:32:37michelemsetnosy: + michelem
messages: + msg159699
2010-05-06 19:48:40thomasahlesetmessages: + msg105160
2010-05-06 11:40:40mark.dickinsonsetmessages: + msg105139
2010-05-06 11:29:57thomasahlesetnosy: + thomasahle
messages: + msg105138
2009-12-17 22:59:50lleeoosetmessages: + msg96540
2009-12-16 17:57:25rhettingersetstatus: open -> closed
resolution: rejected
messages: + msg96492
2009-12-16 12:50:56mark.dickinsonsetversions: + Python 2.7, Python 3.2, - Python 2.6
nosy: + rhettinger, mark.dickinson

messages: + msg96486

assignee: rhettinger
type: behavior -> enhancement
2009-12-16 07:52:40lleeoocreate