Author madison.may
Recipients aisaac, madison.may, mark.dickinson, pitrou, rhettinger, serhiy.storchaka, tim.peters
Date 2013-08-30.18:16:29
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1377886590.31.0.127357175614.issue18844@psf.upfronthosting.co.za>
In-reply-to
Content
[Mark Dickinson]
> Both those seem like clear error conditions to me, though I think it would be fine if the second condition produced a ZeroDivisionError rather than a ValueError.

Yeah, in hindsight it makes sense that both of those conditions should raise errors.  After all: "Explicit is better than implicit".

As far as optimization goes, could we potentially use functools.lru_cache to cache the cumulative distribution built from the weights argument, and so speed up repeated sampling?

Without @lru_cache:
>>> timeit.timeit("x = choice(list(range(100)), list(range(100)))", setup="from random import choice", number=100000)
36.7109281539997

With @lru_cache(maxsize=128):
>>> timeit.timeit("x = choice(list(range(100)), list(range(100)))", setup="from random import choice", number=100000)
6.6788657720007905

Of course it's a contrived example, but you get the idea.

Walker's aliasing method looks intriguing.  I'll have to give it a closer look.  
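For reference, a minimal sketch of the alias method as I understand it (Vose's variant; O(n) table construction, O(1) per sample; all names here are illustrative):

```python
import random

def build_alias_table(weights):
    # Scale weights so their mean is 1, then pair each "small" slot
    # (probability < 1) with a "large" one that donates the remainder.
    n = len(weights)
    total = sum(weights)
    scaled = [w * n / total for w in weights]
    prob = [0.0] * n
    alias = [0] * n
    small = [i for i, p in enumerate(scaled) if p < 1.0]
    large = [i for i, p in enumerate(scaled) if p >= 1.0]
    while small and large:
        s = small.pop()
        l = large.pop()
        prob[s] = scaled[s]
        alias[s] = l
        scaled[l] -= 1.0 - scaled[s]
        (small if scaled[l] < 1.0 else large).append(l)
    for i in large + small:
        # Leftover slots (up to rounding) are certain hits.
        prob[i] = 1.0
    return prob, alias

def alias_choice(population, prob, alias):
    # Pick a slot uniformly, then keep it or follow its alias.
    i = random.randrange(len(prob))
    return population[i] if random.random() < prob[i] else population[alias[i]]
```

The appeal is that after the one-time setup, each draw costs two random numbers and no search at all.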

I agree that an efficient implementation would be preferable, but it would feel out of place in the random module because of the return type.  I still believe even a relatively inefficient weighted addition to random.choice would be valuable, though.
History
Date User Action Args
2013-08-30 18:16:30madison.maysetrecipients: + madison.may, tim.peters, rhettinger, mark.dickinson, pitrou, aisaac, serhiy.storchaka
2013-08-30 18:16:30madison.maysetmessageid: <1377886590.31.0.127357175614.issue18844@psf.upfronthosting.co.za>
2013-08-30 18:16:30madison.maylinkissue18844 messages
2013-08-30 18:16:29madison.maycreate