Message 243296 - Python tracker

Author	ncoghlan
Recipients	bquinlan, cool-RR, ethan.furman, jnoller, ncoghlan, paul.moore, pitrou, sbt
Date	2015-05-16.06:13:36
Message-id	<1431756816.78.0.442594049045.issue24195@psf.upfronthosting.co.za>
In-reply-to

Content
filter() usage has always been less common than map() usage, and we see a similar pattern in comprehension usage as well (i.e. [f(x) for x in y] is a far more common construct than [x for x in y if p(x)]). That "less common" status doesn't keep us from providing filter() as a builtin, or syntactic support for filtering in the comprehension syntax.

As a result, the main question I'd like to see a clear and authoritative answer to is "Given 'seq2 = filter(p, seq)' or 'seq2 = [x for x in seq if p(x)]', what's the concurrent.futures-based parallel execution syntax in cases where the filtering key is expensive to calculate?"

I'd be quite OK with Brian's 2-line implementation going into the concurrent.futures documentation as a filtering recipe, similar to the way Raymond uses the recipes in the itertools documentation to help minimise complexity growth in the core API.
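
For illustration, a filtering recipe along these lines might look like the sketch below. It is not necessarily Brian's exact two-line implementation: the name parallel_filter and its signature are hypothetical, and it assumes the input fits in memory and that the predicate is safe to run in the chosen executor. It relies on Executor.map() returning results in input order, so each result can be paired back with its item.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_filter(executor, predicate, iterable):
    """Return the items for which predicate(item) is true.

    Hypothetical helper name: the predicate calls run via executor.map(),
    which yields results in the same order as the inputs.
    """
    items = list(iterable)  # materialise so items can be paired with results
    keep_flags = executor.map(predicate, items)
    return [item for item, keep in zip(items, keep_flags) if keep]


def is_even(n):
    # Stand-in for an expensive filtering key.
    return n % 2 == 0


if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=4) as executor:
        print(parallel_filter(executor, is_even, range(10)))
```

A ProcessPoolExecutor could be passed in the same way for CPU-bound predicates, provided the predicate and the items are picklable.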

I *don't* mind if Brian's judgement is that it doesn't rise to the level of being worth including as a core feature in its own right, as I agree that the typical case of filtering functions is that they're fast, and when they're not, it's often a sign that data model denormalisation may be desirable in order to cache the relevant derived property.

History
Date	User	Action	Args
2015-05-16 06:13:36	ncoghlan	set	recipients: + ncoghlan, paul.moore, bquinlan, pitrou, jnoller, cool-RR, ethan.furman, sbt
2015-05-16 06:13:36	ncoghlan	set	messageid: <1431756816.78.0.442594049045.issue24195@psf.upfronthosting.co.za>
2015-05-16 06:13:36	ncoghlan	link	issue24195 messages
2015-05-16 06:13:36	ncoghlan	create