This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

Author prayerslayer
Recipients prayerslayer
Date 2017-02-17.22:10:22
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1487369423.42.0.265838648983.issue29595@psf.upfronthosting.co.za>
In-reply-to
Content
Hi!

I think ThreadPoolExecutor should allow setting the maximum size of the underlying queue.

The situation I ran into recently was that I used ThreadPoolExecutor to parallelize AWS API calls; I had to move data from one S3 bucket to another (~150M objects). Contrary to what I expected, the underlying work queue is unbounded by default. As a result my process ended up consuming gigabytes of memory, because it put items into the queue faster than the threads could work them off: the queue just kept growing. (It ran on Kubernetes and the pod was rightfully killed eventually.)

Of course there are ways to work around this. One could use more threads, to some extent, or supply a queue with a defined maximum size. But I think that's more work for users of Python than necessary.
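To illustrate the workaround: one can wrap ThreadPoolExecutor so that submit() blocks once a fixed number of tasks are in flight, which keeps the internal queue bounded. This is a minimal sketch, not a stdlib API; the class name BoundedExecutor and the max_pending parameter are my own.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class BoundedExecutor:
    """Wrap ThreadPoolExecutor so submit() blocks when max_pending
    tasks are already in flight, keeping the work queue bounded.
    (Illustrative sketch only -- not part of concurrent.futures.)"""

    def __init__(self, max_workers, max_pending):
        self._executor = ThreadPoolExecutor(max_workers=max_workers)
        self._semaphore = threading.BoundedSemaphore(max_pending)

    def submit(self, fn, *args, **kwargs):
        # Blocks here until a slot frees up, instead of letting
        # the executor's internal queue grow without bound.
        self._semaphore.acquire()
        try:
            future = self._executor.submit(fn, *args, **kwargs)
        except Exception:
            self._semaphore.release()
            raise
        # Release the slot as soon as the task finishes.
        future.add_done_callback(lambda f: self._semaphore.release())
        return future

    def shutdown(self, wait=True):
        self._executor.shutdown(wait=wait)


if __name__ == "__main__":
    ex = BoundedExecutor(max_workers=4, max_pending=8)
    # At most 8 tasks are queued or running at any moment.
    futures = [ex.submit(lambda x: x * 2, i) for i in range(100)]
    ex.shutdown()
    print(sum(f.result() for f in futures))  # 9900
```

With something like this, the producer loop naturally slows down to the pace of the worker threads instead of buffering millions of pending tasks in memory.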
History
Date                 User          Action  Args
2017-02-17 22:10:23  prayerslayer  set     recipients: + prayerslayer
2017-02-17 22:10:23  prayerslayer  set     messageid: <1487369423.42.0.265838648983.issue29595@psf.upfronthosting.co.za>
2017-02-17 22:10:23  prayerslayer  link    issue29595 messages
2017-02-17 22:10:22  prayerslayer  create