This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: Document multiprocessing.pool.ThreadPool
Type: enhancement Stage: resolved
Components: Documentation Versions: Python 3.10, Python 3.9, Python 3.8, Python 3.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: docs@python Nosy List: brandon-rhodes, davin, docs@python, godlygeek, miss-islington, ncoghlan, ned.deily, pablogsal, sbt, srodriguez
Priority: normal Keywords: easy, patch

Created on 2013-02-06 02:08 by ncoghlan, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 23812 merged godlygeek, 2020-12-17 02:12
PR 23834 merged miss-islington, 2020-12-18 13:06
PR 23835 merged miss-islington, 2020-12-18 13:06
PR 23836 merged miss-islington, 2020-12-18 13:06
Messages (16)
msg181495 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2013-02-06 02:08
The multiprocessing module currently provides "multiprocessing.dummy.ThreadPool", which exposes the same API as the public multiprocessing.Pool but is implemented with threads rather than processes. (This is sort of documented: its existence is implied by the documentation of multiprocessing.dummy, but that documentation doesn't spell out "hey, stdlib ThreadPool implementation!".)

Given that this feature is likely useful to many people for parallelising IO bound tasks without migrating to the concurrent.futures API (or where that API doesn't quite fit the use case), it makes sense to make it a more clearly documented feature under a less surprising name.
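
As an illustration of the use case described above (a minimal sketch, not code from this issue), a thread pool can spread blocking calls across a few workers. The function `fake_fetch` and its sleep are hypothetical stand-ins for real IO; the import path `multiprocessing.pool.ThreadPool` is the one this issue eventually documented.

```python
import time
from multiprocessing.pool import ThreadPool


def fake_fetch(item):
    # Hypothetical stand-in for a blocking IO call (network, disk, ...).
    time.sleep(0.05)
    return item * 2


def fetch_all(items, workers=4):
    # The pool is a context manager; map() blocks until every result is
    # available and returns them in input order.
    with ThreadPool(processes=workers) as pool:
        return pool.map(fake_fetch, items)


results = fetch_all(range(8))
```

Because the workers are threads, the arguments and results do not need to be picklable, unlike with the process-based Pool.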

I haven't looked at the implementation, so I'm not sure how easy it will be to migrate it to a different module, but threading seems like a logical choice given the multiprocessing.Pool vs threading.ThreadPool parallel.

(Honestly, I'd be happier if we moved queue.Queue to threading as well. Having a threading-specific data type as a top-level module in its own right just makes it harder for people to find, for no real reason other than a historical accident.)

Alternatively, we could add a "concurrent.pool" module which was little more than:

from multiprocessing import Pool as ProcessPool
from multiprocessing.dummy import ThreadPool
msg189742 - (view) Author: Richard Oudkerk (sbt) * (Python committer) Date: 2013-05-21 10:50
Given that the change could only be made to 3.4, and we already have concurrent.futures.ThreadPoolExecutor, I am not sure there is much point to such a change now.
msg189744 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2013-05-21 11:16
Thread Pools can be handy when you want to do explicit message passing, rather than the call-and-response model favoured by the futures module.
msg189745 - (view) Author: Richard Oudkerk (sbt) * (Python committer) Date: 2013-05-21 11:56
I don't understand what you mean by "explicit message passing" and
"call-and-response model".
msg189746 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2013-05-21 12:21
Futures are explicitly about kicking off a concurrent call and waiting for a reply. They're great for master/slave and client/server models, but not particularly good for actors and other forms of peer-to-peer message passing.

For the latter, explicit pools and message queues are still the way to go, and that's why I think a concurrent.pool module may still be useful as a more obvious entry point for the thread pool implementation.
msg189751 - (view) Author: Richard Oudkerk (sbt) * (Python committer) Date: 2013-05-21 13:26
As far as I can see they are mostly equivalent.  For instance, ApplyResult (the type returned by Pool.apply_async()) is virtually the same as a Future.
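
The similarity between the two result types can be seen in a small side-by-side sketch. This is an illustration only; the helper `square` is a made-up example function.

```python
from concurrent.futures import ThreadPoolExecutor
from multiprocessing.pool import ThreadPool


def square(x):
    return x * x


# Pool style: apply_async() returns a result handle; get() blocks for the value.
with ThreadPool(processes=2) as pool:
    async_result = pool.apply_async(square, (7,))
    pool_value = async_result.get(timeout=5)

# Executor style: submit() returns a Future; result() blocks for the value.
with ThreadPoolExecutor(max_workers=2) as executor:
    future = executor.submit(square, 7)
    executor_value = future.result(timeout=5)
```

Both handles block until the call finishes and re-raise any exception from the worker when the value is requested.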

When you say "explicit message passing", do you mean creating a queue and making the worker tasks put results on that queue?  Why can't you do the same with ThreadPoolExecutor?
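
For reference, the queue-based variant asked about here might look like the following with ThreadPoolExecutor. This is a sketch under the assumption that the tasks report only through a shared queue; `work` and `results` are hypothetical names.

```python
import queue
from concurrent.futures import ThreadPoolExecutor

results = queue.Queue()


def work(n):
    # Report through the shared queue instead of the return value.
    results.put((n, n + 100))


with ThreadPoolExecutor(max_workers=3) as executor:
    for n in range(5):
        executor.submit(work, n)  # the returned Future is deliberately ignored
# Leaving the with-block waits for all submitted tasks to finish.

# Completion order is not guaranteed, so sort before comparing.
collected = sorted(results.get_nowait() for _ in range(5))
```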
msg189755 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2013-05-21 13:43
No, I mean implementing communicating sequential processes with independent state machines passing messages to each other. There aren't necessarily any fixed request/reply pairs. Each actor has a "mailbox", which is a queue that you dump its messages into. If you want a reply, you'll include some kind of addressing info to get the answer back rather than receiving it back on the channel you used to send the message.

For threads, the addressing info can just be a queue.Queue reference for your own mailbox; for multiple processes, it can be either a multiprocessing queue or any other form of IPC.

It's a very different architecture from that assumed by futures, so you need to drop down to the pool layer rather than using the executor model.
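
The mailbox pattern described in this message could be sketched as follows for threads. This is one possible reading, not code from the discussion: the actor functions `doubler` and `client`, the `STOP` sentinel, and the message layout `(payload, reply_to)` are all assumptions made for illustration.

```python
import queue
from multiprocessing.pool import ThreadPool

STOP = object()  # sentinel telling an actor to leave its loop


def doubler(mailbox):
    # Actor: reads (payload, reply_to) messages and answers on reply_to,
    # not on the channel the request arrived on.
    while True:
        message = mailbox.get()
        if message is STOP:
            return
        payload, reply_to = message
        reply_to.put(("doubled", payload * 2))


def client(own_mailbox, doubler_mailbox, values):
    # Actor: sends requests that carry its own mailbox as the return address.
    for value in values:
        doubler_mailbox.put((value, own_mailbox))
    replies = [own_mailbox.get(timeout=5) for _ in values]
    doubler_mailbox.put(STOP)
    return replies


doubler_box, client_box = queue.Queue(), queue.Queue()
# Two workers, so both actors can run at the same time.
with ThreadPool(processes=2) as pool:
    doubler_done = pool.apply_async(doubler, (doubler_box,))
    client_done = pool.apply_async(client, (client_box, doubler_box, [1, 2, 3]))
    replies = client_done.get(timeout=5)
    doubler_done.get(timeout=5)
```

The return value of `client` is used here only to get the collected replies out of the demo; the actors themselves communicate solely through their mailboxes.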
msg189759 - (view) Author: Richard Oudkerk (sbt) * (Python committer) Date: 2013-05-21 14:16
> It's a very different architecture from that assumed by futures,
> so you need to drop down to the pool layer rather than using the
> executor model.

AIUI a ThreadPoolExecutor object (which must be explicitly created)
represents a thread/process pool, and it is used to send tasks to the 
workers in the pool.  And if you want to ignore the future object 
returned by submit(), then you can.  How is that any different from a 
ThreadPool object?

And if you are implementing actors on top of a thread pool then isn't
there a limit on the number of "active" actors there can be at any one
time, potentially creating deadlocks because all workers are waiting for
messages from an actor which cannot run yet?  (I am probably
misunderstanding what you mean.)
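
The starvation concern raised above can be reproduced in a small sketch. This is an illustration with hypothetical task names (`waiter`, `sender`); a timeout stands in for what would otherwise be a permanent hang.

```python
import queue
from multiprocessing.pool import ThreadPool


def waiter(mailbox):
    # Occupies a worker while waiting; the timeout keeps the demo from hanging.
    try:
        return mailbox.get(timeout=0.5)
    except queue.Empty:
        return "starved"


def sender(mailbox):
    mailbox.put("hello")
    return "sent"


def run(workers):
    mailbox = queue.Queue()
    with ThreadPool(processes=workers) as pool:
        waiting = pool.apply_async(waiter, (mailbox,))
        sending = pool.apply_async(sender, (mailbox,))
        return waiting.get(timeout=5), sending.get(timeout=5)


one_worker = run(1)   # sender cannot start until waiter gives up
two_workers = run(2)  # sender runs alongside waiter
```

With a single worker the waiting task holds the only thread, so the sender runs only after the wait has timed out; with two workers the message arrives in time.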

To me, the obvious way to implement actors would be to create one 
thread/process for each actor.  In Python 3.4 one could use the tulip 
equivalents instead for better scalability.
msg189798 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2013-05-22 08:15
Actors are just as vulnerable to the "new threads/processes are expensive" issue as anything else, and by using a dynamic pool appropriately you can amortise those costs across multiple instances.

The point is to expose a less opinionated threading model in a more readily accessible way. Executors and futures are *very* opinionated about the communication channels you're expected to use (the ones the executor provides), while pools are just a resource management tool.
msg189800 - (view) Author: Richard Oudkerk (sbt) * (Python committer) Date: 2013-05-22 09:23
I understand that a thread pool (in the general sense) might be used to amortise the cost.  But I think you would probably have to write this from scratch rather than use the ThreadPool API.

The ThreadPool API does not really expose anything that the ThreadPoolExecutor API does not -- the differences are just a matter of taste.
msg225513 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2014-08-19 03:27
After a question from Brandon Rhodes, I noticed that ThreadPool is actually listed in multiprocessing.pool.__all__.
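
This observation can be checked directly in an interpreter. The snippet below is a small sketch; the variable names are arbitrary.

```python
import multiprocessing.dummy
import multiprocessing.pool

# ThreadPool is part of the public names exported by multiprocessing.pool.
listed = "ThreadPool" in multiprocessing.pool.__all__

# multiprocessing.dummy.Pool() is a factory that returns a ThreadPool instance.
with multiprocessing.dummy.Pool(2) as dummy_pool:
    same_class = isinstance(dummy_pool, multiprocessing.pool.ThreadPool)
```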

So rather than doing anything more dramatic, we should just document the existing multiprocessing feature.

As Richard says, the concurrent.futures Executor already provides a general purpose thread and process pooling model, and when that isn't appropriate, something like asyncio or gevent may actually be a better fit anyway.
msg383297 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2020-12-18 13:06
New changeset 84ebcf271a2cc8bfd1762acb279502b8b6ef236e by Matt Wozniski in branch 'master':
bpo-17140: Document multiprocessing's ThreadPool (GH-23812)
https://github.com/python/cpython/commit/84ebcf271a2cc8bfd1762acb279502b8b6ef236e
msg383298 - (view) Author: miss-islington (miss-islington) Date: 2020-12-18 13:27
New changeset 14619924c36435e356135988c244cbc28652c82b by Miss Islington (bot) in branch '3.9':
bpo-17140: Document multiprocessing's ThreadPool (GH-23812)
https://github.com/python/cpython/commit/14619924c36435e356135988c244cbc28652c82b
msg383315 - (view) Author: miss-islington (miss-islington) Date: 2020-12-18 18:38
New changeset d21d29ab5b8741da056ac09c49c759b6ccbf264a by Miss Islington (bot) in branch '3.8':
[3.8] bpo-17140: Document multiprocessing's ThreadPool (GH-23812) (GH-23835)
https://github.com/python/cpython/commit/d21d29ab5b8741da056ac09c49c759b6ccbf264a
msg383316 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2020-12-18 18:38
New changeset 00278d4e616315e64557bff014574c079e6e96ff by Miss Islington (bot) in branch '3.7':
bpo-17140: Document multiprocessing's ThreadPool (GH-23812) (GH-23836)
https://github.com/python/cpython/commit/00278d4e616315e64557bff014574c079e6e96ff
msg383317 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2020-12-18 18:42
Thanks, Matt, for the documentation PR.
History
Date User Action Args
2022-04-11 14:57:41 admin set github: 61342
2020-12-19 01:00:59 Tilka set nosy: - Tilka
2020-12-18 18:42:54 ned.deily set status: open -> closed
versions: + Python 3.8, Python 3.9, Python 3.10, - Python 3.5, Python 3.6
messages: + msg383317
resolution: fixed
stage: patch review -> resolved
2020-12-18 18:38:57 ned.deily set nosy: + ned.deily
messages: + msg383316
2020-12-18 18:38:05 miss-islington set messages: + msg383315
2020-12-18 13:27:06 miss-islington set messages: + msg383298
2020-12-18 13:06:25 miss-islington set pull_requests: + pull_request22695
2020-12-18 13:06:13 miss-islington set pull_requests: + pull_request22694
2020-12-18 13:06:06 miss-islington set nosy: + miss-islington
pull_requests: + pull_request22693
2020-12-18 13:06:04 pablogsal set nosy: + pablogsal
messages: + msg383297
2020-12-17 02:12:34 godlygeek set keywords: + patch
nosy: + godlygeek
pull_requests: + pull_request22673
stage: needs patch -> patch review
2017-07-22 22:03:43 pitrou set assignee: docs@python
nosy: + docs@python
components: + Documentation, - Library (Lib)
versions: + Python 3.6, Python 3.7, - Python 3.4
2015-02-28 15:42:58 davin set nosy: + davin
2014-08-19 03:27:51 ncoghlan set nosy: + brandon-rhodes
versions: + Python 3.5
messages: + msg225513
keywords: + easy
title: Provide a more obvious public ThreadPool API -> Document multiprocessing.pool.ThreadPool
2014-01-14 13:44:20 srodriguez set nosy: + srodriguez
2013-05-22 09:23:10 sbt set messages: + msg189800
2013-05-22 08:15:10 ncoghlan set messages: + msg189798
2013-05-21 14:16:28 sbt set messages: + msg189759
2013-05-21 13:43:28 ncoghlan set messages: + msg189755
2013-05-21 13:26:48 sbt set messages: + msg189751
2013-05-21 12:21:37 ncoghlan set messages: + msg189746
2013-05-21 11:56:39 sbt set messages: + msg189745
2013-05-21 11:16:06 ncoghlan set messages: + msg189744
2013-05-21 10:50:57 sbt set nosy: + sbt
messages: + msg189742
2013-05-21 07:43:07 Tilka set nosy: + Tilka
2013-02-06 02:08:05 ncoghlan create