
Author eric.snow
Recipients Ben.Darnell, aeros, eric.snow, pitrou, sa, vstinner
Date 2022-02-03.21:27:24
Message-id <1643923644.64.0.270277458291.issue41962@roundup.psfhosted.org>
Content
FWIW, here's a brain dump about ThreadPoolExecutor and its atexit handler after having looked at the code.

----

First, the relationship between the objects involved:

* work item -> Future
* work item -> task (e.g. function)
* queue -> [work item]
* worker -> executor
* worker -> queue
* worker -> currently running work item
* thread -> worker
* ThreadPoolExecutor -> [thread]
* ThreadPoolExecutor -> queue
* global state -> {thread: queue}

Observations:

* a work item's future and task are not aware of each other, and operations on either have no effect on the other
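
To make that last observation concrete, here is a rough sketch of a work item. `WorkItem` is a hypothetical stand-in, not the class the module actually uses; only the `Future` methods are real API. The work item is the only thing that holds both the Future and the task, so it alone connects the two.

```python
from concurrent.futures import Future


class WorkItem:
    """Hypothetical stand-in for the executor's internal work item."""

    def __init__(self, future, fn, args, kwargs):
        self.future = future   # handed back to the submit() caller
        self.fn = fn           # the task; it never sees the Future
        self.args = args
        self.kwargs = kwargs

    def run(self):
        # If the Future was cancelled while queued, skip the task entirely.
        if not self.future.set_running_or_notify_cancel():
            return
        try:
            result = self.fn(*self.args, **self.kwargs)
        except BaseException as exc:
            self.future.set_exception(exc)
        else:
            self.future.set_result(result)


item = WorkItem(Future(), pow, (2, 10), {})
item.run()
print(item.future.result())  # 1024

skipped = WorkItem(Future(), pow, (2, 10), {})
skipped.future.cancel()      # still pending, so this succeeds
skipped.run()                # the task is never called
print(skipped.future.cancelled())  # True
```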

----

Next, let's look at the relevant ways the objects can be used:

* publicly
   * ThreadPoolExecutor.submit(task) -> Future
   * ThreadPoolExecutor.shutdown()
   * Future.result() and Future.exception()
   * Future.cancel()
   * Future.add_done_callback()
* internally
   * queue.pop() -> work item
   * <work item>.run()
   * thread.join()
   * Future.set_running_or_notify_cancel()
   * Future.set_result() and Future.set_exception()

Observations:

* nothing interacts with a worker directly; it is waited on via its thread and it receives work (or None) via the queue it was given
* once a worker pops the next work item off the queue, nothing else has access to that work item (the original ThreadPoolExecutor().submit() caller has the Future, but that's it)
* there is no cancelation mechanism for tasks
* there is no simple way to interact with the queued tasks
* realistically, there is no way to interact with the currently running task
* there is no reliable way to "kill" a thread

----

Regarding how tasks run:

* ThreadPoolExecutor.submit(task) -> Future
* ThreadPoolExecutor.submit(task) -> work item (Future, task) -> queue
* ThreadPoolExecutor.submit(task) -> thread (worker)
* thread -> worker -> ( queue -> work item -> task )

Observations:

* the worker loop exits if the next item in the queue is None (and the executor is shutting down)
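
Roughly, the worker loop looks like the following sketch. This is a simplification under assumptions: here `None` always means "stop", whereas the real worker also checks the shutdown flags, and a plain tuple stands in for the work item. The function and variable names are made up for the example.

```python
import queue
import threading
from concurrent.futures import Future


def worker(work_queue):
    """Simplified worker loop: run work items until None shows up."""
    while True:
        item = work_queue.get()
        if item is None:
            # Pass the sentinel on so any other worker sharing the
            # queue also wakes up and exits.
            work_queue.put(None)
            return
        future, fn, args = item
        if not future.set_running_or_notify_cancel():
            continue  # cancelled while queued; never run the task
        try:
            result = fn(*args)
        except BaseException as exc:
            future.set_exception(exc)
        else:
            future.set_result(result)


work_queue = queue.SimpleQueue()
thread = threading.Thread(target=worker, args=(work_queue,))
thread.start()

future = Future()
work_queue.put((future, sum, ([1, 2, 3],)))
work_queue.put(None)   # tell the worker to stop once it gets here
thread.join()
print(future.result())  # 6
```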

----

Now let's look more closely at the atexit handler.

* as you noted, since 3.9 it is registered with threading._register_atexit() instead of atexit.register()
* the threading atexit handlers run before the regular atexit handlers
* the ThreadPoolExecutor handler does not actually interact with ThreadPoolExecutor instances directly
* it only deals with the module-global {thread: queue} mapping, to which ThreadPoolExecutor instances add entries as they start worker threads

The handler does the following:

1. disables ThreadPoolExecutor.submit() (for all instances)
2. (indirectly) tells each worker to stop ASAP
3. lets every pending task run (and lets every running task keep running)
4. waits until all tasks have finished

It does not:

* call any ThreadPoolExecutor.shutdown()
* otherwise deal with the ThreadPoolExecutor instances directly
* call Future.cancel() for any of the tasks
* use any timeout in step 4, so it may block forever
* notify tasks that they should finish
* deal well with any long-running (or infinite) task
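
The four steps can be sketched with a self-contained simulation. Everything here (`threads_queues`, `shutting_down`, `submit`, `exit_handler`, `worker`) is a hypothetical stand-in for the module's internals, and the handler is called by hand rather than registered anywhere.

```python
import queue
import threading

threads_queues = {}    # stand-in for the global {thread: queue} state
shutting_down = False
results = []


def submit(work_queue, task):
    if shutting_down:
        raise RuntimeError("cannot schedule new tasks after shutdown")
    work_queue.put(task)


def worker(work_queue):
    while True:
        task = work_queue.get()
        if task is None:
            return
        results.append(task())


def exit_handler():
    global shutting_down
    shutting_down = True                  # step 1: disable submit()
    items = list(threads_queues.items())
    for _thread, work_queue in items:
        work_queue.put(None)              # step 2: tell each worker to stop
    for thread, _work_queue in items:
        thread.join()                     # steps 3 and 4: no timeout here


work_queue = queue.SimpleQueue()
thread = threading.Thread(target=worker, args=(work_queue,))
threads_queues[thread] = work_queue
thread.start()
for n in range(3):
    submit(work_queue, lambda n=n: n * n)

exit_handler()
print(results)  # [0, 1, 4]
```

Since the `None` goes to the back of the queue, every task queued before it still runs. A task that never returns would make the `join()` block forever.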

ThreadPoolExecutor.shutdown() does basically the same thing, but only for its own tasks.  With cancel_futures=True it also calls Future.cancel() for each queued task (right before step 2); all that does is keep any queued-but-not-running tasks from starting.  With wait=False it skips step 4.
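
For illustration, here is what the two shutdown() options look like in practice (Python 3.9+, since that is when `cancel_futures` was added). The events and task names are only scaffolding for the example.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

started = threading.Event()
release = threading.Event()


def blocking_task():
    started.set()
    release.wait()
    return "done"


executor = ThreadPoolExecutor(max_workers=1)
running = executor.submit(blocking_task)
started.wait()                       # the only worker is now busy
queued = executor.submit(print, "never printed")

# Cancel whatever is still queued and return without joining the worker.
executor.shutdown(wait=False, cancel_futures=True)

queued_cancelled = queued.cancelled()   # the queued task will never start
still_running = not running.done()      # the running task is untouched

release.set()
print(queued_cancelled, still_running, running.result())  # True True done
```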