This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: multiprocessing.Pool.imaps iterators do not maintain alive the multiprocessing.Pool objects
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.8
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: davin, ericvw, pablogsal, pitrou, tzickel, vstinner
Priority: normal Keywords: patch

Created on 2018-12-02 16:43 by pablogsal, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 10852 closed pablogsal, 2018-12-02 20:33
PR 11627 merged pablogsal, 2019-01-21 02:13
PR 11627 merged pablogsal, 2019-01-21 02:13
PR 11627 merged pablogsal, 2019-01-21 02:13
Messages (22)
msg330890 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2018-12-02 16:43
After applying the PRs in issue34172, multiprocessing.Pool.imap hangs on MacOs and Linux. This is a simple reproducer:

import multiprocessing

def the_test():
    print("Begin")
    for x in multiprocessing.Pool().imap(int,
            ["4", "3"]):
        print(x)
    print("End")

the_test()

This happens because the IMapIterator does not maintain alive the multiprocessing.Pool object while it is still alive.
msg330893 - (view) Author: (tzickel) * Date: 2018-12-02 17:45
It's important to note that before those PR, that code would leak the Pool instance until the process ends (once per call).

https://github.com/python/cpython/compare/master...tzickel:fix34172

Is my proposed fix (till I get it to a PR).
msg330895 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2018-12-02 18:32
I was working already on a PR. Do you prefer to wait for yours instead?
msg330896 - (view) Author: (tzickel) * Date: 2018-12-02 18:49
I dont mind, I think my code is ready for review, but I'm not versed in this, so if you think you have something better, feel free to open a PR or tell me if I should submit mine, and you can comment on it:

https://github.com/python/cpython/compare/master...tzickel:fix34172
msg330899 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2018-12-02 20:30
I think the code is almost ok (I also prefer to also use the cache as an excuse to maintain the pool alive) but the test needs to be done a bit more carefully to avoid hanging the test suite in case of failure and to avoid leaking threads or processes.
msg331214 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2018-12-06 09:31
Let's step back a bit here.  This kind of code has never been supported.  As Victor says, we should be careful not to add any potential sources of reference cycles.

The reason the code originally "worked" is that it actually leaked the Pool object (and presumably its worker threads and processes) until the end of the application.
msg331224 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2018-12-06 11:12
Although I think keeping the iterator is not a bad solution if done correctly, I think more and more that is not the best solution.

@Antoine, would you be ok passing a weak reference to the iterator and raising if the pool is dead?

I still think we should try to avoid hanging.
msg331225 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-12-06 11:16
In general, I'm fine with adding a *weak* reference, especially if it helps to detect bugs.
msg331226 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2018-12-06 11:17
I think a weakref is fine.  You don't have to *pass* a weakref: just pass a normal ref and let the Result object take a weakref (and a reference to the cache perhaps).
msg331227 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-12-06 11:17
See https://bugs.python.org/issue34172#msg331216 for a more general discussion about how the multiprocessing API is supposed tobe used and multiprocessing objects lifetime.
msg331228 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2018-12-06 11:17
Ok, I will modify my PR to pass a weak reference and raise.
msg331229 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2018-12-06 11:18
I agree about making the multiprocessing API safer to use, but please let's have this discussion on a dedicated bug entry.
msg331442 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2018-12-09 17:41
I am playing with weakreferences inside the iterator objects, but this may not be enough. For example, take the code of ApplyResult.get:

def get(self, timeout=None):
    if self._pool() is None:
        raise RuntimeError("The pool is dead!") <--- new code
    self.wait(timeout)

It can be that the pool is alive when we check for it (self._pool() is None) but while the code is waiting with no timeout, the pool dies, effectively leaving the program deadlocked with no error.
msg331444 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2018-12-09 17:44
Also, making tests that do not leak threads for this case is very difficult as if we go with weakrefs, the test *has* to leak a pool (the pool is dead but never calls join/close) to check that when you use the iterator the exception happens. Also, getting such test race-free is going to be challenging.
msg331452 - (view) Author: (tzickel) * Date: 2018-12-09 20:20
https://bugs.python.org/issue35267
msg331725 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-12-13 00:31
See also bpo-35478: multiprocessing: ApplyResult.get() hangs if the pool is terminated.
msg333012 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-01-04 22:38
bpo-35629 has been marked as a duplicate of this issue.
msg333573 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2019-01-13 22:04
I have been playing with possible solutions for a while and the weak-reference solution seems not robust enough as there are too potential race conditions between the destruction of the weakreferences (the pool) and the handling code.

I would advocate again for using a strong reference. The reasons are:

* The rest of the stdlib is using this solution to link iterator objects and similar to their parents (lists, dicts, sets...etc all have strong references back to their parents).

* This solution is backwards compatible with the current behaviour.

* We have the new ResourceWarnigns to make clear that this behaviour is not supported.
msg333582 - (view) Author: (tzickel) * Date: 2019-01-14 05:19
+1
msg333594 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-01-14 09:16
According to all previous discussions, I now agree with having a strong reference.
msg335242 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2019-02-11 17:29
New changeset 3766f18c524c57784eea7c0001602017d2122156 by Pablo Galindo in branch 'master':
bpo-35378: Fix multiprocessing.Pool references (GH-11627)
https://github.com/python/cpython/commit/3766f18c524c57784eea7c0001602017d2122156
msg335358 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-02-12 19:39
Thanks Pablo for the fix ;-)
History
Date User Action Args
2022-04-11 14:59:08adminsetgithub: 79559
2019-02-12 19:39:32vstinnersetmessages: + msg335358
2019-02-11 17:47:40pablogsalsetstatus: open -> closed
resolution: fixed
stage: patch review -> resolved
2019-02-11 17:29:06pablogsalsetmessages: + msg335242
2019-01-21 02:14:07pablogsalsetpull_requests: + pull_request11394
2019-01-21 02:13:57pablogsalsetpull_requests: + pull_request11393
2019-01-21 02:13:46pablogsalsetpull_requests: + pull_request11392
2019-01-14 09:16:53vstinnersetmessages: + msg333594
2019-01-14 05:34:17rhettingersetnosy: + davin
2019-01-14 05:19:11tzickelsetmessages: + msg333582
2019-01-13 22:04:00pablogsalsetmessages: + msg333573
2019-01-04 22:38:23vstinnersetmessages: + msg333012
2019-01-04 19:44:32pitroulinkissue35629 superseder
2018-12-13 00:31:31vstinnersetmessages: + msg331725
2018-12-09 20:20:45tzickelsetmessages: + msg331452
2018-12-09 17:44:32pablogsalsetmessages: + msg331444
2018-12-09 17:41:37pablogsalsetmessages: + msg331442
2018-12-09 17:40:49pablogsalsetmessages: - msg331441
2018-12-09 17:40:27pablogsalsetmessages: + msg331441
2018-12-06 11:18:27pitrousetmessages: + msg331229
2018-12-06 11:17:48pablogsalsetmessages: + msg331228
2018-12-06 11:17:43vstinnersetmessages: + msg331227
2018-12-06 11:17:38pitrousetmessages: + msg331226
2018-12-06 11:16:24vstinnersetnosy: + vstinner
messages: + msg331225
2018-12-06 11:12:08pablogsalsetmessages: + msg331224
2018-12-06 09:31:39pitrousettype: behavior
messages: + msg331214
versions: - Python 3.7
2018-12-02 20:33:12pablogsalsetkeywords: + patch
stage: patch review
pull_requests: + pull_request10087
2018-12-02 20:30:49pablogsalsetmessages: + msg330899
2018-12-02 18:49:55tzickelsetmessages: + msg330896
2018-12-02 18:32:59pablogsalsetmessages: + msg330895
2018-12-02 18:06:47ericvwsetnosy: + ericvw
2018-12-02 17:45:36tzickelsetmessages: + msg330893
2018-12-02 16:43:32pablogsalcreate