classification
Title: multiprocessing.Pool.imaps iterators do not maintain alive the multiprocessing.Pool objects
Type: behavior Stage: patch review
Components: Library (Lib) Versions: Python 3.8
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: ericvw, pablogsal, pitrou, tzickel, vstinner
Priority: normal Keywords: patch

Created on 2018-12-02 16:43 by pablogsal, last changed 2018-12-13 00:31 by vstinner.

Pull Requests
URL Status Linked Edit
PR 10852 closed pablogsal, 2018-12-02 20:33
Messages (16)
msg330890 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2018-12-02 16:43
After applying the PRs in issue34172, multiprocessing.Pool.imap hangs on MacOs and Linux. This is a simple reproducer:

import multiprocessing

def the_test():
    print("Begin")
    for x in multiprocessing.Pool().imap(int,
            ["4", "3"]):
        print(x)
    print("End")

the_test()

This happens because the IMapIterator does not maintain alive the multiprocessing.Pool object while it is still alive.
msg330893 - (view) Author: (tzickel) * Date: 2018-12-02 17:45
It's important to note that before those PR, that code would leak the Pool instance until the process ends (once per call).

https://github.com/python/cpython/compare/master...tzickel:fix34172

Is my proposed fix (till I get it to a PR).
msg330895 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2018-12-02 18:32
I was working already on a PR. Do you prefer to wait for yours instead?
msg330896 - (view) Author: (tzickel) * Date: 2018-12-02 18:49
I dont mind, I think my code is ready for review, but I'm not versed in this, so if you think you have something better, feel free to open a PR or tell me if I should submit mine, and you can comment on it:

https://github.com/python/cpython/compare/master...tzickel:fix34172
msg330899 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2018-12-02 20:30
I think the code is almost ok (I also prefer to also use the cache as an excuse to maintain the pool alive) but the test needs to be done a bit more carefully to avoid hanging the test suite in case of failure and to avoid leaking threads or processes.
msg331214 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2018-12-06 09:31
Let's step back a bit here.  This kind of code has never been supported.  As Victor says, we should be careful not to add any potential sources of reference cycles.

The reason the code originally "worked" is that it actually leaked the Pool object (and presumably its worker threads and processes) until the end of the application.
msg331224 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2018-12-06 11:12
Although I think keeping the iterator is not a bad solution if done correctly, I think more and more that is not the best solution.

@Antoine, would you be ok passing a weak reference to the iterator and raising if the pool is dead?

I still think we should try to avoid hanging.
msg331225 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-12-06 11:16
In general, I'm fine with adding a *weak* reference, especially if it helps to detect bugs.
msg331226 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2018-12-06 11:17
I think a weakref is fine.  You don't have to *pass* a weakref: just pass a normal ref and let the Result object take a weakref (and a reference to the cache perhaps).
msg331227 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-12-06 11:17
See https://bugs.python.org/issue34172#msg331216 for a more general discussion about how the multiprocessing API is supposed tobe used and multiprocessing objects lifetime.
msg331228 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2018-12-06 11:17
Ok, I will modify my PR to pass a weak reference and raise.
msg331229 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2018-12-06 11:18
I agree about making the multiprocessing API safer to use, but please let's have this discussion on a dedicated bug entry.
msg331442 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2018-12-09 17:41
I am playing with weakreferences inside the iterator objects, but this may not be enough. For example, take the code of ApplyResult.get:

def get(self, timeout=None):
    if self._pool() is None:
        raise RuntimeError("The pool is dead!") <--- new code
    self.wait(timeout)

It can be that the pool is alive when we check for it (self._pool() is None) but while the code is waiting with no timeout, the pool dies, effectively leaving the program deadlocked with no error.
msg331444 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2018-12-09 17:44
Also, making tests that do not leak threads for this case is very difficult as if we go with weakrefs, the test *has* to leak a pool (the pool is dead but never calls join/close) to check that when you use the iterator the exception happens. Also, getting such test race-free is going to be challenging.
msg331452 - (view) Author: (tzickel) * Date: 2018-12-09 20:20
https://bugs.python.org/issue35267
msg331725 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-12-13 00:31
See also bpo-35478: multiprocessing: ApplyResult.get() hangs if the pool is terminated.
History
Date User Action Args
2018-12-13 00:31:31vstinnersetmessages: + msg331725
2018-12-09 20:20:45tzickelsetmessages: + msg331452
2018-12-09 17:44:32pablogsalsetmessages: + msg331444
2018-12-09 17:41:37pablogsalsetmessages: + msg331442
2018-12-09 17:40:49pablogsalsetmessages: - msg331441
2018-12-09 17:40:27pablogsalsetmessages: + msg331441
2018-12-06 11:18:27pitrousetmessages: + msg331229
2018-12-06 11:17:48pablogsalsetmessages: + msg331228
2018-12-06 11:17:43vstinnersetmessages: + msg331227
2018-12-06 11:17:38pitrousetmessages: + msg331226
2018-12-06 11:16:24vstinnersetnosy: + vstinner
messages: + msg331225
2018-12-06 11:12:08pablogsalsetmessages: + msg331224
2018-12-06 09:31:39pitrousettype: behavior
messages: + msg331214
versions: - Python 3.7
2018-12-02 20:33:12pablogsalsetkeywords: + patch
stage: patch review
pull_requests: + pull_request10087
2018-12-02 20:30:49pablogsalsetmessages: + msg330899
2018-12-02 18:49:55tzickelsetmessages: + msg330896
2018-12-02 18:32:59pablogsalsetmessages: + msg330895
2018-12-02 18:06:47ericvwsetnosy: + ericvw
2018-12-02 17:45:36tzickelsetmessages: + msg330893
2018-12-02 16:43:32pablogsalcreate