
Author tim.peters
Recipients kousu, rhettinger, tim.peters
Date 2020-03-31.23:58:43
Message-id <1585699123.56.0.555009142308.issue40110@roundup.psfhosted.org>
Content
"Lazy" has several possible aspects, of which Pool.imap() satisfies some:

- Its iterable argument is materialized one object at a time.

- It delivers results one at a time.

So, for example, if `worker` is a function that takes a single int, then

    pool = multiprocessing.Pool(4)
    for i in pool.imap(worker, itertools.count(1)):
        print(i)

works fine, even though both the iterable argument and the result sequence are "infinite".
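
A self-contained variant of that loop, as a minimal sketch rather than anything definitive. Assumptions that are not in the original: `worker` is a hypothetical squaring function, `first_results` is a hypothetical helper, `itertools.islice` cuts the infinite result stream off after a few items, and `ThreadPool` (which inherits `imap()` from `Pool`) stands in for `multiprocessing.Pool` so the snippet runs without a `__main__` guard.

```python
import itertools
from multiprocessing.pool import ThreadPool  # same imap() code as Pool, thread-backed


def worker(n):
    # Hypothetical worker: takes a single int, returns its square.
    return n * n


def first_results(k):
    """Return the first k results of imap() over an infinite iterable."""
    with ThreadPool(4) as pool:
        results = pool.imap(worker, itertools.count(1))
        # Results arrive one at a time and in order, so taking a
        # finite prefix of the "infinite" result sequence is fine.
        return list(itertools.islice(results, k))


print(first_results(5))
```

Leaving the `with` block terminates the pool, which is what stops the background consumption of `itertools.count(1)`.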

You seem to have something stricter in mind: that the iterable isn't advanced unless it absolutely _needs_ to be advanced in order to deliver a result that's being demanded.  That's how, e.g., the builtin Python 3 `map()` works.

But if the iterable isn't advanced until the main program _demands_ the next result from imap(), then the main program blocks until the machinery peels off the next object from the iterable, picks a worker to send it to, sends it, waits for the worker to deliver the result back on an internal queue, then delivers the result to the main program.  There's no parallelism then.

The way things are now, imap() consumes the iterable as quickly as possible, keeping all workers as busy as possible, regardless of how quickly (or even whether) results are demanded.  And it seems to me that's overwhelmingly what people using multiprocessing would want.  In any case, that's what they _have_, so that couldn't be changed lightly (if at all).
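
The difference between the two behaviors can be observed directly. The following is a rough sketch, not a definitive test: `counting` and `pulled_after_first_result` are hypothetical helpers, `worker` is a made-up squaring function, and `ThreadPool` (which inherits `imap()` from `Pool`) is used in place of a process pool so the snippet runs without a `__main__` guard. It counts how many items have been pulled from the iterable after only the first result has been demanded.

```python
import time
from multiprocessing.pool import ThreadPool  # same imap() code as Pool, thread-backed


def worker(n):
    # Hypothetical worker: takes a single int, returns its square.
    return n * n


def counting(iterable, pulled):
    """Yield items from iterable, recording each item pulled in the list `pulled`."""
    for item in iterable:
        pulled.append(item)
        yield item


def pulled_after_first_result(use_pool, total=100, timeout=5.0):
    """Demand one result, then report how many inputs were consumed."""
    pulled = []
    source = counting(range(total), pulled)
    if not use_pool:
        results = map(worker, source)  # builtin map(): advanced only on demand
        next(results)
        return len(pulled)
    with ThreadPool(4) as pool:
        results = pool.imap(worker, source)
        next(results)  # demand exactly one result
        # Give the pool's task-feeding thread time to drain the iterable.
        deadline = time.monotonic() + timeout
        while len(pulled) < total and time.monotonic() < deadline:
            time.sleep(0.01)
        return len(pulled)


print(pulled_after_first_result(use_pool=False))  # builtin map(): 1
print(pulled_after_first_result(use_pool=True))   # imap(): all 100
```

With builtin `map()` exactly one input is consumed; with `imap()` the whole iterable is consumed even though only one result was ever requested.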

Perhaps it would be more profitable to think about ways to implement your pipelines using other primitives?  For example, the first thing I'd try for an N-stage pipeline is a chain of N processes (not in a Pool) connected one to the next by queues.  If for some reason I was determined not to let any process "get ahead", easy - specify a max size of 1 for the queues.  map-like facilities are inherently SIMD style, but pipelines typically have very different code in different stages.
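
One possible shape for such a chain, as a hedged sketch under stated assumptions rather than a recommended implementation. Everything named here is hypothetical and not from the original: `stage`, `run_pipeline`, `add_one` and `double` are made up, `None` is used as an end-of-stream sentinel (so `None` cannot be a data item), and a feeder thread in the main process supplies the first queue so the main program can collect results without deadlocking against the size-1 queues.

```python
import multiprocessing as mp
import threading


def stage(func, inbox, outbox):
    """Run one pipeline stage: apply func to each item until the sentinel arrives."""
    while True:
        item = inbox.get()
        if item is None:      # assumed end-of-stream sentinel
            outbox.put(None)  # pass the shutdown signal downstream
            break
        outbox.put(func(item))


def add_one(x):
    return x + 1


def double(x):
    return x * 2


def run_pipeline(funcs, items):
    """Chain len(funcs) processes with size-1 queues and collect the output."""
    # N stages need N + 1 queues; maxsize=1 keeps any stage from getting ahead.
    queues = [mp.Queue(maxsize=1) for _ in range(len(funcs) + 1)]
    procs = [mp.Process(target=stage, args=(func, queues[i], queues[i + 1]))
             for i, func in enumerate(funcs)]
    for proc in procs:
        proc.start()

    def feed():
        for item in items:
            queues[0].put(item)  # blocks while the first stage is busy
        queues[0].put(None)

    feeder = threading.Thread(target=feed)
    feeder.start()
    results = list(iter(queues[-1].get, None))  # read until the sentinel
    feeder.join()
    for proc in procs:
        proc.join()
    return results


if __name__ == "__main__":
    print(run_pipeline([add_one, double], range(5)))  # [2, 4, 6, 8, 10]
```

Each stage is a single process reading a FIFO queue, so results come out in input order, and each stage can run entirely different code from its neighbors.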
History
Date                 User        Action  Args
2020-03-31 23:58:43  tim.peters  set     recipients: + tim.peters, rhettinger, kousu
2020-03-31 23:58:43  tim.peters  set     messageid: <1585699123.56.0.555009142308.issue40110@roundup.psfhosted.org>
2020-03-31 23:58:43  tim.peters  link    issue40110 messages
2020-03-31 23:58:43  tim.peters  create