Author Klamann
Recipients Klamann
Date 2017-05-09.21:37:02
Message-id <1494365822.25.0.431593996842.issue30323@psf.upfronthosting.co.za>
Content
The Executor's map() function accepts a function and an iterable that supplies the arguments for each call to that function. This iterable could be a generator, and as such it could reference data that won't fit into memory.

The behaviour I would expect is that the Executor requests the next element from the iterable whenever a worker (a thread or a process) is ready to make the next function call.

But what actually happens is that the entire iterable is consumed right after map() is called: a task is submitted for every element up front and the resulting futures are collected in a list, so any underlying generator loads all referenced data into memory. Here's where the list gets built from the iterable:
https://github.com/python/cpython/blob/3.6/Lib/concurrent/futures/_base.py#L548
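
To make the eager behaviour concrete, here is a simplified model of what the linked line does. It is a paraphrase for illustration, not the actual standard library code: the name eager_map is hypothetical, and timeout handling and cancellation of pending futures are left out.

```python
from concurrent.futures import ThreadPoolExecutor


def eager_map(executor, fn, *iterables):
    """Simplified model of the eager submission (hypothetical helper)."""
    # The list comprehension runs to completion before eager_map returns,
    # so every input iterable is fully consumed at this point.
    futures = [executor.submit(fn, *args) for args in zip(*iterables)]

    def result_iterator():
        # Results are handed out lazily, in input order, but all the
        # tasks were already submitted above.
        for future in futures:
            yield future.result()

    return result_iterator()


if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=2) as executor:
        for value in eager_map(executor, lambda x: x + 1, range(3)):
            print(value)
```

Only the results are produced lazily in this model; the inputs are all read and submitted before the caller receives the result iterator.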

The way I see it, there's no reason to convert the iterable to a list in the map function (or any other place in the Executor). Just replacing the list comprehension with a generator expression would probably fix that.
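
As a rough sketch of the lazy behaviour described above, the helper below keeps only a bounded number of tasks submitted at any time and pulls a new input only after a result has been collected. The names lazy_map and max_pending are hypothetical and not part of concurrent.futures; this is one possible approach under those assumptions, not the generator-expression change suggested here and not how the standard library behaves.

```python
import collections
import itertools
from concurrent.futures import ThreadPoolExecutor


def lazy_map(executor, fn, iterable, max_pending=4):
    """Yield fn(item) in input order with at most max_pending tasks in flight.

    Hypothetical helper for illustration; not a standard library API.
    """
    iterator = iter(iterable)
    # Prime the window: submit at most max_pending tasks up front.
    pending = collections.deque(
        executor.submit(fn, item)
        for item in itertools.islice(iterator, max_pending)
    )
    while pending:
        # Wait for the oldest task so results keep the input order.
        result = pending.popleft().result()
        # Pull exactly one more input (if any) to refill the window.
        for item in itertools.islice(iterator, 1):
            pending.append(executor.submit(fn, item))
        yield result


if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=2) as executor:
        for value in lazy_map(executor, lambda x: x * x, range(5)):
            print(value)
```

Because lazy_map is itself a generator, nothing is read from the input until the caller starts iterating, and memory use is bounded by max_pending rather than by the length of the input. The trade-off is that a slow task at the head of the window can delay submission of further inputs.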


Here's an example that illustrates the issue:

    from concurrent.futures import ThreadPoolExecutor
    import time
    
    def generate():
        for i in range(10):
            print("generating input", i)
            yield i
    
    def work(i):
        print("working on input", i)
        time.sleep(1)
    
    with ThreadPoolExecutor(max_workers=2) as executor:
        generator = generate()
        executor.map(work, generator)

The output is:

    generating input 0
    working on input 0
    generating input 1
    working on input 1
    generating input 2
    generating input 3
    generating input 4
    generating input 5
    generating input 6
    generating input 7
    generating input 8
    generating input 9
    working on input 2
    working on input 3
    working on input 4
    working on input 5
    working on input 6
    working on input 7
    working on input 8
    working on input 9

Ideally, the "generating" and "working" lines should alternate throughout, but currently all remaining input is generated immediately, before the first two tasks have finished.
History
Date                 User     Action  Args
2017-05-09 21:37:02  Klamann  set     recipients: + Klamann
2017-05-09 21:37:02  Klamann  set     messageid: <1494365822.25.0.431593996842.issue30323@psf.upfronthosting.co.za>
2017-05-09 21:37:02  Klamann  link    issue30323 messages
2017-05-09 21:37:02  Klamann  create