classification
Title: [RFE] Add asyncio.background_call API
Type: enhancement
Stage: needs patch
Components: asyncio
Versions: Python 3.6

process
Status: closed
Resolution: wont fix
Dependencies:
Superseder:
Assigned To:
Nosy List: giampaolo.rodola, gvanrossum, ncoghlan, pitrou, r.david.murray, srkunze, vstinner, yselivanov
Priority: normal
Keywords:

Created on 2015-07-06 01:36 by ncoghlan, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Messages (22)
msg246342 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2015-07-06 01:36
Based on a current python-dev discussion, I'd like to suggest a high level convenience API in asyncio to dispatch a blocking call out to a separate thread or process:

    # Call blocking operation from asynchronous code
    def blocking_call(f, *args, **kwds):
        """Usage: result = await asyncio.blocking_call(f, *args, **kwds))"""
        cb = functools.partial(f, *args, **kwds)
        # None selects the event loop's default executor
        return asyncio.get_event_loop().run_in_executor(None, cb)

While that function is only a couple of lines long, it's *conceptually* very dense.

The aim would thus be to let folks safely make blocking calls from asyncio code without needing to first understand the intricacies of the event loop, the event loop's executor, or the need to wrap the call in functools.partial. Exploring those would instead become an optional exercise in understanding how asyncio.blocking_call works.
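
For reference, a minimal sketch of what the helper would be wrapping with today's API, assuming the loop's default (thread pool) executor; read_file is only an illustrative blocking function:

    # Sketch: dispatching a blocking call from a coroutine with the current API.
    import asyncio
    import functools

    def read_file(path, mode="rb"):
        # An ordinary blocking function.
        with open(path, mode) as f:
            return f.read()

    async def handler():
        loop = asyncio.get_event_loop()
        # None selects the loop's default executor; functools.partial is needed
        # because run_in_executor() does not accept keyword arguments.
        return await loop.run_in_executor(
            None, functools.partial(read_file, "data.bin", mode="rb"))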
msg246374 - (view) Author: Sven R. Kunze (srkunze) Date: 2015-07-06 18:05
Thanks for taking the initiative here, Nick. I created a follow-up on this: http://bugs.python.org/issue24578

In order to bridge both worlds, projects might need a convenient way from and to either world (classic and asyncio).
msg246375 - (view) Author: Sven R. Kunze (srkunze) Date: 2015-07-06 18:13
2 remarks:

1) I would rather go for a more comprehensible name such as 'get_awaitable' instead of 'blocking_call'. The latter reminds me of the execution of f, which is not the case.

2) there is a redundant ) at the end of """Usage: result = await asyncio.blocking_call(f, *args, **kwds))"""
msg246404 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2015-07-07 12:23
The concerns I have with "get_awaitable" are:

* it doesn't express user intent - the user doesn't care about getting an awaitable, they want to initiate a blocking call without holding up the current coroutine
* it's too broad - there are many other ways to get an awaitable, while this is specifically about being able to schedule a blocking call in another thread or process

If "blocking_call" reminds you of the execution of f, that's a good thing: this call immediately dispatches f for execution in another thread or process, and returns a future that lets you wait for the result later.
msg246405 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2015-07-07 12:31
After trying out some example code in issue 24578, I've changed the suggested function name to "call_async". The reason is that it makes this kind of code read quite well:

    futureB = asyncio.call_async(slow_io_bound_operation)
    futureC = asyncio.call_async(another_slow_io_bound_operation)
    a = calculateA()
    b = asyncio.wait_for_result(futureB)
    c = asyncio.wait_for_result(futureC)

And still reads well when combined with await:

    b = await asyncio.call_async(blocking_operation)
msg246407 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2015-07-07 12:53
You seem to miss that run_in_executor() does take *args -- so the partial() call is only needed if you need to pass keyword args. Is it really worth having a helper for this one-liner?

def call_async(func, *args):
    # None selects the default executor
    return asyncio.get_event_loop().run_in_executor(None, func, *args)

I'm on the fence myself. I do like the new name better.
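
A short sketch of the distinction, assuming the loop's default executor (download is a made-up blocking function):

    # Positional arguments pass straight through run_in_executor();
    # keyword arguments still need functools.partial.
    import asyncio
    import functools

    def download(url, timeout=10):
        return (url, timeout)   # stand-in for blocking I/O

    async def demo():
        loop = asyncio.get_event_loop()
        r1 = await loop.run_in_executor(None, download, "http://example.com")
        r2 = await loop.run_in_executor(
            None, functools.partial(download, "http://example.com", timeout=5))
        return r1, r2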
msg246408 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2015-07-07 13:01
FWIW:

> The concerns I have with "get_awaitable" are [snip]

I agree with you.

> I've changed the suggestion function name to "call_async"

I disagree. "async" is an extremely overloaded term with no unambiguous meaning (but possible misinterpretations), especially now that Python 3.5 has an "async" keyword (or quasi-keyword? I don't remember :-)).

"call_in_executor" I think was fine. "call_in_thread" would be as well (and would mimick Twisted's own callInThread).
msg246409 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2015-07-07 13:03
Oh, and yes, it's not obvious this is needed at all :-)
msg246411 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2015-07-07 13:07
*If* it needs to be added I like call_in_thread(). (We should have used that instead of run_in_executor() on the loop, but too late now.)
msg246414 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2015-07-07 13:29
As a minor note, we didn't use call_in_thread on the event loop because it may be "call_in_process". I've also merged issue 24578 back into this one as Guido suggested.

I think the example Sven gave on the mailing list provides a better rationale here than my original one of safely making blocking calls from an asyncio coroutine (as I agree if you're to the point of writing your own coroutines, dispatching a blocking call to the executor shouldn't be a big deal).

Sven's scenario instead relates to the case of an application which is still primarily synchronous, but has occasional operations that it would be convenient to factor out as parallel operations without needing to significantly refactor the code. For example:

    def load_and_process_data():
        data1 = load_remote_data_set1()
        data2 = load_remote_data_set2()
        return process_data(data1, data2)

Trying yet another colour for the bikeshed (before I change the issue title again), imagine if that could be written:

    def load_and_process_data():
        future1 = asyncio.background_call(load_remote_data_set1)
        future2 = asyncio.background_call(load_remote_data_set2)
        data1 = asyncio.wait_for_result(future1)
        data2 = asyncio.wait_for_result(future2)
        return process_data(data1, data2)

The operating model of interest here would be the one where you're still writing a primarily synchronous (and perhaps even single-threaded) application (rather than an event-driven one), but you'd like to occasionally do some background processing while the foreground thread continues with other tasks.
msg246416 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2015-07-07 15:14
But that example also shows what's wrong with the idea. I presume load_remote_data_set1 and load_remote_data_set2 are themselves just using synchronous I/O, and the parallelization is done using threads. So why not use concurrent.futures? Why bother with asyncio at all?
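
For comparison, here is a sketch of that example written with concurrent.futures alone (using the illustrative function names from the example above):

    # The same parallel loads without an event loop, using a thread pool directly.
    # load_remote_data_set1/2 and process_data are as in the example above.
    from concurrent.futures import ThreadPoolExecutor

    def load_and_process_data():
        with ThreadPoolExecutor() as pool:
            future1 = pool.submit(load_remote_data_set1)
            future2 = pool.submit(load_remote_data_set2)
            data1 = future1.result()   # blocks until the first load finishes
            data2 = future2.result()
        return process_data(data1, data2)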
msg246420 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-07-07 15:36
> *If* it needs to be added I like call_in_thread().

Unless you explicitly set the executor to a thread pool executor, the function name can be a lie: it's possible to set the default executor to a process pool executor.
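
For example, a sketch of that swap:

    # After this, run_in_executor(None, ...) dispatches to processes, not threads.
    import asyncio
    from concurrent.futures import ProcessPoolExecutor

    loop = asyncio.get_event_loop()
    loop.set_default_executor(ProcessPoolExecutor())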

I vote -1 for the function. I would prefer to use the event loop methods (run_in_executor) directly.

    # Call blocking operation from asynchronous code
    def blocking_call(f, *args, **kwds):
        ...

This function name is very misleading. Calling blocking_call() doesn't return the result of f(*args, **kwds) but a Future object...

> The aim would thus be to let folks safely make blocking calls from asyncio code without needing to first understand the intricacies of the event loop, the event loop's executor, or the need to wrap the call in functools.partial.

I don't like the idea of hiding asyncio complexity. See eventlet monkey patching, it was a big mistake.

But it's ok to add helpers when they are relevant.

I also fear adding too many functions to do the same things.

For example, scheduling the execution of a coroutine can now be done by:

* asyncio.async(coro)
* asyncio.Task(coro)
* loop.create_task(coro)
* asyncio.ensure_future(coro)
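
A quick sketch of three of those spellings side by side (the coroutine is only illustrative; asyncio.async() is omitted):

    # Three equivalent ways to schedule the same coroutine.
    import asyncio

    async def work():
        return 42

    loop = asyncio.get_event_loop()
    t1 = asyncio.ensure_future(work())
    t2 = asyncio.Task(work())
    t3 = loop.create_task(work())
    print(loop.run_until_complete(asyncio.gather(t1, t2, t3)))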
msg246431 - (view) Author: Sven R. Kunze (srkunze) Date: 2015-07-07 17:28
> Why bother with asyncio at all?

Good question. My initial reaction to async+await was: 'great, finally a Pythonic (i.e. a single, explicit) way to squeeze more out of our servers'. Moreover, the goal of 'being more like classic code' + 'having reasonable tracebacks' reads like 'nice, ready for production code'.

After reading the documentation, I was slightly confused by coroutines, tasks, and futures, which somehow feel very similar. But because of the introduction of 'await', I thought we would not need to bother with that at all.


Then, people started to tell me that asyncio and normal execution could never interact with each other.


I cannot believe that. Python always gave me convenient tools. I cannot believe it should be different this time.


I can use properties the same way as attributes, i.e. I can substitute them with each other seamlessly. I can call class methods the same way as instance methods, i.e. I can substitute them with each other seamlessly. I can call functions that raise exceptions the same way as functions that raise no exceptions, i.e. I can substitute them with each other seamlessly.

It is just perfect. Comparing the projects I am involved in, those using Python proceed at a much greater pace just because of these convenient tools.



So, I wanted to leverage asyncio's power without touching millions of lines of code as well. If asyncio is not the right tool, that is fine with me, but then I do not get why threading does not have its own syntax (and the goals described above) whereas asyncio does. To me, I could accomplish more or less the same with both tools.
msg246432 - (view) Author: Sven R. Kunze (srkunze) Date: 2015-07-07 17:32
> I also fear adding too many functions to do the same things.
> 
> For example, scheduling the execution of a coroutine can now be done by:

> * asyncio.async(coro)
> * asyncio.Task(coro)
> * loop.create_task(coro)
> * asyncio.ensure_future(coro)

If you ask me, that does not look very Pythonic.
msg246434 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2015-07-07 17:39
"finally a Pythonic (i.e. a single, explicit) way to do squeeze out more of our servers'"

I think that you don't understand the purpose of asyncio, then, since squeezing more out of servers is, to my understanding, neither a goal nor something it does (except in the mostly trivial sense that you use fewer threads).

What it does is to make writing correct multitasking code easier.  If you aren't writing non-trivial multitasking code, it doesn't buy you anything.
msg246435 - (view) Author: Sven R. Kunze (srkunze) Date: 2015-07-07 17:59
@David
What is the purpose of multitasking code?
msg246480 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2015-07-09 08:56
The problems with using concurrent.futures directly for running synchronous tasks in the background are:

1. You have to manage the lifecycle of the executor yourself, rather than letting asyncio do it for you
2. There's no easy process wide way to modify the size of the background task thread pool (or switch to using processes instead)
3. There's no easy way for background tasks themselves to use asynchronous IO

With the switch to "background_call" as the name, I'd modify the implementation to detect coroutines and schedule them as tasks rather than running them in the executor.
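
A rough sketch of what that detection might look like (still a hypothetical helper, not a committed design):

    # Hypothetical background_call: coroutine functions become tasks,
    # plain callables are dispatched to the default executor.
    import asyncio
    import functools

    def background_call(f, *args, **kwds):
        loop = asyncio.get_event_loop()
        if asyncio.iscoroutinefunction(f):
            return loop.create_task(f(*args, **kwds))
        return loop.run_in_executor(None, functools.partial(f, *args, **kwds))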

However, I think it's clear that the idea and its potential benefits are sufficiently unclear that making the case effectively may require a PEP. That's probably worth doing anyway in order to thrash out more precise semantics.
msg246486 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2015-07-09 12:36
> 1. You have to manage the lifecycle of the executor yourself, rather than letting asyncio do it for you
> 2. There's no easy process wide way to modify the size of the background task thread pool (or switch to using processes instead)

But if that's what you want, adding a helper or helpers to concurrent.futures makes more sense than adding it to asyncio, which is primarily about using an event loop, *not* threads.

> 3. There's no easy way for background tasks themselves to use asynchronous IO

But how does your proposal help for that? The function passed to background_call() is in no way enabled to do async I/O -- it has no event loop and it is not a coroutine, and it's running in a separate thread.

> With the switch to "background_call" as the name, I'd modify the implementation to detect coroutines and schedule them as tasks rather than running them in the executor.

Honestly, I think that convenience routines that fuzz the difference between synchronous functions (to be run in a thread) and coroutines don't do anyone a service -- an API should educate its users about proper use and the right concepts, and this sounds like it is encouraging staying ignorant.

> However, I think it's clear that the idea and its potential benefits are sufficiently unclear that making the case effectively may require a PEP. That's probably worth doing anyway in order to thrash out more precise semantics.

Or you could just give up. Honestly, I am liking this less and less the more you defend it. That's a classic sign that you should give up. :-)
msg246490 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2015-07-09 13:21
I'll at least write a new python-ideas post, as I realised my original idea *is* wrong (and you're right not to like it). The focus needs to be on Sven's original question (How do you kick off a coroutine from otherwise synchronous code, and then later wait for the result?), and then asking whether or not it might make sense to provide a convenience API for such an interface between the worlds of imperative programming and event-driven programming.

Sven's far from the only one confused by that particular boundary, so even if a convenience API doesn't make it into the module itself, an example in the docs explaining how to implement it may be helpful.
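
One possible shape of such a bridge, assuming the event loop runs in a dedicated helper thread (a sketch only, not a settled recipe; fetch is an illustrative coroutine):

    # Run an event loop in a helper thread, kick off coroutines from
    # synchronous code, and collect the results later.
    import asyncio
    import threading

    async def fetch(x):
        await asyncio.sleep(0.1)
        return x * 2

    loop = asyncio.new_event_loop()
    threading.Thread(target=loop.run_forever, daemon=True).start()

    # Synchronous code: start the work now...
    fut1 = asyncio.run_coroutine_threadsafe(fetch(1), loop)
    fut2 = asyncio.run_coroutine_threadsafe(fetch(2), loop)
    # ...do other things, then block for the results (concurrent.futures.Future).
    print(fut1.result(), fut2.result())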
msg246497 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2015-07-09 13:48
Yeah, we should strongly consider writing more documentation before adding more convenience APIs. Esp. tutorial-style docs, which neither Victor nor I can supply because we've already moved beyond wizard level ourselves so it's hard for us to imagine the beginner's perspective. :-(
msg246518 - (view) Author: Sven R. Kunze (srkunze) Date: 2015-07-09 18:51
> ... this sounds like it is encouraging staying ignorant.

True. However, being ignorant about complexity is eventually what led to the development of high-level languages like Python. Each time, the next generation simply asks the question: 'does it really need to be that complicated?' And each time, there is a solution. We will get there.

I have to admit, I do not stay ignorant because of convenience APIs but because I feel things are made overly complicated.


I am not sure if we are still talking about asyncio. I would say everything in Python regarding concurrency/parallelism needs to be put into perspective:

Modules that I know MIGHT be interesting for me:
 - concurrent
 - threading
 - asyncio
 - multiprocessing

But I have no idea why/when to use which one.

AND more importantly, statements like "This class is almost compatible with concurrent.futures.Future." (https://docs.python.org/3/library/asyncio-task.html#asyncio.Future) do not help. If they are that compatible, why do we need both, and when do I need which one? Or is this just another internal implementation detail I can really be ignorant of?


From what I can tell right now (I have read deeper into the topic, but always correct me if I am wrong), my perspective on the modules is now:


API of your application
       ^
       ^
1) either asynchronous/event loop/asyncio
2) or     synchronous/single event = start of program
       ^
       ^
the logic of the application
       ^
       ^
usage of other components
       ^
       ^
1) either 1 thread/imperative/line by line
2) or multithread/concurrent/parallel
3) or multiprocess/concurrent/parallel



My understanding is that asyncio is a way to implement the API of your application whereas concurrent/threading/multiprocessing provide means for more efficient execution of the underlying logic.


However, that cannot be entirely true, as I have already seen modules using asyncio to communicate asynchronously with databases (to be honest, that is what its name suggests: async IO).

So, what now? It seems we can use asyncio for communication with other components as well, which matches my intuition. That is why I have trouble understanding why it is considered wrong to do the same with 'normal' functions (to me, they are just other components).


AND it can also be the other way round: using concurrent/threading/multiprocessing for implementing the API of your application.
msg246521 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2015-07-09 19:59
Please move the philosophical discussion to python-ideas.

Regarding the phrasing about the two Future classes being almost compatible, that is unfortunate wording. Two things can have a similar API (merely having the same methods etc.) without being compatible (not having the same behavior or semantics).
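
To make that concrete, a small illustrative sketch of the behavioural difference between the two Future classes:

    # concurrent.futures.Future.result() blocks the calling thread until the
    # result arrives (or a timeout expires); asyncio.Future.result() raises
    # InvalidStateError if the result is not ready yet -- it is meant to be
    # awaited from inside a coroutine instead.
    import asyncio
    import concurrent.futures

    cf = concurrent.futures.Future()
    af = asyncio.Future()
    # cf.result(timeout=1)   # blocks up to 1 second, then raises TimeoutError
    # af.result()            # raises asyncio.InvalidStateError immediately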
History
Date                 User             Action  Args
2022-04-11 14:58:18  admin            set     github: 68759
2015-07-09 19:59:37  gvanrossum       set     status: open -> closed; resolution: wont fix; messages: + msg246521
2015-07-09 18:51:28  srkunze          set     messages: + msg246518
2015-07-09 13:48:17  gvanrossum       set     messages: + msg246497
2015-07-09 13:21:23  ncoghlan         set     messages: + msg246490
2015-07-09 12:36:36  gvanrossum       set     messages: + msg246486
2015-07-09 08:57:01  ncoghlan         set     title: [RFE] Add asyncio.call_async API -> [RFE] Add asyncio.background_call API; messages: + msg246480; versions: - Python 3.5
2015-07-07 17:59:15  srkunze          set     messages: + msg246435
2015-07-07 17:39:12  r.david.murray   set     nosy: + r.david.murray; messages: + msg246434
2015-07-07 17:32:48  srkunze          set     messages: + msg246432
2015-07-07 17:28:56  srkunze          set     messages: + msg246431
2015-07-07 15:36:17  vstinner         set     messages: + msg246420
2015-07-07 15:14:52  gvanrossum       set     messages: + msg246416
2015-07-07 13:29:24  ncoghlan         set     messages: + msg246414
2015-07-07 13:13:10  ncoghlan         link    issue24578 superseder
2015-07-07 13:07:06  gvanrossum       set     messages: + msg246411
2015-07-07 13:03:27  pitrou           set     messages: + msg246409
2015-07-07 13:01:50  pitrou           set     messages: + msg246408
2015-07-07 12:53:40  gvanrossum       set     messages: + msg246407
2015-07-07 12:31:42  ncoghlan         set     messages: + msg246405; title: [RFE] Add asyncio.call_in_executor API -> [RFE] Add asyncio.call_async API
2015-07-07 12:23:59  ncoghlan         set     messages: + msg246404; title: [RFE] Add asyncio.blocking_call API -> [RFE] Add asyncio.call_in_executor API
2015-07-06 18:13:09  srkunze          set     messages: + msg246375
2015-07-06 18:05:33  srkunze          set     nosy: + srkunze; messages: + msg246374; components: + asyncio
2015-07-06 01:36:34  ncoghlan         create