Message 355804 - Python tracker

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author	aeros
Recipients	aeros, asvetlov, primal, yselivanov
Date	2019-11-01.11:09:01
SpamBayes Score	-1.0
Marked as misclassified	Yes
Message-id	<1572606541.89.0.153583932708.issue32309@roundup.psfhosted.org>
In-reply-to

Content
> I don't think changing the default executor is a good approach. What happens, if two or more thread pools are running at the same time? In that case they will use the same default executor anyway, so creating a new executor each time seems like a waste. I agree that it would be better to have ThreadPool use an internal executor rather than relying on the event loop's default executor. The main reason I hadn't was because we hadn't implemented an asynchronous executor shutdown outside of loop.shutdown_default_executor(), but we could potentially move the functionality to a private function (in Lib/asyncio/base_events.py) so it's reusable for ThreadPool. It could be something like this: async def _shutdown_executor(executor, loop): future = loop.create_future() thread = threading.Thread(target=loop._do_shutdown, args=(executor,future)) thread.start() try: await future finally: thread.join() def _do_shutdown(loop, executor, future): try: executor.shutdown(wait=True) loop.call_soon_threadsafe(future.set_result, None) except Exception as ex: loop.call_soon_threadsafe(future.set_exception, ex) Note that the above would be for internal use only, for the existing loop.shutdown_default_executor() and the new asyncio.ThreadPool. For it to support both, it would have to accept an explicit loop argument. It also does not need a robust API, since it's private. > Shutting down the default executor seems unnecessary and could impact lower level code which is using it. The default executor is shutdown at the end of asyncio.run anyway. I agree with your point regarding the shutdown of the default executor one. But I think we should shutdown the internal executor for the ThreadPool, as a main point context managers is to start and clean up their own resources. Also, I'm aware that asyncio.run() shuts down the default executor, I implemented that fairly recently in GH-15735. ;) Another substantial concern is in the case of a coroutine that contains asyncio.ThreadPool being executed without asyncio.run(). There are still use cases for using the lower level loop.run_until_complete() for more complex asyncio programs. I don't think we should make asyncio.ThreadPool dependent on the coroutine being executed with asyncio.run(). Thus, it makes sense that ThreadPool should create a new instance of ThreadPoolExecutor upon entry of the context manager and then shut it down upon exit. > I'm not sure if a new ThreadPoolExecutor really needs to be created for each ThreadPool though. IMO, a context manager should create and then finalize it's own resources, rather than sharing the same executor across contexts. Sharing the same one seems to defeat the purpose of using a context manager in the first place, no? > Run method should be: async def run(self, func, args, kwargs): call = functools.partial(func, args, **kwargs) return await self._loop.run_in_executor(None, call) Correction: if we're using an internal executor now, this should instead be self._loop.run_in_executor(self._executor, call). With `None`, it will simply use the event loop's default executor rather the context manager's ThreadPoolExecutor. > `run` should be awaitable method, see #38430 Agreed, good point. > Paul's version looks better. I think he had some good points, particularly around using an internal executor instead the event loop's default executor; but there's some parts that I disagree with, see above reasons. > 1. get_running_loop() should be used instead of get_event_loop() Note: If get_running_loop() is used instead, it has to set self._loop within a coroutine (since get_running_loop() can only be used within coroutines), that's why in my version it was within __aenter__. I think this would make the most sense. > 2. There is no `await executer.shutdown()` API, the method is synchronous. That's why in my version I was using the existing event loop API, since we had already implemented an asynchronous loop.shutdown_default_executor() method that calls executor.shutdown(). However, if we added a private _shutdown_executor() and _do_shutdown() as I mentioned above, that wouldn't be an issue. Thanks for the feedback on the prototype Paul and Andrew, both of you brought up some good points. I'll start working on a PR.

> I don't think changing the default executor is a good approach. What happens, if two or more thread pools are running at the same time? In that case they will use the same default executor anyway, so creating a new executor each time seems like a waste. 

I agree that it would be better to have ThreadPool use an internal executor rather than relying on the event loop's default executor. The main reason I hadn't was because we hadn't implemented an asynchronous executor shutdown outside of loop.shutdown_default_executor(), but we could potentially move the functionality to a private function (in Lib/asyncio/base_events.py) so it's reusable for ThreadPool. It could be something like this:

    async def _shutdown_executor(executor, loop):
        future = loop.create_future()
        thread = threading.Thread(target=loop._do_shutdown, args=(executor,future))
        thread.start()
        try:
            await future
        finally:
            thread.join()

    def _do_shutdown(loop, executor, future):
        try:
            executor.shutdown(wait=True)
            loop.call_soon_threadsafe(future.set_result, None)
        except Exception as ex:
            loop.call_soon_threadsafe(future.set_exception, ex)

Note that the above would be for internal use only, for the existing loop.shutdown_default_executor() and the new asyncio.ThreadPool. For it to support both, it would have to accept an explicit loop argument. It also does not need a robust API, since it's private. 

> Shutting down the default executor seems unnecessary and could impact lower level code which is using it. The default executor is shutdown at the end of asyncio.run anyway.

I agree with your point regarding the shutdown of the default executor one. But I think we should shutdown the internal executor for the ThreadPool, as a main point context managers is to start and clean up their own resources. Also, I'm aware that asyncio.run() shuts down the default executor, I implemented that fairly recently in GH-15735. ;)

Another substantial concern is in the case of a coroutine that contains asyncio.ThreadPool being executed without asyncio.run(). There are still use cases for using the lower level loop.run_until_complete() for more complex asyncio programs. I don't think we should make asyncio.ThreadPool dependent on the coroutine being executed with asyncio.run(). Thus, it makes sense that ThreadPool should create a new instance of ThreadPoolExecutor upon entry of the context manager and then shut it down upon exit. 

> I'm not sure if a new ThreadPoolExecutor really needs to be created for each ThreadPool though.

IMO, a context manager should create and then finalize it's own resources, rather than sharing the same executor across contexts. Sharing the same one seems to defeat the purpose of using a context manager in the first place, no?

> Run method should be:

    async def run(self, func, *args, **kwargs):
        call = functools.partial(func, *args, **kwargs)
        return await self._loop.run_in_executor(None, call)

Correction: if we're using an internal executor now, this should instead be self._loop.run_in_executor(self._executor, call). With `None`, it will simply use the event loop's default executor rather the context manager's ThreadPoolExecutor.

> `run` should be awaitable method, see #38430

Agreed, good point. 

> Paul's version looks better.

I think he had some good points, particularly around using an internal executor instead the event loop's default executor; but there's some parts that I disagree with, see above reasons.

> 1. get_running_loop() should be used instead of get_event_loop()

Note: If get_running_loop() is used instead, it has to set self._loop within a coroutine (since get_running_loop() can only be used within coroutines), that's why in my version it was within __aenter__. I think this would make the most sense. 

> 2. There is no `await executer.shutdown()` API, the method is synchronous.

That's why in my version I was using the existing event loop API, since we had already implemented an asynchronous loop.shutdown_default_executor() method that calls executor.shutdown(). However, if we added a private _shutdown_executor() and _do_shutdown() as I mentioned above, that wouldn't be an issue. 

Thanks for the feedback on the prototype Paul and Andrew, both of you brought up some good points. I'll start working on a PR.

History
Date	User	Action	Args
2019-11-01 11:09:01	aeros	set	recipients: + aeros, asvetlov, yselivanov, primal
2019-11-01 11:09:01	aeros	set	messageid: <1572606541.89.0.153583932708.issue32309@roundup.psfhosted.org>
2019-11-01 11:09:01	aeros	link	issue32309 messages
2019-11-01 11:09:01	aeros	create