Celery 5 is going async. To isolate the main event loop from task execution, tasks will be executed in a separate thread with its own event loop.

The task thread may or may not be CPU bound.
The main thread is I/O bound.
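
To make the layout concrete, here is a minimal sketch of that arrangement using only the standard library. It is not Celery code: start_worker_loop, sample_task and the "task-loop" thread name are hypothetical names made up for illustration, and the real design may well differ.

```python
import asyncio
import threading


def start_worker_loop():
    """Start a daemon thread that runs its own event loop (hypothetical helper)."""
    loop = asyncio.new_event_loop()

    def run():
        # Bind the new loop to this thread and run it until stop() is requested.
        asyncio.set_event_loop(loop)
        loop.run_forever()

    thread = threading.Thread(target=run, name="task-loop", daemon=True)
    thread.start()
    return loop, thread


async def sample_task(n):
    # Hypothetical task body; a real task may be CPU bound or I/O bound.
    await asyncio.sleep(0)
    return sum(i * i for i in range(n)), threading.current_thread().name


async def main():
    # The main (I/O bound) loop runs here; tasks are handed to the other thread.
    worker_loop, thread = start_worker_loop()
    try:
        future = asyncio.run_coroutine_threadsafe(sample_task(10), worker_loop)
        # wrap_future lets the main loop await the result without blocking.
        result, thread_name = await asyncio.wrap_future(future)
    finally:
        # Ask the worker loop to stop, then wait for its thread to exit.
        # (join() blocks the main loop briefly; acceptable for a sketch.)
        worker_loop.call_soon_threadsafe(worker_loop.stop)
        thread.join()
        worker_loop.close()
    return result, thread_name
```

Calling asyncio.run(main()) runs the task on the "task-loop" thread while the main loop only awaits the result. With a CPU bound task body, the two threads contend for the GIL, which is why the scheduling behaviour matters for this setup.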

This patch should help a lot.

I like Nir's approach a lot (although I haven't looked into the patch itself yet). It's pretty novel.
David's patch is also very interesting.

I'm willing to help.
