classification
Title: asyncio: support multiprocessing (support fork)
Type: behavior Stage: patch review
Components: asyncio Versions: Python 3.6, Python 3.5, Python 3.4
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: dan.oreilly, gvanrossum, miss-islington, pitrou, thehesiod, yselivanov, zmedico
Priority: normal Keywords: patch

Created on 2014-07-26 18:01 by dan.oreilly, last changed 2018-05-30 00:56 by yselivanov.

Files
File name Uploaded Description Edit
test_loop.py dan.oreilly, 2014-07-26 18:01 Test script demonstrating the issue
handle_mp_unix.diff dan.oreilly, 2014-07-26 18:20 Patch that makes _UnixDefaultEventLoopPolicy create a new loop object if get_event_loop is called in a forked mp child process review
handle-mp_unix2.patch dan.oreilly, 2014-07-26 20:13 Use os.getpid() instead of multiprocessing. Store pid state in Policy instance rather than the Loop instance. review
handle_mp_unix_with_test.diff dan.oreilly, 2014-07-27 16:09 Adds a unit test to previous patch review
Pull Requests
URL Status Linked Edit
PR 7208 merged yselivanov, 2018-05-29 16:08
PR 7215 merged miss-islington, 2018-05-29 19:39
PR 7218 closed yselivanov, 2018-05-29 20:08
PR 7226 closed yselivanov, 2018-05-29 22:02
PR 7232 merged yselivanov, 2018-05-29 23:35
PR 7233 merged yselivanov, 2018-05-29 23:36
Messages (23)
msg224082 - (view) Author: Dan O'Reilly (dan.oreilly) * Date: 2014-07-26 18:01
On non-Windows platforms, if a user attempts to use asyncio.get_event_loop() in a child process created by multiprocessing.Process using the fork context, and an asyncio event loop is also being used in the main process, the same _UnixSelectorEventLoop object will be used by both processes. This, of course, won't work properly; the child will raise a "RuntimeError: Event loop is running" exception as soon as it tries using the loop object.

However, this may or may not actually make it back to the parent: If the parent is expecting to get items from a queue from that child publishes to, rather than yielding from it immediately, the program will deadlock. Even if the child is yielded from, it may not be immediately obvious why "Event loop is running" was raised, and the behavior is inconsistent with the behavior if a method other than os.fork is used to create the child process, since the child will get a new event loop in that case.

So, it'd be better if _UnixDefaultEventLoopPolicy detected that get_event_loop was being called in a child process, and either 

1) Created a new loop for the child (this would make the behavior appear consistent no matter what platform/method for launching children is used)
2) Raised an exception stating that no default event loop exists for this process, similar to the assert used for threads currently.

I've attached a test script that demonstrates the different between forked/spawned processes, and a patch that implements #1 above.
msg224084 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2014-07-26 19:55
Good point.  Asyncio definitely should not share event loops across forked processes.  However, I don't like the dependency on multiprocessing (even though it's in the stdlib) -- can't the policy just use os.getpid()?

Also, I've got a feeling that maybe the pid should be part of the policy state instead of the loop state?  The policy could just reset self._local when the pid doesn't match.
msg224085 - (view) Author: Dan O'Reilly (dan.oreilly) * Date: 2014-07-26 20:13
Yep, agreed on both points. The latter suggestion also has the benefit of not requiring any test changes. Here's an updated patch.
msg224097 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2014-07-27 03:12
I think there should still be a new unittest -- we're adding a behavior we're promising, so we should test it.
msg224125 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-07-27 12:31
See aslo issue #21998: "asyncio: a new self-pipe should be created in the child process after fork".
msg224140 - (view) Author: Dan O'Reilly (dan.oreilly) * Date: 2014-07-27 16:09
I've added a unit test that spawns a new forked process via multiprocessing, and verifies that the loop returned by get_event_loop is not the same as the one we have in the parent.
msg224143 - (view) Author: Dan O'Reilly (dan.oreilly) * Date: 2014-07-27 16:21
re: #21998, perhaps it's time to revive #16500? Without that, I'm not sure what can be done aside from documenting the need to call "loop = asyncio.get_event_loop()" in the child immediately after forking.
msg224144 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-07-27 16:45
A simple pid check in the policy should be enough.
msg224145 - (view) Author: Dan O'Reilly (dan.oreilly) * Date: 2014-07-27 16:51
Hmm, I'm not sure what you mean. What check in the policy would prevent this issue you described in #21998?:

import asyncio, os
loop = asyncio.get_event_loop()
pid = os.fork()
if pid:
    print("parent", loop._csock.fileno(), loop._ssock.fileno())
else:
    print("child", loop._csock.fileno(), loop._ssock.fileno())

Output:
---
parent 6 5
child 6 5
msg226698 - (view) Author: Dan O'Reilly (dan.oreilly) * Date: 2014-09-10 15:13
Are any other changes needed here? I'm still not completely clear on what Victor meant with his last comment.
msg235404 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-02-04 21:06
This issue looks to be a duplicate of #21998.

handle-mp_unix2.patch looks more to a workaround than a real issue. When I write asyncio code, I prefer to pass explicitly the loop, so get_event_loop() should never be called. IMO the methods of the event loop should detect the fork and handle the fork directly.
msg235411 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-02-04 22:58
See also the https://pypi.python.org/pypi/mpworker project
msg288327 - (view) Author: Alexander Mohr (thehesiod) * Date: 2017-02-22 02:26
I believe this is now worse due to https://github.com/python/asyncio/pull/452

before I was able to simply create a new run loop from sub-processes however you will now get the error "Cannot run the event loop while another loop is running".  The state of the run loop should not be preserved in sub-processes either.
msg297222 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2017-06-28 20:20
The approach in Dan's patches looks sane to me and I agree the issue needs fixing.  Dan, would you like to submit a PR for this?  We're using Github for patch submissions now.
msg297226 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2017-06-28 21:05
> The approach in Dan's patches looks sane to me and I agree the issue needs fixing.  Dan, would you like to submit a PR for this?  We're using Github for patch submissions now.


Mind that the patch is very outdated and we already have these checks in get_event_loop.
msg297227 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2017-06-28 21:09
It seems there's a check in the top-level get_event_loop() function but not in the DefaultEventLoopPolicy.get_event_loop() method.

(also: wow, that stuff is complicated)
msg297229 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2017-06-28 21:13
> It seems there's a check in the top-level get_event_loop() function but not in the DefaultEventLoopPolicy.get_event_loop() method.

Yes, as users aren't supposed to work with policies directly.

> (also: wow, that stuff is complicated)

Yep. Please assign me to review if you submit a PR.
msg318077 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2018-05-29 19:38
New changeset 5d97b7bcc19496617bf8c448d2f149cc28c73bc7 by Yury Selivanov in branch 'master':
bpo-22087: Fix Policy.get_event_loop() to detect fork (GH-7208)
https://github.com/python/cpython/commit/5d97b7bcc19496617bf8c448d2f149cc28c73bc7
msg318092 - (view) Author: miss-islington (miss-islington) Date: 2018-05-29 21:09
New changeset 2a7eb0b531656f4a77d85078e6e009e4b3639ef9 by Miss Islington (bot) in branch '3.7':
bpo-22087: Fix Policy.get_event_loop() to detect fork (GH-7208)
https://github.com/python/cpython/commit/2a7eb0b531656f4a77d85078e6e009e4b3639ef9
msg318135 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2018-05-29 22:46
another related issue: https://bugs.python.org/issue33688
msg318140 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2018-05-29 23:43
Even though I committed a version of Dan's patch to 3.7 and 3.8, I've finally decided to revert it and do it properly instead.  We should strive to implement a proper solution, not commit some half-working code.

A concrete plan (for Python 3.8 probably):

1. Fix BaseDefaultEventLoopPolicy to track PID in its 'get_event_loop()' and 'set_event_loop()' methods.  If a PID has changed since the last invocation: reset its internal local state.

2. Fix _UnixDefaultEventLoopPolicy to check for PID change in 'get_child_watcher()' and 'set_child_watcher()'.

3. Fix child watcher / event loops to track PID changes too to avoid listening for child processes of their parent process.

4. Look at how libuv and other event loops implement fork support. Ideally we should be able to shutdown selectors (epoll, kqueue) in forked processes in such a way that their parent process isn't affected.

5. Think how we can make 'asyncio.run' fork friendly. Ideally it should initialize its own child watcher and remove it when its done.
msg318143 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2018-05-30 00:47
New changeset 99279ad823a758288e4e41962abfc4dad8943ce8 by Yury Selivanov in branch 'master':
Revert "bpo-22087: Fix Policy.get_event_loop() to detect fork (GH-7208)" (#7232)
https://github.com/python/cpython/commit/99279ad823a758288e4e41962abfc4dad8943ce8
msg318144 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2018-05-30 00:56
New changeset af9cda9845666e2f704177a431d29f91efbf828a by Yury Selivanov in branch '3.7':
Revert "bpo-22087: Fix Policy.get_event_loop() to detect fork (GH-7208)" (GH-7233)
https://github.com/python/cpython/commit/af9cda9845666e2f704177a431d29f91efbf828a
History
Date User Action Args
2018-05-30 00:56:36yselivanovsetmessages: + msg318144
2018-05-30 00:47:59yselivanovsetmessages: + msg318143
2018-05-29 23:43:15yselivanovsetmessages: + msg318140
2018-05-29 23:36:43yselivanovsetpull_requests: + pull_request6865
2018-05-29 23:35:07yselivanovsetpull_requests: + pull_request6864
2018-05-29 22:46:58yselivanovsetmessages: + msg318135
2018-05-29 22:23:08vstinnersetnosy: - vstinner
2018-05-29 22:02:47yselivanovsetpull_requests: + pull_request6858
2018-05-29 21:09:24miss-islingtonsetnosy: + miss-islington
messages: + msg318092
2018-05-29 20:08:45yselivanovsetpull_requests: + pull_request6851
2018-05-29 19:39:27miss-islingtonsetpull_requests: + pull_request6847
2018-05-29 19:38:10yselivanovsetmessages: + msg318077
2018-05-29 16:08:28yselivanovsetstage: needs patch -> patch review
pull_requests: + pull_request6843
2018-03-11 10:38:30zmedicosetnosy: + zmedico
2017-06-28 21:13:11yselivanovsetmessages: + msg297229
2017-06-28 21:09:48pitrousetmessages: + msg297227
2017-06-28 21:05:30yselivanovsetmessages: + msg297226
2017-06-28 20:20:51pitrousetnosy: + pitrou

messages: + msg297222
stage: needs patch
2017-02-22 02:26:04thehesiodsetnosy: + thehesiod

messages: + msg288327
versions: + Python 3.6
2015-02-15 20:19:43vstinnersettitle: asyncio: support multiprocessing (support fork=) -> asyncio: support multiprocessing (support fork)
2015-02-15 20:19:39vstinnersettitle: asyncio: support multiprocessing -> asyncio: support multiprocessing (support fork=)
2015-02-04 22:58:05vstinnersetmessages: + msg235411
2015-02-04 22:19:32vstinnersettitle: _UnixDefaultEventLoopPolicy should either create a new loop or explicilty fail when get_event_loop() is called from a multiprocessing child process -> asyncio: support multiprocessing
2015-02-04 21:06:34vstinnersetmessages: + msg235404
2014-09-10 15:13:47dan.oreillysetmessages: + msg226698
2014-07-27 16:51:39dan.oreillysetmessages: + msg224145
2014-07-27 16:45:55vstinnersetmessages: + msg224144
2014-07-27 16:21:36dan.oreillysetmessages: + msg224143
2014-07-27 16:09:52dan.oreillysetfiles: + handle_mp_unix_with_test.diff

messages: + msg224140
2014-07-27 12:31:05vstinnersetmessages: + msg224125
2014-07-27 03:12:58gvanrossumsetmessages: + msg224097
2014-07-26 20:13:57dan.oreillysetfiles: + handle-mp_unix2.patch

messages: + msg224085
2014-07-26 19:55:30gvanrossumsetmessages: + msg224084
2014-07-26 18:20:15dan.oreillysetfiles: + handle_mp_unix.diff
2014-07-26 18:18:37dan.oreillysetfiles: - handle_mp_unix.diff
2014-07-26 18:04:03dan.oreillysettitle: _UnixDefaultEventLoop policy should either create a new loop or explicilty fail when get_event_loop() is called from a multiprocessing child process -> _UnixDefaultEventLoopPolicy should either create a new loop or explicilty fail when get_event_loop() is called from a multiprocessing child process
2014-07-26 18:02:47dan.oreillysettype: behavior
2014-07-26 18:01:38dan.oreillysetfiles: + test_loop.py
2014-07-26 18:01:10dan.oreillycreate