
asyncio: support fork #66197

Closed
vstinner opened this issue Jul 17, 2014 · 59 comments

@vstinner
Member

BPO 21998
Nosy @pitrou, @vstinner, @tiran, @1st1, @Martiusweb
Files
  • test2.py
  • close_self_pipe_after_selector.patch
  • at_fork.patch
  • at_fork-2.patch
  • at_fork-3.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = 'https://github.com/1st1'
    closed_at = <Date 2020-02-10.23:51:31.002>
    created_at = <Date 2014-07-17.16:10:45.820>
    labels = ['expert-asyncio']
    title = 'asyncio: support fork'
    updated_at = <Date 2020-02-10.23:51:31.001>
    user = 'https://github.com/vstinner'

    bugs.python.org fields:

    activity = <Date 2020-02-10.23:51:31.001>
    actor = 'vstinner'
    assignee = 'yselivanov'
    closed = True
    closed_date = <Date 2020-02-10.23:51:31.002>
    closer = 'vstinner'
    components = ['asyncio']
    creation = <Date 2014-07-17.16:10:45.820>
    creator = 'vstinner'
    dependencies = []
    files = ['37306', '37385', '38016', '38019', '38164']
    hgrepos = []
    issue_num = 21998
    keywords = ['patch']
    message_count = 56.0
    messages = ['223344', '223431', '225973', '226861', '226889', '226892', '226899', '226932', '231819', '231845', '231919', '231921', '231941', '232301', '232308', '235409', '235410', '235429', '236065', '236131', '236134', '236136', '236145', '236149', '236150', '236151', '244111', '244112', '244113', '244114', '244116', '244117', '244118', '244119', '244120', '244121', '244123', '244130', '244135', '249722', '249724', '249732', '256889', '297117', '297207', '297218', '297219', '297228', '297230', '297232', '297233', '297235', '313196', '325820', '325824', '361755']
    nosy_count = 9.0
    nosy_names = ['pitrou', 'vstinner', 'christian.heimes', 'zmedico', 'neologix', 'yselivanov', 'Adam.Bishop', 'martius', 'Christian H']
    pr_nums = []
    priority = 'normal'
    resolution = 'out of date'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue21998'
    versions = ['Python 3.6']

    @vstinner
    Member Author

    It looks like asyncio does not support fork() currently: on fork(), the event loops of the parent and the child process share the same "self pipe".

    Example:
    ---

    import asyncio, os
    loop = asyncio.get_event_loop()
    pid = os.fork()
    if pid:
        print("parent", loop._csock.fileno(), loop._ssock.fileno())
    else:
        print("child", loop._csock.fileno(), loop._ssock.fileno())

    Output:
    ---
    parent 6 5
    child 6 5
    ---

    I'm not sure that it's relevant to use asyncio with fork, but I wanted to open an issue at least as a reminder.

    I found the "self pipe" + fork issue when reading the issue bpo-12060, while working on a fix for the signal handling race condition on FreeBSD: issue bpo-21645.

    @vstinner
    Member Author

    Oh, I wanted to use the atfork module for that, but then I saw that it does not exist :-(

    @vstinner
    Member Author

    A first step would be to document the issue in the "developer" section of the asyncio documentation. Mention that the event loop should be closed and a new event loop should be created.
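
    A minimal sketch of that recommendation: close the inherited loop in the child and create a fresh one. (Note that, as the comments below show, loop.close() itself was not yet safe to call in a forked child at the time.)

        import asyncio, os

        loop = asyncio.get_event_loop()
        pid = os.fork()
        if pid == 0:
            # Child: discard the inherited loop and start from scratch.
            loop.close()
            loop = asyncio.new_event_loop()
            asyncio.set_event_loop(loop)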

    @1st1
    Member

    1st1 commented Sep 14, 2014

    It's not that it doesn't work after fork, right? Should we add a recipe with pid monitoring and self-pipe re-initialization?

    @gvanrossum
    Member

    Actually I expect that if you share an event loop across different processes via fork, everything's a mess -- whenever an FD becomes ready, won't both the parent and the child be woken up? Then both would attempt to read from it. One would probably get EWOULDBLOCK (assuming all FDs are actually in non-blocking mode), but it would still be a mess. The specific mess for the self-pipe would be that the race condition it's intended to solve might come back.

    It's possible that some polling syscall might have some kind of protection against forking, but the Python data structures that map FDs to handlers don't know that, so it would still be a mess.

    Pretty much the only thing you should expect to be able to do safely after forking is closing the event loop -- and I'm not even 100% sure that that's safe (I don't know what happens to a forked executor).

    Is there a use case for sharing an event loop across forking?

    @1st1
    Member

    1st1 commented Sep 15, 2014

    Is there a use case for sharing an event loop across forking?

    I don't think so. I use forking mainly for the following two use-cases:

    • Socket sharing for web servers. Solution: if you want to share sockets between multiple child processes, just open them in the master process, fork as many times as you want, and start event loops in the child processes only (see the sketch after this list).

    • Sometimes you want to spawn a child process "on demand". Solution: fork before any loop is running, and use the child process as a "template": when you need a new child process, just ask the first child to fork it.
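
    A sketch of the first pattern (the handler, address and worker count here are illustrative, not from the thread):

        import asyncio, os, socket

        async def handle(reader, writer):
            writer.write(await reader.read(1024))
            await writer.drain()
            writer.close()

        # Open the listening socket in the master process, before any loop exists.
        sock = socket.socket()
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind(("127.0.0.1", 8888))
        sock.listen(100)

        for _ in range(4):
            if os.fork() == 0:
                # Child: the first and only loop of this process is created
                # after the fork, so no loop state is shared with the parent.
                loop = asyncio.new_event_loop()
                asyncio.set_event_loop(loop)
                loop.run_until_complete(asyncio.start_server(handle, sock=sock))
                loop.run_forever()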

    It would certainly be handy to have the ability to fork while the loop is running and safely close (and reopen) it in the forked child process. But now I can see that it's a non-trivial thing to do properly. Probably it's ~somewhat~ safe to re-initialize epoll (or whatever selector we use), re-open the self-pipe, etc., drop all queued callbacks and clear the Task.all_tasks collection, but even then it's easy to miss something.

    I think we just need to make sure that we have documented that asyncio loops are not fork-safe, and that forks in running loops should be avoided by all means.

    @vstinner
    Member Author

    Is there a use case for sharing an event loop across forking?

    I don't know if asyncio must have built-in support for this use case. The minimum is to document the behaviour, or maybe even suggest a recipe to support it.

    For example, an asyncio event loop is not thread-safe and we don't want to support that use case. But I wrote a short documentation section with code snippets to show how to work around this issue:
    https://docs.python.org/dev/library/asyncio-dev.html#concurrency-and-multithreading

    We need a similar section to explain how to use asyncio with the os.fork() function and the multiprocessing module.

    @gvanrossum
    Member

    That sounds about right -- it's a doc issue. Let me propose a paragraph:

    """
    NOTE: It is not safe to share an asyncio event loop between processes that are related by os.fork(). If an event loop exists in a process, and that process is forked, the only safe operation on the loop in the child process is to call its close() method.
    """

    (I don't want to have to research what the various polling primitives do on fork(), so I don't want to suggest that it's okay to close the loop in the parent and use it in the child.)

    A similar note should probably be added to the docs for the selectors module.

    @Martiusweb
    Member

    Hi,

    Actually, closing and creating a new loop in the child doesn't work either, at least on Linux.

    When, in the child, we call loop.close(), it performs:
    self.remove_reader(self._ssock)
    (in selector_events.py, _close_self_pipe() around line 85)

    Both the parent and the child still refer to the same underlying epoll structure, at that moment, and calling remove_reader() has an effect on the parent process too (which will never watch the self-pipe again).

    I attached a test case that demonstrates the issue (and the workaround, commented).
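
    test2.py itself is not inlined in this thread; a minimal reproduction along the lines Martin describes might look like this (a sketch, not the actual attachment):

        import asyncio, os

        loop = asyncio.get_event_loop()
        pid = os.fork()
        if pid == 0:
            # Child: close() calls _close_self_pipe(), whose remove_reader()
            # unregisters the self-pipe fd from the epoll object that is
            # still shared with the parent.
            loop.close()
            os._exit(0)

        os.waitpid(pid, 0)
        # Parent: the self-pipe fd is gone from the shared epoll set, so
        # this loop will never be woken up through the self-pipe again.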

    @gvanrossum
    Member

    Martin, what is the magic call to make in the child (or in the parent pre-fork???) to disable the epoll object in the child without disabling it in the parent?

    (Perhaps just closing the selector and letting the unregister calls fail would work?)

    @Martiusweb
    Member

    Guido,

    Currently in my program, I manually remove and then re-add the reader to the loop in the parent process right after the fork(). I also considered a dirty monkey-patching of remove_reader() and remove_writer() which would act like the original versions but without removing the fds from the epoll object (ensuring I don't get bitten by the same behavior for another fd).

    The easiest fix, I think, is indeed to close the selector without unregistering the fds, but I don't know if doing so would have undesired side effects on platforms other than Linux (resource leaks, or the close call failing, maybe).

    @Martiusweb
    Member

    I said something wrong in my previous comment: removing and re-adding the reader callback right after the fork() is obviously subject to a race condition.

    I'll go for the monkey patching.

    @gvanrossum
    Member

    How about not doing anything in the parent but in the child, closing the
    selector and then the event loop?


    @Martiusweb
    Member

    Currently, this is what I do in the child after the fork:

        selector = loop._selector
        parent_class = selector.__class__.__bases__[0]
        selector.unregister = lambda fd: parent_class.unregister(selector, fd)

    It replaces unregister() by _BaseSelectorImpl.unregister(), so "our" data structures are still cleaned (the dict _fd_to_key, for instance).

    If a fix for this issue is desired in Tulip, the first solution proposed by Guido (closing the selector and letting the unregister calls fail; see the -trivial- patch attached) is probably good enough.

    @vstinner
    Member Author

    vstinner commented Dec 8, 2014

    I suggest splitting this issue: create a new issue focused on selectors.EpollSelector, which doesn't behave well with forking. If I understood correctly, you can work around this specific issue by forcing the selector to selectors.SelectSelector, for example, right?
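
    A sketch of that workaround (select() keeps no persistent kernel object, so a SelectSelector is not shared across fork the way an epoll fd is):

        import asyncio, selectors

        selector = selectors.SelectSelector()
        loop = asyncio.SelectorEventLoop(selector)
        asyncio.set_event_loop(loop)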

    @vstinner
    Member Author

    vstinner commented Feb 4, 2015

    Attached at_fork.patch: detect and handle fork.

    • Add _at_fork() method to asyncio.BaseEventLoop
    • Add _detect_fork() method to asyncio.BaseEventLoop
    • Add _at_fork() method to selectors.BaseSelector

    I tried to minimize the number of calls to _detect_fork(): only when the self-pipe or the selector is used.

    I only tried test2.py. More tests using two processes running two event loops should be done, and non-regression tests should be written.

    The issue bpo-22087 (multiprocessing) looks like a duplicate of this issue.

    @vstinner
    Member Author

    vstinner commented Feb 4, 2015

    close_self_pipe_after_selector.patch only fixes test2.py; it doesn't fix the general case: running the "same" event loop in two different processes.

    @vstinner
    Member Author

    vstinner commented Feb 5, 2015

    Updated patch:

    • drop PollSelector._at_fork(): PollSelector is not shared with the parent process
    • _at_fork() of BaseEventLoop and SelectorEventLoop now does nothing by default: only _UnixSelectorEventLoop._at_fork() handles the fork; nothing is needed on Windows
    • patch written for Python 3.5, not for Tulip (different directories)

    @vstinner changed the title from "asyncio: a new self-pipe should be created in the child process after fork" to "asyncio: support fork" on Feb 15, 2015
    @vstinner
    Member Author

    Can someone review at_fork-2.patch?

    Martin: Can you please try my patch?

    @Martiusweb
    Member

    I read the patch; it looks good to me for Python 3.5. It will (obviously) not work with Python 3.4 since self._selector won't have an _at_fork() method.

    I ran the tests on my project with Python 3.5a1 and the patch; it seems to work as expected, i.e. when I close the loop of the parent process in the child, it does not affect the parent.

    I don't have a case where the loop of the parent is still used in the child though.

    @vstinner
    Member Author

    It will (obviously) not work with Python 3.4 since self._selector won't have an _at_fork() method.

    asyncio doc contains:
    "The asyncio package has been included in the standard library on a provisional basis. Backwards incompatible changes (up to and including removal of the module) may occur if deemed necessary by the core developers."

    That's not the case for selectors. Even though it would be possible to implement selector._at_fork() in asyncio, it would make more sense to implement it in the selectors module.

    @neologix: Would you be ok to add a *private* _at_fork() method to selectors classes in Python 3.4 to fix this issue?

    I know that you are not a fan of fork, me neither, but users like to do crazy things with fork and then report bugs to asyncio :-)

    @Martiusweb
    Member

    In that case, I suggest a small addition to your patch that would do the trick:

    in unix_events.py:
    +    def _at_fork(self):
    +        super()._at_fork()
    +        self._selector._at_fork()
    +        self._close_self_pipe()
    +        self._make_self_pipe()

    becomes:

    +    def _at_fork(self):
    +        super()._at_fork()
    +        if not hasattr(self._selector, '_at_fork'):
    +            return
    +        self._selector._at_fork()
    +        self._close_self_pipe()
    +        self._make_self_pipe()

    @neologix
    Mannequin

    neologix mannequin commented Feb 17, 2015

    @neologix: Would you be ok to add a *private* _at_fork() method to selectors classes in Python 3.4 to fix this issue?

    Not really: after fork(), you're hosed anyway:
    """

       Q6  Will closing a file descriptor cause it to be removed from
    

    all epoll sets automatically?

       A6  Yes, but be aware of the following point.  A file
    

    descriptor is a reference to an open file description
    (see open(2)). Whenever a descriptor is duplicated via
    dup(2), dup2(2), fcntl(2) F_DUPFD, or fork(2), a
    new file descriptor referring to the same open file
    description is created. An open file description
    continues to exist until all file descriptors referring
    to it have been closed. A file descriptor is
    removed from an epoll set only after all the file
    descriptors referring to the underlying open file
    description have been closed (or before if the
    descriptor is explicitly removed using epoll_ctl(2)
    EPOLL_CTL_DEL). This means that even after a file
    descriptor that is part of an epoll set has been
    closed, events may be reported for that file descriptor if
    other file descriptors referring to the same
    underlying file description remain open.
    """

    What would you do with the selector after fork(): register the FDs in
    a new epoll, remove them?

    There's no sensible default behavior, and I'd rather avoid polluting
    the code for this.
    If asyncio wants to support this, it can create a new selector and
    re-register everything it wants manually: there's a Selector.get_map()
    exposing all that's needed.
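
    A sketch of what this suggests, using only the public selectors API (the helper name is illustrative):

        import selectors

        def rebuild_selector(old):
            # Register every fd watched by `old` into a brand-new selector,
            # leaving the old, fork-shared epoll object untouched.
            new = selectors.DefaultSelector()
            for key in old.get_map().values():
                new.register(key.fileobj, key.events, key.data)
            return new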

    @vstinner
    Member Author

    > What would you do with the selector after fork(): register the FDs in
    > a new epoll, remove them?

    See the patch:

    +    def _at_fork(self):
    +        # don't unregister file descriptors: epoll is still shared with
    +        # the parent process
    +        self._epoll = select.epoll()
    +        for key in self._fd_to_key.values():
    +            self._register(key)

    EpollSelector._at_fork() does nothing to the current epoll object; it
    creates a new epoll object and registers all file descriptors again.

    Hum, I should maybe close the old epoll object explicitly.

    > There's no sensible default behavior, and I'd rather avoid polluting
    > the code for this.

    What is wrong with the proposed patch?

    > If asyncio wants to support this, it can create a new selector and
    > re-register everything it wants manually: there's a Selector.get_map()
    > exposing all that's needed.

    If possible, I would prefer to implement "at fork" in the selectors
    module directly; the selectors module has better knowledge of
    selectors. For example, asyncio is not aware of the selector._epoll
    attribute.

    @Martiusweb
    Member

    The goal of the patch is to create a duplicate selector (a new epoll() structure watching the same fds as the original epoll). It allows removing fds watched in the child's loop without impacting the parent process.

    Actually, it's true that with the current implementation of the selectors module (using get_map()), we can achieve the same result as with Victor's patch without touching the selectors module. I attached a patch doing that, also working with Python 3.4.

    I thought about this at_fork() mechanism a bit more, and I'm not sure what we want to achieve with it. In my opinion, most of the time we will want to recycle the loop in the child process (close it and create a new one), because we will not want the tasks and callbacks scheduled on the loop to run in both the parent and the child (that would probably result in double writes on sockets, or double reads, for instance).

    With the current implementation of asyncio, I can't recycle the loop for a single reason: closing the loop calls _close_self_pipe(), which unregisters the pipe from the selector (hence breaking the loop in the parent). Since the self-pipe is an object internal to the loop, I think it's safe to close the pipes without unregistering them from the selector. It is at least true with epoll() according to the documentation quoted by neologix, and I hope we can expect it to be true on other Unix platforms too.

    @vstinner
    Member Author

    How do other event loops handle fork? Twisted, Tornado, libuv, libev,
    libevent, etc.

    @1st1
    Member

    1st1 commented May 26, 2015

    > How do other event loops handle fork? Twisted, Tornado, libuv, libev,
    > libevent, etc.

    It looks like using fork() while an event loop is running isn't recommended in any of the above. If I understand the code correctly, libev & gevent reinitialize loops in the forked process (essentially, you have a new loop).

    I think we have the following options:

    1. Document that using fork() is not recommended.

    2. Detect fork() and re-initialize event loop in the child process (cleaning-up callback queues, re-initializing selectors, creating new self-pipe).

    3. Detect fork() and raise a RuntimeError. Document that asyncio event loop does not support forking at all.

    4. The most recent patch by Martin detects the fork() and reinitializes self-pipe and selector (although all FDs are kept in the new selector). I'm not sure I understand this option.

    I'm torn between 2 & 3. Guido, Victor, Martin, what do you think?

    @Martiusweb
    Member

    Hi,

    My patch was a variation of haypo's patch. The goal was to duplicate the
    loop and its internal objects (loop and self-pipes) without changing much
    of its state from the outside (keeping callbacks and active tasks). I
    wanted to be conservative with this patch, but it is not the option I
    prefer.

    I think that raising a RuntimeError in the child is fine, but may not be
    enough:

    Imho, saying "the loop can't be used anymore in the child" is fine, but "a
    process in which lives an asyncio loop must not be forked" is too
    restrictive (I'm not thinking of the fork+exec case, which is probably fine
    anyway) because a library may rely on child processes, for instance.

    Hence, we should allow a program to fork and eventually dispose of the
    loop's resources by calling loop.close() - or any other mechanism that
    you see fit (clearing all references to the loop is tedious because of the
    global default event loop and the cycles between futures/tasks and the
    loop).

    However, the normal loop.close() sequence will unregister all the fds
    registered to the selector, which will impact the parent. Under Linux with
    epoll, it's fine if we only close the selector.

    I would therefore, in the child after a fork, close the loop without
    breaking the selector state (closing without unregister()'ing fds), unset
    the default loop so get_event_loop() would create a new loop, then raise
    RuntimeError.

    I can elaborate on the use case I care about, but in a nutshell, doing so
    would allow spawning worker processes able to create their own loop without
    requiring an idle "blank" child process to be used as a base for the
    workers. It adds the benefit, for instance, of allowing data to be shared
    between the parent and the child by leveraging OS copy-on-write.



    @1st1
    Member

    1st1 commented May 26, 2015

    > I would therefore, in the child after a fork, close the loop without
    > breaking the selector state (closing without unregister()'ing fds), unset
    > the default loop so get_event_loop() would create a new loop, then raise
    > RuntimeError.
    >
    > I can elaborate on the use case I care about, but in a nutshell, doing so
    > would allow spawning worker processes able to create their own loop without
    > requiring an idle "blank" child process to be used as a base for the
    > workers. It adds the benefit, for instance, of allowing data to be shared
    > between the parent and the child by leveraging OS copy-on-write.

    The only solution to safely fork a process is to fix loop.close() to
    check if it's called from a forked process and to close the loop in
    a safe way (to avoid breaking the master process). In this case
    we don't even need to throw a RuntimeError. But we won't have a
    chance to guarantee that all resources will be freed correctly (right?)

    So the idea is (I guess it's the 5th option):

    1. If the forked child doesn't call loop.close() immediately after
       forking, we raise RuntimeError on the first loop operation.

    2. If the forked child calls loop.close() explicitly -- it's fine,
       we just close it, and the error won't be raised. When we close, we only
       close the selector (without unregistering or re-registering any FDs),
       and we clean up the callback queues without trying to close anything.

    Guido, do you still think that raising a "RuntimeError" in a child
    process in an unavoidable way is a better option?
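
    A rough sketch of such a fork-aware close() (entirely hypothetical: the mixin and its use of loop internals are illustrative, not the actual asyncio implementation):

        import os

        class ForkAwareCloseMixin:
            def __init__(self):
                super().__init__()
                self._owner_pid = os.getpid()

            def close(self):
                if os.getpid() != self._owner_pid:
                    # Forked child: drop the selector object without
                    # unregistering fds the parent still watches, and
                    # discard queued callbacks without closing anything.
                    self._selector.close()
                    self._ready.clear()
                    return
                super().close()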

    @Martiusweb
    Member

    > The only solution to safely fork a process is to fix loop.close() to
    > check if it's called from a forked process and to close the loop in
    > a safe way (to avoid breaking the master process). In this case
    > we don't even need to throw a RuntimeError. But we won't have a
    > chance to guarantee that all resources will be freed correctly (right?)

    If all the tasks are cancelled and the loop's internal structures (callback
    lists, task sets, etc.) are cleared, I believe that the garbage collector
    will eventually be able to dispose of everything.

    However, it's indeed not enough: resources created by other parts of
    asyncio may leak (transports, subprocesses). For instance, I proposed
    adding a "detach()" method to SubprocessTransport in
    http://bugs.python.org/issue23540 : in that case, I need to close the
    stdin, stdout and stderr pipes without killing the subprocess.


    @gvanrossum
    Member

    I don't actually know if the 5th option is possible. My strong requirement is that no matter what the child process does, the parent should still be able to continue using the loop. IMO it's better to leak an FD in the child than to close a resource owned by the parent. Within those constraints I'm okay with various solutions.

    @larryhastings
    Contributor

    Surely this is too late for 3.5?

    @vstinner
    Member Author

    vstinner commented Sep 4, 2015

    Surely this is too late for 3.5?

    I'm not 100% convinced that asyncio must support fork, so it's too late :-) Anyway, we don't care, asyncio will be under provisional status for one more cycle (3.5) :-p

    @larryhastings
    Contributor

    I've remarked it as "normal" priority and moved it to 3.6. Not my problem anymore! :D

    @AdamBishop
    Mannequin

    AdamBishop mannequin commented Dec 23, 2015

    A note about this issue should really be added to the documentation - on OS X, it fails with the rather nonsensical "OSError: [Errno 9] Bad file descriptor", making it very hard to debug.

    I don't have any specific requirement for fork support in asyncio as it's trivial to move loop creation after the fork, but having to run the interpreter through GDB to diagnose the problem is not a good state of affairs.

    @vstinner
    Member Author

    Python 3.7 got a new os.register_at_fork() function. I don't know if it could help:
    https://docs.python.org/dev/library/os.html#os.register_at_fork

    Can we close this issue? Sorry, I lost track of this issue and I see no activity since the end of 2015...

    @pitrou
    Member

    pitrou commented Jun 28, 2017

    Python 3.7 got a new os.register_at_fork() function. I don't know if it could help:

    The most reasonable IMHO would be for it to mark the event loop "broken" (or closed?) in the child, to forbid any further use.

    By the way, on Python 3 (which is pretty much required by asyncio), I really suggest using the "forkserver" method of multiprocessing; it removes a ton of hassle with inheritance through forking.

    @vstinner
    Member Author

    The most reasonable IMHO would be for it to mark the event loop "broken" (or closed?) in the child, to forbid any further use.

    Hum, the problem is that Python cannot guess whether the event loop will be
    used in the child or the parent process :-/ The problem only occurs
    when the event loop is used in both processes, no?

    @pitrou
    Member

    pitrou commented Jun 28, 2017

    Possible answer: have a global WeakSet of event loops. In the child fork handler, iterate over all event loops and "break" those that have already been started.
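
    A sketch of that suggestion on top of os.register_at_fork() (the WeakSet registration and the "broken" flag are assumptions for illustration, not how asyncio worked at the time):

        import os, weakref

        _all_loops = weakref.WeakSet()   # each loop would add itself here on creation

        def _break_started_loops():
            # Runs in the child immediately after fork().
            for loop in _all_loops:
                if loop.is_running():
                    loop._broken = True  # hypothetical flag checked by loop operations

        os.register_at_fork(after_in_child=_break_started_loops)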

    @1st1
    Member

    1st1 commented Jun 28, 2017

    Possible answer: have a global WeakSet of event loops. In the child fork handler, iterate over all event loops and "break" those that have already been started.

    We can do this but only in debug mode.

    @vstinner
    Member Author

    A compromise for the short term would be to detect fork in debug mode
    and raise an exception, and explain how to fix such a bug. What do you
    think?

    @1st1
    Member

    1st1 commented Jun 28, 2017

    A compromise for the short term would be to detect fork in debug mode
    and raise an exception, and explain how to fix such a bug. What do you
    think?

    I'd prefer to fix it properly in 3.7.

    @pitrou
    Member

    pitrou commented Jun 28, 2017

    I'm not sure why it would be debug-only. You usually don't fork() often, and you don't have many event loops around, so the feature sounds cheap.

    In any case, I'm not directly affected by this issue, I'm merely suggesting options.

    @1st1
    Member

    1st1 commented Jun 28, 2017

    I'm not sure why it would be debug-only. You usually don't fork() often, and you don't have many event loops around, so the feature sounds cheap.

    I think you're right. If it's low or zero overhead we can have the check always enabled.

    @zmedico
    Mannequin

    zmedico mannequin commented Mar 4, 2018

    I'm not sure about possible use cases that might conflict with this approach, but using a separate event loop for each pid seems very reasonable to me, as follows:

        import asyncio
        import os

        _default_policy = asyncio.get_event_loop_policy()
        _pid_loop = {}

        class MultiprocessingPolicy(asyncio.AbstractEventLoopPolicy):
            def get_event_loop(self):
                # Key loops by pid so a forked child never reuses the
                # loop inherited from its parent.
                pid = os.getpid()
                loop = _pid_loop.get(pid)
                if loop is None:
                    loop = self.new_event_loop()
                    _pid_loop[pid] = loop
                return loop

            def new_event_loop(self):
                return _default_policy.new_event_loop()

        asyncio.set_event_loop_policy(MultiprocessingPolicy())
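
    With this policy installed, asyncio.get_event_loop() called in a freshly forked child returns a brand-new loop keyed by the child's pid rather than the loop inherited from the parent. (The other AbstractEventLoopPolicy methods, such as set_event_loop(), are left unimplemented in this snippet, so only get_event_loop() and new_event_loop() are usable.)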

    @vstinner
    Member Author

    I'm torn between 2 & 3. Guido, Victor, Martin, what do you think?

    Give up, document that fork() is not supported and close the issue :-) IMHO it's not worth it.

    @1st1
    Member

    1st1 commented Sep 19, 2018

    I'll revisit this later.

    @vstinner
    Member Author

    There has been no activity for 2 years. Asyncio is mature now. It seems like users have learnt how to work around this issue, or don't use fork() with asyncio.

    I'm closing the issue.

    @kristjanvalur
    Contributor

    As far as I can tell, the problems of fork() and asyncio are not documented. It would be useful at least to have it mentioned in the docs that forking a running event loop is not supported.

    @pitrou
    Member

    pitrou commented Aug 24, 2023

    Well, fork() is generally discouraged, especially in networking applications. If you're using multiprocessing, the "forkserver" method is a reliable and safe alternative.

    That said, I agree it would be useful to document. Perhaps you would like to open a separate issue for that?
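
    A minimal sketch of the forkserver pattern (main() is a placeholder coroutine; asyncio.run() exists since Python 3.7):

        import asyncio
        import multiprocessing as mp

        async def main():
            await asyncio.sleep(0)  # placeholder for real work

        def worker():
            # Each worker builds its own event loop from scratch; nothing
            # loop-related is inherited from the parent through fork().
            asyncio.run(main())

        if __name__ == "__main__":
            ctx = mp.get_context("forkserver")
            p = ctx.Process(target=worker)
            p.start()
            p.join()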

    @kristjanvalur
    Contributor

    Sounds fine.
