Python crashes on macOS after fork with no exec #77906

Closed
kapilt mannequin opened this issue Jun 1, 2018 · 70 comments
Labels
3.8 only security fixes OS-mac type-crash A hard crash of the interpreter, possibly with a core dump

Comments

@kapilt
Mannequin

kapilt mannequin commented Jun 1, 2018

BPO 33725
Nosy @ronaldoussoren, @ned-deily
PRs
  • bpo-33725: skip test_multiprocessing_fork on macOS #11043
  • [3.7] bpo-33725: skip test_multiprocessing_fork on macOS (GH-11043) #11044
  • [3.6] bpo-33725: skip test_multiprocessing_fork on macOS (GH-11043) #11045
  • bpo-33725: multiprocessing uses spawn by default on macOS #13603
  • [3.7] bpo-33725: multiprocessing uses spawn by default on macOS (GH-13603) #13626
  • bpo-33725, multiprocessing doc: rephase warning against fork on macOS #13841
  • [3.8] bpo-33725, multiprocessing doc: rephase warning against fork on macOS (GH-13841) #13849
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = <Date 2020-05-29.18:09:22.220>
    created_at = <Date 2018-06-01.00:53:06.418>
    labels = ['OS-mac', '3.8', 'type-crash']
    title = 'Python crashes on macOS after fork with no exec'
    updated_at = <Date 2021-11-04.14:32:41.053>
    user = 'https://github.com/kapilt'

    bugs.python.org fields:

    activity = <Date 2021-11-04.14:32:41.053>
    actor = 'eryksun'
    assignee = 'none'
    closed = True
    closed_date = <Date 2020-05-29.18:09:22.220>
    closer = 'barry'
    components = ['macOS']
    creation = <Date 2018-06-01.00:53:06.418>
    creator = 'kapilt'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 33725
    keywords = ['patch']
    message_count = 70.0
    messages = ['318352', '318361', '318396', '318397', '318528', '318529', '318708', '329871', '329880', '329885', '329919', '329922', '329923', '329926', '329927', '329933', '329941', '331101', '331406', '331407', '331409', '331411', '331435', '331438', '331459', '331610', '331733', '331735', '337587', '337591', '337733', '338819', '338873', '341452', '341455', '341475', '342042', '342071', '342412', '343704', '343773', '343779', '343782', '343807', '343826', '343828', '343830', '343832', '343833', '343838', '343841', '343842', '343844', '343895', '343898', '344590', '344608', '344710', '344762', '344763', '345841', '365249', '365251', '365252', '365262', '365263', '365266', '365281', '370296', '370331']
    nosy_count = 2.0
    nosy_names = ['ronaldoussoren', 'ned.deily']
    pr_nums = ['11043', '11044', '11045', '13603', '13626', '13841', '13849']
    priority = 'critical'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'crash'
    url = 'https://bugs.python.org/issue33725'
    versions = ['Python 3.8']


    This issue seems to have been reported a few times on various GitHub projects. I've also reproduced it using a brew install of Python 2.7.15. I haven't been able to reproduce it with Python 3.6. Note that this requires a framework build of Python.

    Background on the underlying cause, a change in High Sierra:
    http://sealiesoftware.com/blog/archive/2017/6/5/Objective-C_and_fork_in_macOS_1013.html
    A Ruby perspective on the same issue, which shows up in some apps:
    https://blog.phusion.nl/2017/10/13/why-ruby-app-servers-break-on-macos-high-sierra-and-what-can-be-done-about-it/

    The workaround seems to be setting the environment variable OBJC_DISABLE_INITIALIZE_FORK_SAFETY prior to executing Python.

    Other reports

    https://bugs.python.org/issue30837
    ansible/ansible#32499
    imWildCat/scylla#22
    elastic/beats-tester#73
    jhaals/ansible-vault#60

    @kapilt kapilt mannequin added the OS-mac label Jun 1, 2018
    @ronaldoussoren
    Contributor

    A better solution is to avoid using the fork start method for multiprocessing. The spawn and forkserver start methods should work fine.

    The underlying problem is that macOS system frameworks (basically anything higher level than libc) are not safe with respect to fork(2), and fixing that appears to have no priority at all at Apple.
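A minimal sketch of avoiding the fork start method, assuming Python 3.4 or later. The helper names `non_fork_context` and `parallel_sqrt` are hypothetical and only serve the illustration; `multiprocessing.get_context` is the standard-library call that does the work.

```python
import math
import multiprocessing as mp


def non_fork_context(preferred="spawn"):
    """Return a multiprocessing context that never uses plain fork."""
    if preferred not in ("spawn", "forkserver"):
        raise ValueError("expected 'spawn' or 'forkserver'")
    # get_context() leaves the global default start method untouched.
    return mp.get_context(preferred)


def parallel_sqrt(values, processes=2):
    """Compute square roots in worker processes started without plain fork."""
    with non_fork_context().Pool(processes=processes) as pool:
        # math.sqrt is importable by name, so it pickles cleanly for spawn.
        return pool.map(math.sqrt, values)
```

With spawn, `parallel_sqrt` should be called from under an `if __name__ == "__main__":` guard, because worker processes re-import the main module when they start.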

    @ned-deily
    Member

    (As a side note, the macOS Pythons provided by python.org installers should not behave differently on macOS 10.13 High Sierra since none of them are built with a 10.13 SDK.)

    @pitrou
    Member

    pitrou commented Jun 1, 2018

    I understand that Apple, with their limited resources, cannot spend expensive engineer manpower on improving POSIX support in macOS </snark>.

    In any case, I'm unsure this bug can be fixed at the Python level. If macOS APIs don't like fork(), they don't like fork(), full stop. As Ronald says, on 3.x you should use "forkserver" (for multiple reasons, not only this issue). On 2.7 you're stuck dealing with the issue by yourself.

    @ronaldoussoren
    Contributor

    Antoine, the issue is not necessarily related to POSIX compliance; AFAIK strictly POSIX-compliant code should work just fine. The problem is in higher-level APIs (CoreFoundation, Foundation, AppKit, ...), and appears to be related to using multi-threading in those libraries without spending effort on pre/post fork handlers to ensure that new processes are in a sane state after fork(). In older macOS versions this could result in hard-to-debug issues; in newer versions the APIs seem to guard against this by aborting when they detect that the pid changed.
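The "abort when the pid changed" guard described above can be illustrated in a few lines of Python. This is only a sketch of the general pattern, not Apple's implementation; the class name `ForkGuard` is hypothetical.

```python
import os


class ForkGuard:
    """Remember the creating process and refuse to be used in any other."""

    def __init__(self):
        self._owner_pid = os.getpid()

    def check(self):
        current = os.getpid()
        if current != self._owner_pid:
            # A forked child inherits this object but has a different pid.
            raise RuntimeError(
                f"created in pid {self._owner_pid}, used in pid {current}; "
                "state inherited across fork() is not trusted"
            )
```

A library holding locks or background threads could call `check()` at each entry point, failing loudly in a forked child instead of deadlocking on state that only made sense in the parent.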

    Anyway... I agree that we shouldn't try to work around this in CPython; there are bound to be more problems that are hidden by the proposed workaround.

    ---

    <http://www.sealiesoftware.com/blog/archive/2017/6/5/Objective-C_and_fork_in_macOS_1013.html> describes what the environment variable does: it "just" changes the behavior of the ObjC runtime and doesn't make using macOS system frameworks after a fork any safer.

    @ronaldoussoren
    Contributor

    @ned: In the long run the macOS installers should be built using the latest SDK, primarily to get full API coverage and access to all system APIs.

    AFAIK building using the macOS 10.9 SDK still excludes a number of libSystem APIs that would be made available through the posix module when building with a newer SDK.

    That's something that would require some effort, though, to ensure that the resulting binary still works on older versions of macOS (basically similar to the work I've done in the past to weak link some other symbols in the posix module).

    @ned-deily
    Member

    (Note: this is not particularly relevant to the issue here.)

    Ronald:

    In the long run the macOS installers should be built using the latest SDK [...] That's something that would require some effort though to ensure that the resulting binary still works on older versions of macOS

    I agree that being able to build with the latest SDK would be nice but it's also true it would require effort on our part, both one-time and ongoing, at least for every new macOS SDK release and update to test with each older system. It would also require that the third-party libraries we build for an installer also behave correctly. And to make full use of it, third-party Python packages with extension modules would also need to behave correctly. I see one of the primary use cases for the python.org macOS installers as being for Python app developers who want to provide apps that run on a range of macOS releases. It seems to me that the safest and simplest way to guarantee that python.org macOS Pythons fulfill that need is to continue to always build them on the oldest supported system. Yes, that means that users may miss out on a few features only supported on the more recent macOS releases but I think that's the right trade-off until we have the resources to truly investigate and decide to support weak linking from current systems.

    @warsaw
    Member

    warsaw commented Nov 13, 2018

    bpo-35219 is where I've run into this problem. I'm still trying to figure out all the details in my own case, but I can confirm that setting the environment variable does not always help.

    @warsaw warsaw added 3.7 (EOL) end of life 3.8 only security fixes labels Nov 13, 2018
    @warsaw warsaw changed the title High Sierra hang when using multi-processing macOS crashes after fork with no exec Nov 13, 2018
    @warsaw warsaw changed the title macOS crashes after fork with no exec Pytho crashes on macOS after fork with no exec Nov 13, 2018
    @warsaw warsaw changed the title Pytho crashes on macOS after fork with no exec Python crashes on macOS after fork with no exec Nov 14, 2018
    @warsaw
    Member

    warsaw commented Nov 14, 2018

    Hoo boy. I'm not sure I have the full picture, but things are starting to come into focus. After much debugging, I've narrowed down at least one crash to urllib.request.getproxies(). On macOS (darwin), this ends up calling _scproxy.get_proxies() which calls into the SystemConfiguration framework. I'll bet dollars to donuts that that calls into the ObjC runtime. Thus it is unsafe to call between fork and exec. This certainly seems to be the case even if the environment variable is set.

    The problem is that I think requests.post() probably also ends up in here somehow (still untraced), because when we remove our call to urllib.request.getproxies(), we just crash later on when requests.post() is called.

    I don't know what, if anything can be done in Python, except perhaps to document that anything that calls into the ObjC runtime between fork and exec can potentially crash the subprocess.

    @warsaw
    Member

    warsaw commented Nov 14, 2018

    A few other things I don't understand:

    • Why does setting OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES only seem to work when it's set in the shell before the parent process executes? AFAICT, it does *not* work if you set that in os.environ in the parent process before the os.fork().

    • Why does it only crash on the first invocation of our app? Does getproxies() cache the results somehow? There's too much internal application code in the way to know if we're doing something that prevents getproxies() from getting called in subsequent calls.

    • I can't seem to produce a smaller test case.

    @warsaw
    Member

    warsaw commented Nov 14, 2018

    FWIW, I suspect that setting the environment variable only helps if it's done before the process starts. You cannot set it before the fork and have it affect the child.
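One way to act on that observation is to restart the interpreter once with the variable already present in its environment. The following is a sketch under that assumption; the helper names are hypothetical, and per the discussion above the variable only suppresses the abort and does not make forking safe.

```python
import os
import sys

FLAG = "OBJC_DISABLE_INITIALIZE_FORK_SAFETY"


def env_with_flag(base_env):
    """Return a copy of base_env with the fork-safety flag set to YES."""
    env = dict(base_env)
    env[FLAG] = "YES"
    return env


def reexec_with_flag_if_needed():
    """On macOS, restart the interpreter once so the flag exists at startup."""
    if sys.platform != "darwin" or os.environ.get(FLAG) == "YES":
        return  # not macOS, or already restarted with the flag
    # Replaces the current process image; does not return on success.
    # Assumes the program was started as `python script.py ...`;
    # `-c` and `-m` invocations do not round-trip through sys.argv.
    os.execve(
        sys.executable,
        [sys.executable, *sys.argv],
        env_with_flag(os.environ),
    )
```

Calling `reexec_with_flag_if_needed()` as the first statement of a script has the same effect as exporting the variable in the shell, since the new process image sees it from the very beginning.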

    @applio
    Member

    applio commented Nov 14, 2018

    Barry's effort as well as comments in other links seem to all suggest that OBJC_DISABLE_INITIALIZE_FORK_SAFETY is not comprehensive in its ability to make other threads "safe" before forking.

    "Objective-C classes defined by the OS frameworks remain fork-unsafe" (from @kapilt's first link) suggests we furthermore remain at risk using certain MacOS system libraries prior to any call to fork.

    "To guarantee that forking is safe, the application must not be running any threads at the point of fork" (from @kapilt's second link) is an old truth that we continue to fight with even when we know very well that it's the truth.

    For newly developed code, we have the alternative to employ spawn instead of fork to avoid these problems in Python, C, Ruby, etc. For existing legacy code that employed fork and now surprises us by failing-fast on MacOS 10.13 and 10.14, it seems we are forced to face a technical debt incurred back when the choice was first made to spin up threads and afterwards to use fork.

    If we didn't already have an "obvious" (zen of Python) way to avoid such problems with spawn versus fork, I would feel this was something to solve in Python. As to helping the poor unfortunate souls who must fight the good fight with legacy code, I am not sure what to do to help though I would like to be able to help.

    @pitrou
    Member

    pitrou commented Nov 14, 2018

    Legacy code is easy to migrate as long as it uses Python 3. Just call

      mp.set_start_method('forkserver')

    at the top of your code and you're done. Some use cases may fail (if sharing non-picklable types), but they're probably not very common.
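A slightly fuller sketch of that migration, assuming Python 3. The helpers `choose_start_method`, `is_picklable` and `configure` are hypothetical names; `is_picklable` is one way to find the non-picklable objects mentioned above before they fail inside a worker.

```python
import multiprocessing as mp
import pickle


def choose_start_method():
    """Prefer forkserver, falling back to spawn where it is unavailable."""
    available = mp.get_all_start_methods()
    return "forkserver" if "forkserver" in available else "spawn"


def is_picklable(obj):
    """Return True if obj survives pickle.dumps, as process arguments must."""
    try:
        pickle.dumps(obj)
    except Exception:
        return False
    return True


def configure():
    """Call once, early, from under `if __name__ == "__main__":`."""
    # set_start_method raises RuntimeError if a method was already chosen.
    mp.set_start_method(choose_start_method())
```

Module-level functions and plain data pass the check; lambdas and locally defined functions do not, and those are the usual things to refactor when moving off fork.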

    @ned-deily
    Member

    _scproxy has been known to be problematic for some time, see for instance bpo-31818. That issue also gives a simple workaround: setting urllib's "no_proxy" environment variable to "*" will prevent the calls to the System Configuration framework.
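The same workaround can be applied from inside a program, as long as it happens before any proxy lookup. This is a sketch with a hypothetical helper name; whether it fully avoids the System Configuration call on a given Python version is taken from the comment above, not verified here.

```python
import os
import urllib.request


def disable_proxy_lookup():
    """Ask urllib to bypass proxies for every host via no_proxy='*'."""
    # Forked children inherit os.environ, so set this before forking.
    os.environ["no_proxy"] = "*"


def bypasses_proxy(host):
    """Report whether the environment alone says to skip proxies for host."""
    # proxy_bypass_environment only consults environment variables.
    return bool(urllib.request.proxy_bypass_environment(host))
```

After `disable_proxy_lookup()`, `bypasses_proxy("example.com")` is true for any host name, which is the condition the workaround relies on.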

    @applio
    Member

    applio commented Nov 14, 2018

    Given the original post mentioned 2.7.15, I wonder if it is feasible to fork near the beginning of execution, then maintain and pass around a multiprocessing.Pool to be used when needed instead of dynamically forking? Working with legacy code is almost always more interesting than you want it to be.
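A rough sketch of that idea, written to run on both Python 2.7 and 3. The function names are hypothetical, and whether forking early is early enough depends on what the application has already imported and called.

```python
import multiprocessing as mp


def create_pool_early(processes=2):
    """Create the worker pool at startup, before other libraries are used."""
    return mp.Pool(processes=processes)


def word_lengths(pool, words):
    """Use the pool that was passed in rather than forking a new one here."""
    return pool.map(len, words)
```

The application would call `create_pool_early()` once at the top of its entry point and hand the resulting pool to every function that needs parallel work, so that no fork happens after system frameworks have been touched.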

    @warsaw
    Member

    warsaw commented Nov 14, 2018

    On Nov 14, 2018, at 10:11, Davin Potts <report@bugs.python.org> wrote:

    Davin Potts <python@discontinuity.net> added the comment:

    Barry's effort as well as comments in other links seem to all suggest that OBJC_DISABLE_INITIALIZE_FORK_SAFETY is not comprehensive in its ability to make other threads "safe" before forking.

    Right. Setting the env var will definitely not make it thread safe. My understanding (please correct me if I’m wrong!) isn’t that this env var makes it safe, just that it prevents the ObjC runtime from core dumping. So it’s still up to the developer to know whether threads are involved or not. In our cases, these are single threaded applications. I’ve read elsewhere that ObjC doesn’t care if threads have actually been spun up or not.

    "Objective-C classes defined by the OS frameworks remain fork-unsafe" (from @kapilt's first link) suggests we furthermore remain at risk using certain MacOS system libraries prior to any call to fork.

    Actually, it’s unsafe to call anything between fork and exec. Note that this doesn’t just affect Python; this is a pretty common idiom in other scripting languages too, from what I can tell. It’s certainly very common in Python.

    Note too that urllib.request.getproxies() will end up calling into the ObjC runtime via _scproxy, so you can’t even use requests after a fork but before exec.

    What I am still experimenting with is to see if I can define a pthread_atfork handler that will initialize the ObjC runtime before fork is actually called. I saw a Ruby approach like this, but it’s made more difficult in Python because pthread_atfork isn’t exposed to Python. I’m trying to see if I can implement it in ctypes, before I write an extension.

    "To guarantee that forking is safe, the application must not be running any threads at the point of fork" (from @kapilt's second link) is an old truth that we continue to fight with even when we know very well that it's the truth.

    True, but do realize this problem affects you even in single threaded applications.

    For newly developed code, we have the alternative to employ spawn instead of fork to avoid these problems in Python, C, Ruby, etc. For existing legacy code that employed fork and now surprises us by failing-fast on MacOS 10.13 and 10.14, it seems we are forced to face a technical debt incurred back when the choice was first made to spin up threads and afterwards to use fork.

    It’s tech debt you incur even if you don’t spin up threads. Just fork and do some work in the child before calling exec. If that work enters the ObjC runtime (as in the getproxies example), your child will core dump.

    If we didn't already have an "obvious" (zen of Python) way to avoid such problems with spawn versus fork, I would feel this was something to solve in Python. As to helping the poor unfortunate souls who must fight the good fight with legacy code, I am not sure what to do to help though I would like to be able to help.

    *If* we can provide a hook to initialize the ObjC runtime in pthread_atfork, I think that’s something we could expose in Python. Then we can say legacy code can just invoke that, and at least you will avoid the worst outcome.
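For Python-level forks, Python 3.7 and later already expose a comparable hook, `os.register_at_fork`. It is not `pthread_atfork` itself and only runs for forks made through `os.fork()`, so the sketch below swaps in that API; the handler bodies are placeholders, since the thread does not establish what a handler would need to call.

```python
import os

fork_events = []


def before_fork():
    # Hypothetical placeholder: a real hook would do its warm-up work here.
    fork_events.append(("before", os.getpid()))


def after_fork_in_child():
    # Runs in the child right after fork(), before os.fork() returns there.
    fork_events.append(("child", os.getpid()))


# Handlers stay registered for the life of the process.
os.register_at_fork(before=before_fork, after_in_child=after_fork_in_child)
```

Forks performed inside C libraries bypass these handlers, which is one reason a `pthread_atfork`-based approach was being explored instead.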

    @warsaw
    Member

    warsaw commented Nov 15, 2018

    I have a reliable way to call *something* in the pthread_atfork prepare handler, but I honestly don't know what to call to prevent the crash.

    In the Ruby thread, it seemed to say that you could just dlopen /System/Library/Frameworks/Foundation.framework/Foundation but that does not work for me. Neither does also loading the CoreFoundation and SystemConfiguration frameworks.

    If anybody has something that will reliably initialize the runtime, I can post my approach (there are a few subtleties). Short of that, I think there's nothing that can be done except ensure that exec is called right after fork.
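The "exec right after fork" pattern looks roughly like this. It is a sketch for POSIX systems with a hypothetical function name; in practice the standard `subprocess` module covers the same ground.

```python
import os
import sys


def fork_then_exec(argv):
    """Fork and exec immediately, doing no other work in the child."""
    pid = os.fork()
    if pid == 0:
        try:
            os.execvp(argv[0], argv)  # replaces the child process image
        finally:
            os._exit(127)  # reached only if exec itself failed
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)  # child was killed by a signal
```

Because the child runs no Python code between `fork()` and `exec()`, it never gets the chance to call into urllib, requests, or anything else that might reach the ObjC runtime.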

    @ronaldoussoren
    Contributor

    AFAIK there is nothing you can do after calling fork(2) to "reinitialise" the ObjC runtime. And I don't think that's the issue anyway: I suspect that the actual problem is that Apple's system frameworks use multithreading (in particular libdispatch) and don't have code to ensure a sane state after calling fork.

    In Python 3 there is another workaround to avoid problems using multiprocessing: use multiprocessing.set_start_method() to switch away from the "fork" start method to "spawn" or "forkserver" (the latter only when set_start_method is called before any code that might call into Apple system frameworks).

    @ned-deily
    Member

    New changeset ac218bc by Ned Deily in branch 'master':
    bpo-33725: skip test_multiprocessing_fork on macOS (GH-11043)
    ac218bc

    @miss-islington
    Contributor

    New changeset d4bcf13 by Miss Islington (bot) in branch '3.7':
    bpo-33725: skip test_multiprocessing_fork on macOS (GH-11043)
    d4bcf13

    @miss-islington
    Contributor

    New changeset df5d884 by Miss Islington (bot) in branch '3.6':
    bpo-33725: skip test_multiprocessing_fork on macOS (GH-11043)
    df5d884

    @ned-deily
    Member

    Since it looks like multiprocessing_fork is not going to be fixable for macOS, the main issue remaining is how to help users avoid this trap (literally). Should we add a check and issue a warning or error at run time? Or is a doc change sufficient?

    In the meantime, I've merged changes to disable running test_multiprocessing_fork which will sometimes (but not always) segfault on 10.14 Mojave. I should apologize to Barry and others who have run into this. I did notice the occasional segfault when testing with Mojave just prior to its release but it wasn't always reproducible and I didn't follow up on it. Now that the change in 10.14 behavior makes this existing problem with fork no exec more obvious, it's clear that the test segfaults are another manifestation of this.

    @applio
    Member

    applio commented Dec 9, 2018

    Do we really need to disable the running of test_multiprocessing_fork entirely on MacOS?

    My understanding so far is that not *all* of the system libraries on the mac are spinning up threads and so we should expect that there are situations where fork alone may be permissible, but of course we don't yet know what those are. Pragmatically speaking, I have not yet seen a report of test_multiprocessing_fork tests triggering this problem but I would like to see/hear that when it is observed (that's my pitch for leaving the tests enabled).

    @applio
    Member

    applio commented Dec 9, 2018

    @ned.deily: Apologies, I misread what you wrote -- I would like to see the random segfaults that you were seeing on Mojave if you can still point me to a few.

    @warsaw warsaw closed this as completed May 29, 2020
    @ahmedsayeed1982 ahmedsayeed1982 mannequin added stdlib Python modules in the Lib dir 3.7 (EOL) end of life and removed OS-mac 3.8 only security fixes labels Nov 4, 2021
    @eryksun eryksun added OS-mac 3.8 only security fixes and removed stdlib Python modules in the Lib dir 3.7 (EOL) end of life labels Nov 4, 2021
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    mweinelt added a commit to mweinelt/nixpkgs that referenced this issue Oct 4, 2022
    We run pytest with `--forked` in nixpkgs, to reduce side effects that
    can occur when multiple tests mutate their environment in incompatible
    ways.
    
    Forking on macOS 10.13 and later is unsafe when an application does work
    between calls to fork() and its followup exec(). This may lead to
    crashes when calls into the Objective-C runtime are issued, which will
    in turn coredump the Python interpreter.
    
    One good reproducer for this scenario is when the urllib module tries
    to look up proxy configurations in `urllib.request.getproxies()` through
    `get_proxies_macos_sysconf` into the native `_scproxy` module.
    
    This is a class of issues that is of course not limited to the urllib
    module. The general recommendation is to use `spawn` instead of `fork`,
    but we don't have any influence on upstream developers to do one or the
    other.
    
    One often cited workaround would be to disable fork safety entirely on
    calls to `initialize()`, which is probably a better solution than
    running without multithreading (slow) or without the `--forked` (prone
    to side effects) mode.
    
    This currently happens on aarch64-darwin only, where we use the more recent
    11.0 SDK version, while x86_64-darwin has been stuck on 10.12 for a
    while now.
    
    python/cpython#77906 (comment)
    http://www.sealiesoftware.com/blog/archive/2017/6/5/Objective-C_and_fork_in_macOS_1013.html
    
    Closes: NixOS#194290