classification
Title: [macOS] _scproxy.get_proxies() crash -- get_proxies() is not fork-safe?
Type: crash Stage:
Components: macOS Versions: Python 3.8, Python 3.7, Python 3.6, Python 2.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Mirko Friedenhagen, barry, ned.deily, ronaldoussoren, vstinner
Priority: normal Keywords:

Created on 2017-10-19 10:30 by Mirko Friedenhagen, last changed 2018-11-14 20:44 by barry.

Files
File name Uploaded Description
python2.7_2017-10-18-092216-1_lmka-2hpphfdty3.crash Mirko Friedenhagen, 2017-10-19 10:30 crash report with homebrew Python 2.7.14
Messages (8)
msg304614 - (view) Author: Mirko Friedenhagen (Mirko Friedenhagen) Date: 2017-10-19 10:30
The same bug that shows up in https://bugs.python.org/issue30837 is present in both the system Python provided by Apple (2.7.10) and the one installed via Homebrew (2.7.14); see https://github.com/Homebrew/homebrew-core/blob/master/Formula/python.rb for build instructions.

The culprit is the same as before, `_scproxy`. Setting the environment variable `no_proxy` did the trick for me as well. I attached the crash report.
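
For reference, the `no_proxy` workaround can also be applied from inside a Python program, before any fork happens. This is a minimal sketch, assuming that `urllib` takes its proxy settings from the environment (and skips the macOS system configuration) whenever a `*_proxy` variable is set. The value `"*"` (bypass proxies for every host) and the helper name `bypass_system_proxy_lookup` are illustrative assumptions, not something stated in the report.

```python
import os
import urllib.request


def bypass_system_proxy_lookup():
    """Set no_proxy so that urllib finds a proxy setting in the environment."""
    # Assumption: bypassing proxies for all hosts ("*") is acceptable here.
    os.environ["no_proxy"] = "*"


bypass_system_proxy_lookup()

# getproxies_environment() collects every *_proxy variable, keyed by prefix,
# so the "no" entry is now present.
env_proxies = urllib.request.getproxies_environment()
```

Exporting `no_proxy` in the shell that launches the program should have the same effect, and it also covers child processes that inherit the environment.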
msg304615 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-10-19 12:36
Hi, can you please explain how to reproduce your issue?

According to the crash report, it seems like you are running Ansible on macOS and that the Python function _scproxy.get_proxies() was called.

get_proxies() calls CFPreferencesCopyAppValue(), which indirectly calls performForkChildInitialize(). It seems like Ansible forked the process or something like that. Finally, performForkChildInitialize() calls _objc_fatal(), which kills the process with abort().

The parent process is also Python ("Parent Process: python2.7 [4305]") which confirms that the application used fork().

See also:

* bpo-9405: Similar but old (2010) crash caused by SCDynamicStoreCopyProxies in a small Python application using multiprocessing and so using fork
* bpo-27126: "Apple-supplied libsqlite3 on OS X is not fork safe;  can cause crashes"
* "fork() without exec() is dangerous in large programs" article by Evan Jones (2016-August-16): http://www.evanjones.ca/fork-is-dangerous.html -- this article mentions bpo-27126

Ned Deily's advice from bpo-9405: "A quick workaround is to make a [get_proxies()] call from the main process."
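
One reading of that advice, as a minimal sketch: query the proxy configuration once in the parent, before forking, and let the child reuse the result. The names `PROXIES` and `run_in_forked_child` are hypothetical. Whether this is sufficient for a given program is an assumption, since other code in the child may still reach `_scproxy` (for example through proxy-bypass checks).

```python
import os
import urllib.request

# Query the proxy configuration once in the parent, before any fork().
# On macOS this is the call that may reach _scproxy; on other platforms it
# reads the environment (or the registry on Windows).
PROXIES = urllib.request.getproxies()


def run_in_forked_child():
    """Fork a child that reuses PROXIES instead of calling getproxies()."""
    pid = os.fork()
    if pid == 0:
        # Child: only touch the dict inherited from the parent.
        ok = isinstance(PROXIES, dict)
        os._exit(0 if ok else 1)
    # Parent: wait for the child and report its exit status.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)


exit_code = run_in_forked_child()
```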

IMHO the safest fix is to not run any Python code after fork(). Use subprocess, which calls fork() immediately followed by exec(). It's not safe to execute code after fork(): many functions are not "fork safe". But I know that many applications don't care, since there are also a lot of functions which are fork safe...
msg304616 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-10-19 12:41
Confirmation from Apple:

https://developer.apple.com/library/content/technotes/tn2083/_index.html#//apple_ref/doc/uid/DTS10003794-CH1-SUBSECTION52

"""
Many Mac OS X frameworks do not work reliably if you call fork but do not call exec. The only exception is the System framework and, even there, the POSIX standard places severe constraints on what you can do between a fork and an exec.
(...)
Listing 13  Core Foundation complaining about fork-without-exec

The process has forked and you cannot use this CoreFoundation \
functionality safely. You MUST exec().
Break on __THE_PROCESS_HAS_FORKED_AND_YOU_CANNOT_USE_THIS_\
COREFOUNDATION_FUNCTIONALITY___YOU_MUST_EXEC__() to debug.
"""
msg304617 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-10-19 12:41
@Mirko: Can you please try to get the Python traceback of the Ansible crash? You may want to try faulthandler: enable it and write its output into a file.
https://faulthandler.readthedocs.io/
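
A minimal sketch of enabling faulthandler with a file as its output. The log path is a hypothetical choice. On Python 2.7 the module is assumed to be installed from the backport linked above; on Python 3.3 and later it is part of the standard library.

```python
import faulthandler
import os
import tempfile

# Hypothetical log location; any writable path works.
CRASH_LOG_PATH = os.path.join(tempfile.gettempdir(), "faulthandler.log")

# Keep the file object alive for the lifetime of the process: faulthandler
# writes to its file descriptor when a fatal signal (such as SIGABRT) arrives.
CRASH_LOG = open(CRASH_LOG_PATH, "w")
faulthandler.enable(file=CRASH_LOG, all_threads=True)
```

Placing this near the top of the program's entry point means the handler is installed before any fork, so forked children inherit it.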
msg304621 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2017-10-19 17:16
Ronald is the expert on this but, from what I understand, I don't think there is any reason to spend time on trying to further analyze this.  This issue has been around since day one of _scproxy and affects all versions of Python on macOS.  There is nothing we can do to fix it, and, after all these years, it isn't likely that Apple is going to change the underlying framework.  What we could do is: (1) better document the restriction; (2) find another way to access the system's network proxy configuration (not likely), or (3) change how we use the System Configuration framework, i.e. either don't call it at all or don't call it by default (but that seems like overkill for an edge case for which there already is a fairly simple workaround).  Ronald, what do you think?
msg304635 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-10-19 18:44
Another workaround is to call get_proxies() in a fresh subprocess, and use
a pipe to retrieve the result.
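
A minimal sketch of this workaround, assuming JSON over the child's stdout is an acceptable way to pass the result back. The helper name `getproxies_in_subprocess` and the timeout value are hypothetical.

```python
import json
import subprocess
import sys

# Code run by the fresh interpreter: print the proxy mapping as JSON.
_CHILD_CODE = (
    "import json, urllib.request; "
    "print(json.dumps(urllib.request.getproxies()))"
)


def getproxies_in_subprocess(timeout=10.0):
    """Return urllib.request.getproxies() as computed by a new process."""
    # subprocess performs fork() immediately followed by exec(), so the
    # child starts from a clean state; its stdout is read through a pipe.
    out = subprocess.check_output(
        [sys.executable, "-c", _CHILD_CODE], timeout=timeout
    )
    return json.loads(out.decode("utf-8"))


proxies = getproxies_in_subprocess()
```

Each call pays the startup cost of a new interpreter, so caching the result is likely worthwhile when the proxy configuration is not expected to change.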
msg304648 - (view) Author: Ronald Oussoren (ronaldoussoren) * (Python committer) Date: 2017-10-20 07:52
Calling get_proxies() in a subprocess would also work, although I'd then prefer to use a small daemon process to avoid the startup cost of a new process for every call to _scproxy functions.

There is a conflict between two goals w.r.t. the macOS port:

1) Integrate nicely with the platform

2) Be like other unixy platforms

The former requires the use of Apple specific APIs, like those used in _scproxy, but those cause problems when using fork without calling exec.

The latter is technically an issue for all processes using threads on POSIX systems (see <http://pubs.opengroup.org/onlinepubs/009695399/functions/fork.html>). AFAIK users get bitten by this more on macOS because Apple appears to use threading in their implementation (making processes multi-threaded without explicitly using threading in user code), and because Apple explicitly checks for the "fork without exec" case and crashes the child process.

This can of course also be seen as a quality of implementation issue on macOS, as in "Apple can't be bothered to do the work to support this use case" ;-/

Anyways: As Ned writes, this is unlikely to change on Apple's side and we have to live with that.

There are three options going forward:

1) Remove _scproxy

I'm -1 on that because I'm in favour of having good platform integration: Python's URL fetching APIs should transparently use the system proxy configuration.

2) Document this problem and move on

3) Find a workaround (such as calling the APIs used by _scproxy in a clean subprocess).

The 3rd option is probably the most useful in the long run, but requires someone willing to do the work.  I'm in principle willing to do the work, but haven't had enough free time to work on CPython for quite a while now (which saddens me, but that's off topic).
msg304649 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-10-20 08:59
> 3) Find a workaround (such as calling the APIs used by _scproxy in a clean subprocess).

I dislike the idea of *always* spawning a child process. I prefer to leave it as it is, but add a recipe in the doc, or add a new helper function doing that.

Spawning a subprocess can have side effects as well, whereas the subprocess is only needed if you call the function after forking, which is not the most common pattern in Python.
History
Date User Action Args
2018-11-14 20:44:51 barry set versions: + Python 3.6, Python 3.7, Python 3.8
2018-11-14 20:44:42 barry set nosy: + barry
2017-10-20 08:59:33 vstinner set messages: + msg304649
2017-10-20 07:52:44 ronaldoussoren set messages: + msg304648
2017-10-19 18:44:49 vstinner set messages: + msg304635
2017-10-19 17:16:12 ned.deily set messages: + msg304621
2017-10-19 12:41:52 vstinner set messages: + msg304617
2017-10-19 12:41:12 vstinner set messages: + msg304616
2017-10-19 12:36:40 vstinner set nosy: + vstinner; messages: + msg304615; title: macOS HighSierra final - Python Crash because of _scproxy -> [macOS] _scproxy.get_proxies() crash -- get_proxies() is not fork-safe?
2017-10-19 10:30:35 Mirko Friedenhagen create