msg256815 - (view) |
Author: A. Jesse Jiryu Davis (emptysquare) * |
Date: 2015-12-21 23:19 |
On some platforms there's an exclusive lock in socketmodule, used for getaddrinfo, gethostbyname, gethostbyaddr. A thread can hold this lock while another forks, leaving it locked forever in the child process. Calls to these functions in the child process will hang.
(I wrote some more details here: https://emptysqua.re/blog/getaddrinfo-deadlock/ )
I propose that this is a bug, and that it can be fixed in PyOS_AfterFork, where a few similar locks are already reset.
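
The failure mode can be sketched in pure Python. This is only an illustration, not the code from the blog post: `demo_lock` is a hypothetical stand-in for the C-level lock in socketmodule, and `child_acquires_after_fork` is a name invented here. It assumes a POSIX platform where `os.fork()` is available.

```python
import os
import threading

# Hypothetical stand-in for the exclusive lock in Modules/socketmodule.c.
demo_lock = threading.Lock()


def child_acquires_after_fork(timeout=1.0):
    """Fork while a worker thread holds demo_lock.

    Returns True if the child managed to acquire the lock within `timeout`.
    """
    holding = threading.Event()
    done = threading.Event()

    def slow_lookup():
        # Plays the role of a thread blocked inside getaddrinfo().
        with demo_lock:
            holding.set()
            done.wait()

    worker = threading.Thread(target=slow_lookup)
    worker.start()
    holding.wait()

    pid = os.fork()
    if pid == 0:
        # Child: the worker thread was not copied, but the lock's
        # "locked" state was, so nobody is left to release it.
        acquired = demo_lock.acquire(timeout=timeout)
        os._exit(0 if acquired else 1)

    # Parent: let the worker finish, then collect the child's verdict.
    done.set()
    worker.join()
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) == 0
```

The timeout is there only so the sketch terminates; the real calls have no timeout, which is why they hang.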
|
msg256817 - (view) |
Author: Yury Selivanov (yselivanov) * |
Date: 2015-12-21 23:55 |
Maybe instead of releasing the lock in the forked child process, we should try to acquire the lock in the os.fork() implementation, and then release it?
Otherwise, suppose that a call to getaddrinfo (call #1) takes a long time. In the middle of it we fork, and then immediately call getaddrinfo again (call #2, while call #1 is still in progress) for some other address. At this point, since getaddrinfo isn't thread-safe, something bad will happen.
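
A rough Python-level sketch of this acquire-around-fork idea, assuming a POSIX platform. `resolver_lock` and `fork_during_lookup` are hypothetical names, and `os.register_at_fork()` is used here only to illustrate the ordering; the actual lock lives in C.

```python
import os
import threading
import time

resolver_lock = threading.Lock()  # hypothetical stand-in for the C lock

# Hold the lock across fork() so the child never inherits it mid-call.
os.register_at_fork(
    before=resolver_lock.acquire,
    after_in_parent=resolver_lock.release,
    after_in_child=resolver_lock.release,
)


def fork_during_lookup(hold_seconds=0.2, timeout=1.0):
    """Fork while a worker holds the lock; True if the child can acquire it."""
    holding = threading.Event()

    def slow_lookup():
        # Simulates a lookup that is in progress when fork() is requested.
        with resolver_lock:
            holding.set()
            time.sleep(hold_seconds)

    worker = threading.Thread(target=slow_lookup)
    worker.start()
    holding.wait()

    pid = os.fork()  # blocks in the `before` hook until the worker is done
    if pid == 0:
        acquired = resolver_lock.acquire(timeout=timeout)
        os._exit(0 if acquired else 1)

    worker.join()
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) == 0
```

The cost of this design is that fork() waits for any lookup already in progress, however long it takes.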
|
msg256920 - (view) |
Author: Ronald Oussoren (ronaldoussoren) * |
Date: 2015-12-23 15:47 |
#25924 is related to this; I filed it after reading the blog post. The lock might not be necessary on OS X, and possibly not on the other systems either.
Yury: resetting the lock in the child should be safe because after the fork the child only has a single thread that is returning from fork(2). The thread that acquired the lock does not exist in the child process.
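
The single-thread claim is easy to check from Python. A small sketch, assuming POSIX; `thread_count_in_child` is a name made up for this example:

```python
import os
import threading


def thread_count_in_child(extra_threads=3):
    """Fork with several live threads; return active_count() seen in the child."""
    stop = threading.Event()
    workers = [threading.Thread(target=stop.wait) for _ in range(extra_threads)]
    for w in workers:
        w.start()

    pid = os.fork()
    if pid == 0:
        # Only the thread that called fork() exists here; report the count
        # through the exit status.
        os._exit(threading.active_count())

    # Parent: stop the workers and read the child's exit status.
    stop.set()
    for w in workers:
        w.join()
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)
```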
|
msg368876 - (view) |
Author: Terry J. Reedy (terry.reedy) * |
Date: 2020-05-14 23:14 |
Does the example code (which should be posted here) still hang?
If so, automated tests that hang indefinitely on failure are a nuisance. A revised example that failed after, say, a second would be better.
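
One possible shape for such a bounded test, as a sketch only: run the suspect call in a forked child and kill the child if it has not finished by a deadline. `run_in_child` is a hypothetical helper, not an existing test utility, and it assumes a POSIX platform.

```python
import os
import signal
import time


def run_in_child(func, timeout=1.0):
    """Run func() in a forked child; return "ok", "error", or "timeout"."""
    pid = os.fork()
    if pid == 0:
        try:
            func()
        except BaseException:
            os._exit(1)
        os._exit(0)

    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        finished, status = os.waitpid(pid, os.WNOHANG)
        if finished:
            exited_cleanly = os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0
            return "ok" if exited_cleanly else "error"
        time.sleep(0.01)

    # Still running after the deadline: treat it as a hang and clean up.
    os.kill(pid, signal.SIGKILL)
    os.waitpid(pid, 0)
    return "timeout"
```

A regression test could then pass something like `lambda: socket.getaddrinfo("localhost", 80)` as `func`, fork while another thread is inside the same call, and assert that the result is not "timeout".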
|
msg368881 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2020-05-14 23:42 |
> Maybe instead of releasing the lock in the forked child process, we should try to acquire the lock in the os.fork() implementation, and then release it?
In bpo-40089, I added _PyThread_at_fork_reinit() for this purpose: reinitialize a lock to the unlocked state after a fork. Internally, it leaks memory on purpose and then creates a new lock, since there is no portable way to reset a lock after fork.
The problem is how to register netdb_lock of Modules/socketmodule.c into a list of locks which should be reinitialized at fork, or maybe how to register a C callback called at fork. There is a *Python* API to register a callback after a fork: os.register_at_fork().
See also the meta-issue bpo-6721: "Locks in the standard library should be sanitized on fork".
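
For comparison, here is how the same reset-in-child idea looks with the Python-level hook. This is a sketch under assumptions: `_resolver_lock` is a hypothetical module-level stand-in for netdb_lock, and replacing the lock object only loosely mirrors the leak-and-recreate behaviour described above. It does not show how the C code would register its lock.

```python
import os
import threading

_resolver_lock = threading.Lock()  # hypothetical stand-in for netdb_lock


def _reinit_resolver_lock():
    # Abandon the inherited (possibly locked) lock and start with a fresh one.
    global _resolver_lock
    _resolver_lock = threading.Lock()


os.register_at_fork(after_in_child=_reinit_resolver_lock)


def child_acquires_after_reinit(timeout=1.0):
    """Fork while a worker holds the lock; True if the child can acquire it."""
    holding = threading.Event()
    done = threading.Event()

    def slow_lookup():
        with _resolver_lock:
            holding.set()
            done.wait()

    worker = threading.Thread(target=slow_lookup)
    worker.start()
    holding.wait()

    pid = os.fork()
    if pid == 0:
        # The after_in_child hook has already swapped in an unlocked lock.
        acquired = _resolver_lock.acquire(timeout=timeout)
        os._exit(0 if acquired else 1)

    done.set()
    worker.join()
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) == 0
```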
|
msg368883 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2020-05-14 23:48 |
> (I wrote some more details here: https://emptysqua.re/blog/getaddrinfo-deadlock/ )
On macOS, Python is only affected if "MAC_OS_X_VERSION_MIN_REQUIRED < MAC_OS_X_VERSION_10_5". Is it still the case in 2020?
Copy/paste of socketmodule.c:
/* On systems on which getaddrinfo() is believed to not be thread-safe,
   (this includes the getaddrinfo emulation) protect access with a lock.
   getaddrinfo is thread-safe on Mac OS X 10.5 and later. Originally it was
   a mix of code including an unsafe implementation from an old BSD's
   libresolv. In 10.5 Apple reimplemented it as a safe IPC call to the
   mDNSResponder process. 10.5 is the first be UNIX '03 certified, which
   includes the requirement that getaddrinfo be thread-safe. See issue #25924.
   It's thread-safe in OpenBSD starting with 5.4, released Nov 2013:
   http://www.openbsd.org/plus54.html
   It's thread-safe in NetBSD starting with 4.0, released Dec 2007:
   http://cvsweb.netbsd.org/bsdweb.cgi/src/lib/libc/net/getaddrinfo.c.diff?r1=1.82&r2=1.83
 */
#if ((defined(__APPLE__) && \
        MAC_OS_X_VERSION_MIN_REQUIRED < MAC_OS_X_VERSION_10_5) || \
    (defined(__FreeBSD__) && __FreeBSD_version+0 < 503000) || \
    (defined(__OpenBSD__) && OpenBSD+0 < 201311) || \
    (defined(__NetBSD__) && __NetBSD_Version__+0 < 400000000) || \
    !defined(HAVE_GETADDRINFO))
#define USE_GETADDRINFO_LOCK
#endif
|
msg369031 - (view) |
Author: Ronald Oussoren (ronaldoussoren) * |
Date: 2020-05-16 09:56 |
The macOS test checks if the binary targets macOS 10.4 or earlier. Those versions of macOS have been out of support for a very long time, and we haven't had installers targeting those versions of macOS for a long time as well. 2.7 and 3.5 had installers targeting macOS 10.5, current installers target macOS 10.9.
IMHO macOS 10.4 has moved into museum territory and I wouldn't bother supporting it anymore.
USE_GETADDRINFO_LOCK is only enabled for very old OS releases. The most recent OS to stop requiring it was OpenBSD, in 2013 (7 years ago); the other OSes stopped requiring it in 2007 (13 years ago).
I'd drop this code instead of fixing it.
|
msg369037 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2020-05-16 10:15 |
> I'd drop this code instead of fixing it.
Hum, FreeBSD, OpenBSD and NetBSD versions which require the fix also look very old. So I agree that it became safe to remove the fix.
Would it make sense to only fix it in Python 3.10 and leave other versions with the bug? Or should we fix all Python versions?
|
msg369116 - (view) |
Author: Ronald Oussoren (ronaldoussoren) * |
Date: 2020-05-17 12:18 |
Technically this would be a functional change, so I'd drop this code in 3.9 and trunk (although it is awfully close to the expected date for 3.9b1).
Older versions would keep this code and the bug. That way the older Python versions can still be used on these ancient OS versions (but users might run into this race condition).
|
msg369220 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2020-05-18 13:46 |
I wrote PR 20177 to avoid the netdb_lock in socket.getaddrinfo(), but the lock is still used on platforms which don't provide gethostbyname_r():
#if !defined(HAVE_GETHOSTBYNAME_R) && !defined(MS_WINDOWS)
# define USE_GETHOSTBYNAME_LOCK
#endif
|
msg370227 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2020-05-28 15:23 |
New changeset 0de437de6210c2b32b09d6c47a805b23d023bd59 by Victor Stinner in branch 'master':
bpo-25920: Remove socket.getaddrinfo() lock on macOS (GH-20177)
https://github.com/python/cpython/commit/0de437de6210c2b32b09d6c47a805b23d023bd59
|
msg370229 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2020-05-28 15:37 |
If I understood correctly, the Python 3.8 and 3.9 binaries provided by python.org are *not* impacted by this issue.
Only Python binaries built manually with explicit support for macOS 10.4 ("MAC_OS_X_VERSION_MIN_REQUIRED") were impacted.
Python 3.9 and older are not fixed (they keep the lock). The workaround is to require macOS 10.5 or newer. macOS 10.4 was released in 2004, so it's maybe time to stop supporting it :-)
Python 3.7 (and newer) requires macOS 10.6 or newer (again, I'm talking about binaries provided by python.org).
> bpo-25920: Remove socket.getaddrinfo() lock on macOS (GH-20177)
I chose to leave the lock for gethostbyname(). Ronald wrote that this lock is no longer needed:
"As an aside (not to be addressed in the PR): Apparently gethostbyname() and related functions are thread-safe on macOS. This is according to the manpage on macOS 10.15. I haven't checked in which version that changed. This allows avoiding the use of the gethostbyname lock as well."
https://github.com/python/cpython/pull/20177#pullrequestreview-418909595
Please open a separate issue for this lock.
|
|
Date | User | Action | Args
2022-04-11 14:58:25 | admin | set | github: 70108
2020-05-28 15:37:32 | vstinner | set | status: open -> closed; versions: + Python 3.10, - Python 3.9; messages: + msg370229; resolution: fixed; stage: patch review -> resolved
2020-05-28 15:23:47 | vstinner | set | messages: + msg370227
2020-05-18 13:46:37 | vstinner | set | messages: + msg369220
2020-05-18 13:40:27 | vstinner | set | keywords: + patch; stage: needs patch -> patch review; pull_requests: + pull_request19476
2020-05-17 12:18:46 | ronaldoussoren | set | messages: + msg369116
2020-05-16 10:15:41 | vstinner | set | messages: + msg369037
2020-05-16 09:56:51 | ronaldoussoren | set | messages: + msg369031
2020-05-15 14:20:18 | hugh | set | nosy: + hugh
2020-05-14 23:48:02 | vstinner | set | messages: + msg368883
2020-05-14 23:42:29 | vstinner | set | messages: + msg368881
2020-05-14 23:14:40 | terry.reedy | set | nosy: + terry.reedy; messages: + msg368876; versions: + Python 3.9, - Python 3.4, Python 3.5, Python 3.6
2015-12-23 15:47:29 | ronaldoussoren | set | nosy: + ronaldoussoren; messages: + msg256920
2015-12-21 23:57:17 | yselivanov | set | nosy: + serhiy.storchaka
2015-12-21 23:55:27 | yselivanov | set | messages: + msg256817
2015-12-21 23:36:24 | yselivanov | set | versions: + Python 3.4, Python 3.5, Python 3.6; nosy: + vstinner, yselivanov; components: + Interpreter Core; type: behavior; stage: needs patch
2015-12-21 23:29:43 | ionelmc | set | nosy: + ionelmc
2015-12-21 23:19:32 | emptysquare | create |