This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Acquiring locks not interrupted by signals on musl libc
Type: behavior Stage: resolved
Components: Extension Modules Versions: Python 3.6
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: Joseph Sible, benjamin.peterson, miss-islington
Priority: normal Keywords: patch

Created on 2018-06-29 21:43 by Joseph Sible, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 9224 merged benjamin.peterson, 2018-09-12 19:48
PR 9226 merged miss-islington, 2018-09-12 20:48
PR 9227 merged miss-islington, 2018-09-12 20:48
Messages (8)
msg320742 - (view) Author: Joseph Sible (Joseph Sible) Date: 2018-06-29 21:43
When Python is built on Alpine Linux or in any other configuration that uses musl libc, calls to Lock.acquire() can't be interrupted by signals. This bug is caught by test_lock_acquire_interruption and test_rlock_acquire_interruption in Lib/test/test_threadsignals.py, both of which fail when Python is built on musl.

POSIX explicitly says that sem_timedwait ever failing with EINTR is optional:
http://pubs.opengroup.org/onlinepubs/9699919799/functions/sem_timedwait.html#tag_16_508_05

And musl deliberately chooses not to support it:
http://www.openwall.com/lists/musl/2018/02/24/3
http://git.musl-libc.org/cgit/musl/commit/?id=c0ed5a201b2bdb6d1896064bec0020c9973db0a1

However, we incorrectly rely on it for correct operation. Our documentation https://docs.python.org/3.6/library/threading.html#threading.Lock.acquire just says that "Lock acquires can now be interrupted by signals on POSIX", not that any optional POSIX features are required for it.

A similar bug was #11223, which was the same problem but with pthread_cond_timedwait, which is used when semaphores aren't available.
msg325178 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2018-09-12 20:48
New changeset 5b10d5111d7a855297654af9045f8907b7d3dd08 by Benjamin Peterson in branch 'master':
closes bpo-34004: Skip lock interruption tests on musl. (GH-9224)
https://github.com/python/cpython/commit/5b10d5111d7a855297654af9045f8907b7d3dd08
msg325183 - (view) Author: miss-islington (miss-islington) Date: 2018-09-12 21:10
New changeset b608fcd444c00ff37a19d34e4eeadb1221fb6436 by Miss Islington (bot) in branch '3.7':
closes bpo-34004: Skip lock interruption tests on musl. (GH-9224)
https://github.com/python/cpython/commit/b608fcd444c00ff37a19d34e4eeadb1221fb6436
msg325184 - (view) Author: miss-islington (miss-islington) Date: 2018-09-12 21:11
New changeset 5a435eac1b83f080c9dfceff0de0d639541e4bcb by Miss Islington (bot) in branch '3.6':
closes bpo-34004: Skip lock interruption tests on musl. (GH-9224)
https://github.com/python/cpython/commit/5a435eac1b83f080c9dfceff0de0d639541e4bcb
msg325200 - (view) Author: Joseph Sible (Joseph Sible) Date: 2018-09-12 22:29
How is this considered "fixed"? Why couldn't this be actually fixed by using eventfd instead of semaphores when they're available, for example?
msg325202 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2018-09-12 22:36
Reimplementing locks with eventfd can be another issue. Such a change can only land in a new Python version, though. We'll just have to consider musl unsupported for interrupting locks in our current maintenance releases as I have done.

How likely is it that musl will change to allow returning EINTR from sem_timedwait given that the only stated reason not to is very old kernel versions?
msg325208 - (view) Author: Joseph Sible (Joseph Sible) Date: 2018-09-12 23:03
Re musl changing their behavior, see https://www.openwall.com/lists/musl/2018/09/07/1 and the resulting thread.

In addition to the old kernel version issue, two other issues were raised:
1. EINTR makes programming mistakes more likely, as people won't think to handle it. I don't give much weight to this point.
2. Most of the time, counting on receiving an EINTR results in race conditions. Our code seems to be affected by this too. Even on glibc, a signal at just the "right" time could result in it not being interrupted. This is why I think moving to an eventfd or something would be better, since we could then use pselect/ppoll/etc. to avoid the signal blocking race.
msg325211 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2018-09-12 23:33
On Wed, Sep 12, 2018, at 16:03, Joseph Sible wrote:
> 2. Most of the time, counting on receiving an EINTR results in race 
> conditions. Our code seems to be affected by this too. Even on glibc, a 
> signal at just the "right" time could result in it not being 
> interrupted. This is why I think moving to an eventfd or something would 
> be better, since we could then use pselect/ppoll/etc. to avoid the 
> signal blocking race.

The race is a good point.

However, switching to eventfd is not trivial. It would be the third locking implementation in python just for pthreads. Using eventfd would require reimplementing all the userspace parts of locking (atomics, appropriate memory barriers, and spinning) ourselves. futex is carefully optimized for userspace locking, whereas I've never heard of normal program locking being done with eventfd before.

If the expected usecase for interrupting locks is user pressing C-c, being 99.99999% correct is probably enough.

So, it's fine with me if you open another issue for reimplementing locks with eventfd. But, I don't personally have the desire to do it.
History
Date User Action Args
2022-04-11 14:59:02adminsetgithub: 78185
2018-09-12 23:33:39benjamin.petersonsetmessages: + msg325211
2018-09-12 23:03:35Joseph Siblesetmessages: + msg325208
2018-09-12 22:36:44benjamin.petersonsetmessages: + msg325202
2018-09-12 22:29:46Joseph Siblesetmessages: + msg325200
2018-09-12 21:11:48miss-islingtonsetmessages: + msg325184
2018-09-12 21:10:59miss-islingtonsetnosy: + miss-islington
messages: + msg325183
2018-09-12 20:48:22miss-islingtonsetpull_requests: + pull_request8658
2018-09-12 20:48:15miss-islingtonsetpull_requests: + pull_request8657
2018-09-12 20:48:07benjamin.petersonsetstatus: open -> closed

nosy: + benjamin.peterson
messages: + msg325178

resolution: fixed
stage: patch review -> resolved
2018-09-12 19:48:37benjamin.petersonsetkeywords: + patch
stage: patch review
pull_requests: + pull_request8655
2018-06-29 21:43:37Joseph Siblecreate