classification
Title: test_multiprocessing_fork: test_semaphore_tracker_sigint() fails with -W error
Type: Stage: resolved
Components: Tests Versions: Python 3.8
process
Status: closed Resolution:
Dependencies: Superseder:
Assigned To: pablogsal Nosy List: davin, pablogsal, pitrou, serhiy.storchaka, taleinat, vstinner
Priority: normal Keywords: patch

Created on 2018-05-23 10:04 by vstinner, last changed 2018-10-10 08:18 by pablogsal. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 7850 merged pablogsal, 2018-06-21 21:40
PR 8500 closed vstinner, 2018-07-27 11:07
PR 9055 closed miss-islington, 2018-09-04 08:56
PR 9778 merged pablogsal, 2018-10-09 20:59
Messages (31)
msg317372 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-05-23 10:04
test_semaphore_tracker_sigint() emits a warning. If the test is run with -W error, the test fails.

vstinner@apu$ ./python -m test test_multiprocessing_fork -v -m test_semaphore_tracker_sigint
(...)
test_semaphore_tracker_sigint (test.test_multiprocessing_fork.TestSemaphoreTracker) ... /home/vstinner/prog/python/master/Lib/multiprocessing/semaphore_tracker.py:55: UserWarning: semaphore_tracker: process died unexpectedly, relaunching.  Some semaphores might leak.
  warnings.warn('semaphore_tracker: process died unexpectedly, '
ok
(...)
Tests result: SUCCESS

vstinner@apu$ ./python  -Werror  -m test test_multiprocessing_fork -v -m test_semaphore_tracker_sigint
(...)
======================================================================
ERROR: test_semaphore_tracker_sigint (test.test_multiprocessing_fork.TestSemaphoreTracker)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/vstinner/prog/python/master/Lib/test/_test_multiprocessing.py", line 4533, in test_semaphore_tracker_sigint
    self.check_semaphore_tracker_death(signal.SIGINT, False)
  File "/home/vstinner/prog/python/master/Lib/test/_test_multiprocessing.py", line 4521, in check_semaphore_tracker_death
    sem = ctx.Semaphore()
  File "/home/vstinner/prog/python/master/Lib/multiprocessing/context.py", line 82, in Semaphore
    return Semaphore(value, ctx=self.get_context())
  File "/home/vstinner/prog/python/master/Lib/multiprocessing/synchronize.py", line 127, in __init__
    SemLock.__init__(self, SEMAPHORE, value, SEM_VALUE_MAX, ctx=ctx)
  File "/home/vstinner/prog/python/master/Lib/multiprocessing/synchronize.py", line 81, in __init__
    register(self._semlock.name)
  File "/home/vstinner/prog/python/master/Lib/multiprocessing/semaphore_tracker.py", line 83, in register
    self._send('REGISTER', name)
  File "/home/vstinner/prog/python/master/Lib/multiprocessing/semaphore_tracker.py", line 90, in _send
    self.ensure_running()
  File "/home/vstinner/prog/python/master/Lib/multiprocessing/semaphore_tracker.py", line 55, in ensure_running
    warnings.warn('semaphore_tracker: process died unexpectedly, '
UserWarning: semaphore_tracker: process died unexpectedly, relaunching.  Some semaphores might leak.
(...)
Tests result: FAILURE
msg317374 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-05-23 10:08
Note: I found this bug while working on bpo-33612.
msg317378 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-05-23 10:19
(Oops, I attached a file and added two comments to this issue, whereas I wanted to comment bpo-33612.)
msg317380 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-05-23 10:32
See also some information about this in issue31463.
msg320062 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2018-06-20 10:40
I would like to work on this issue and on issue31463
msg320063 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-06-20 10:42
> I would like to work on this issue and on issue31463

Say it on issue31463 as well on that case ;-)
msg320113 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2018-06-20 21:00
There are two problems in test_semaphore_tracker_sigint. One of the problems is that if `should_die` is False, the code does not check
that no warning is raised (and the warning is being raised, so this
the error has pass silently.) The other problem is a race condition
between the parent calls `_semaphore_tracker.ensure_running()` (which in turn spawns the semaphore_tracker using `_posixsubprocess.fork_exec`) and the child registering the signals. So basically, the parent sends the signal before the child protects against the signal. Modifying:

        _semaphore_tracker.ensure_running()
        pid = _semaphore_tracker._pid
        os.kill(pid, signum)
        time.sleep(1.0)  # give it time to die

to

        _semaphore_tracker.ensure_running()
        pid = _semaphore_tracker._pid
        time.sleep(1.0) # Give time for the child to register the signal handlers
        os.kill(pid, signum)
        time.sleep(1.0)  # give it time to die

fix the issue. I have tested this on one of the most grumpy and slow buildbots (gcc110) and it works.

I cannot think of a different way of waiting for the child to register the signals without modifying the child code for testing so I am not sure that we can do anything rather than the sleep.
msg320364 - (view) Author: Tal Einat (taleinat) * (Python committer) Date: 2018-06-24 07:49
You can know that the semaphore tracker is running by sending it an invalid command and seeing the error written to stderr.  The semaphore tracker either inherits sys.stderr from the parent process (if possible) or uses its own.  This is a bit of a hack, but might be worth it to avoid another wasted second on every run of the test suite.
msg320370 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2018-06-24 13:55
@taleinat Sadly, if the semaphore_tracker receives an invalid command, it raises and dies, making the test impossible:

                     cmd, name = line.strip().split(b':')
                     if cmd == b'REGISTER':
                         cache.add(name)
                     elif cmd == b'UNREGISTER':
                         cache.remove(name)
                     else:
                         raise RuntimeError('unrecognized command %r' % cmd)
msg320371 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2018-06-24 13:57
We can easily add a 'NOOP' command to the semaphore tracker if it helps solve that issue.
msg320372 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2018-06-24 14:01
Sorry, I was wrong, it does not die. I am investigating this solution using stderr.
msg320373 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2018-06-24 14:02
Additionally, as I commented in the PR, apart from the race condition I think the test should be modified so it fails if the warning is raised and not expected:

>The problem is that the test was silently failing when SIGINT was >being delivered as the test does not check that no warnings are >raised. Notice that the presence of a warning means that the process >died and therefore SIGINT did kill the process, which is precisely >what the test checks that does not happen.

>I think in order to check that the test works as intended without the >need to run the suite with -Wall, we need to modify the test.
msg320374 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2018-06-24 14:06
@pitrou I think that should be ideal. The reason is that if we send an invalid command the cache is not cleared and the semaphore tracker will warn when the process finishes which is precisely what the test is avoiding.
msg320376 - (view) Author: Tal Einat (taleinat) * (Python committer) Date: 2018-06-24 17:26
For the "no-op" we'll need a way to know that it was processed.  To that end, I'd make the semaphore tracker recognize a new command e.g. "PING", upon which it would write something e.g. "PONG" into its end of the pipe.
msg320379 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2018-06-24 18:28
I have updated the PR implementing a PONG command in the semaphore and updating the test. I have tested on the failing buildbots (gcc110.fsffrance.org) and it works. Notice that a new pipe for the child to write was needed as the child is continuously reading from the original one and therefore we cannot use the same one.
msg320381 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2018-06-24 18:49
Is it possible to notify the ping / noop reply on stderr instead of introducing a new pipe?
msg320382 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2018-06-24 19:39
The process seems to use its own stderr and it cannot be captured from the parent without modifying it AFAIK.
msg320384 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2018-06-24 20:31
I have modified the PR to avoid creating an extra pipe in the tracker (the pipe is created as part of the test). To allow testing with the -R option:

./python -m test test_multiprocessing_fork -v -m test_semaphore_tracker_sigint -R 3:3

I had to assure that we restart the tracker in the test so it does not reuse a closed file descriptor.
msg320614 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-06-27 20:31
See also bpo-31687.
msg324564 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2018-09-04 08:53
New changeset ec74d187f50a8a48f94eb37023300583fbd644cc by Antoine Pitrou (Pablo Galindo) in branch 'master':
bpo-33613, test_semaphore_tracker_sigint: fix race condition (#7850)
https://github.com/python/cpython/commit/ec74d187f50a8a48f94eb37023300583fbd644cc
msg324565 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-09-04 08:56
IMHO the fix should be backported to Python 3.6 and 3.7.

It cannot be backported to 2.7 since 2.7 doesn't have signal.pthread_sigmask().
msg324567 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2018-09-04 09:01
As I wrote on Github: no user-visible bug is fixed here, and we shouldn't risk introducing regressions by backporting those changes.

If buildbots hurt, I suggest skipping the tests on the buildbots.
msg324568 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-09-04 09:12
> If buildbots hurt, I suggest skipping the tests on the buildbots.

Let's do that. Pablo: do you want to write a PR to always skip TestSemaphoreTracker with a reference to this issue? Example: skipIf(True, "bpo-33613: the test has a race condition").
msg324569 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2018-09-04 09:13
Please only skip those tests on buildbots. They work fine otherwise.
msg324570 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-09-04 09:16
> Please only skip those tests on buildbots.

There is no easy wait to only skip a test on buildbots.

> They work fine otherwise.

The race condition impacts everyone. It's just less likely if your computer is fast enough. The bug is much more likely if you "force" the bad path:

diff --git a/Lib/multiprocessing/semaphore_tracker.py b/Lib/multiprocessing/semaphore_tracker.py
index 3b50a46ddc..7261b43725 100644
--- a/Lib/multiprocessing/semaphore_tracker.py
+++ b/Lib/multiprocessing/semaphore_tracker.py
@@ -107,6 +107,8 @@ getfd = _semaphore_tracker.getfd
 def main(fd):
     '''Run semaphore tracker.'''
     # protect the process from ^C and "killall python" etc
+    import time
+    time.sleep(0.5)
     signal.signal(signal.SIGINT, signal.SIG_IGN)
     signal.signal(signal.SIGTERM, signal.SIG_IGN)
 
diff --git a/Lib/test/_test_multiprocessing.py b/Lib/test/_test_multiprocessing.py
index 5c625dd495..6f9f7583e2 100644
--- a/Lib/test/_test_multiprocessing.py
+++ b/Lib/test/_test_multiprocessing.py
@@ -4193,7 +4193,6 @@ class TestSemaphoreTracker(unittest.TestCase):
         _multiprocessing.sem_unlink(name1)
         p.terminate()
         p.wait()
-        time.sleep(2.0)
         with self.assertRaises(OSError) as ctx:
             _multiprocessing.sem_unlink(name2)
         # docs say it should be ENOENT, but OSX seems to give EINVAL
msg324571 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2018-09-04 09:17
Le 04/09/2018 à 11:16, STINNER Victor a écrit :
> 
>> They work fine otherwise.
> 
> The race condition impacts everyone. It's just less likely if your computer is fast enough. The bug is much more likely if you "force" the bad path:

Which bug? It's only a failing test.
msg324572 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-09-04 09:17
By the way, "no user-visible bug is fixed here": I agree that it's not easy to trigger manually the bug (when pressing CTRL+C), but I don't see why an user couldn't hit this bug. The race condition is now obvious to me.
msg324573 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2018-09-04 09:18
I don't want to spend hours arguing.  This issue is sufficiently rare and unlikely in normal conditions that I don't think we should risk regressions by trying to fix it.
msg324574 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-09-04 09:18
> Which bug? It's only a failing test.

Linux vendors run the full test suite. If the test suite fails, the build of the package fails as well. It's annoying.
msg327244 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-10-06 13:22
After this change tests are failed when ran with -Werror.

$ ./python -Werror -m test -vuall -m 'test_semaphore_tracker_sig*' test_multiprocessing_fork test_multiprocessing_forkserver test_multiprocessing_spawn
...
======================================================================
ERROR: test_semaphore_tracker_sigkill (test.test_multiprocessing_fork.TestSemaphoreTracker)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/serhiy/py/cpython/Lib/test/_test_multiprocessing.py", line 4562, in test_semaphore_tracker_sigkill
    self.check_semaphore_tracker_death(signal.SIGKILL, True)
  File "/home/serhiy/py/cpython/Lib/test/_test_multiprocessing.py", line 4526, in check_semaphore_tracker_death
    _semaphore_tracker.ensure_running()
  File "/home/serhiy/py/cpython/Lib/multiprocessing/semaphore_tracker.py", line 64, in ensure_running
    warnings.warn('semaphore_tracker: process died unexpectedly, '
UserWarning: semaphore_tracker: process died unexpectedly, relaunching.  Some semaphores might leak.

----------------------------------------------------------------------
...
======================================================================
ERROR: test_semaphore_tracker_sigint (test.test_multiprocessing_forkserver.TestSemaphoreTracker)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/serhiy/py/cpython/Lib/test/_test_multiprocessing.py", line 4554, in test_semaphore_tracker_sigint
    self.check_semaphore_tracker_death(signal.SIGINT, False)
  File "/home/serhiy/py/cpython/Lib/test/_test_multiprocessing.py", line 4526, in check_semaphore_tracker_death
    _semaphore_tracker.ensure_running()
  File "/home/serhiy/py/cpython/Lib/multiprocessing/semaphore_tracker.py", line 64, in ensure_running
    warnings.warn('semaphore_tracker: process died unexpectedly, '
UserWarning: semaphore_tracker: process died unexpectedly, relaunching.  Some semaphores might leak.

======================================================================
ERROR: test_semaphore_tracker_sigkill (test.test_multiprocessing_forkserver.TestSemaphoreTracker)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/serhiy/py/cpython/Lib/test/_test_multiprocessing.py", line 4562, in test_semaphore_tracker_sigkill
    self.check_semaphore_tracker_death(signal.SIGKILL, True)
  File "/home/serhiy/py/cpython/Lib/test/_test_multiprocessing.py", line 4534, in check_semaphore_tracker_death
    sem = ctx.Semaphore()
  File "/home/serhiy/py/cpython/Lib/multiprocessing/context.py", line 82, in Semaphore
    return Semaphore(value, ctx=self.get_context())
  File "/home/serhiy/py/cpython/Lib/multiprocessing/synchronize.py", line 126, in __init__
    SemLock.__init__(self, SEMAPHORE, value, SEM_VALUE_MAX, ctx=ctx)
  File "/home/serhiy/py/cpython/Lib/multiprocessing/synchronize.py", line 80, in __init__
    register(self._semlock.name)
  File "/home/serhiy/py/cpython/Lib/multiprocessing/semaphore_tracker.py", line 104, in register
    self._send('REGISTER', name)
  File "/home/serhiy/py/cpython/Lib/multiprocessing/semaphore_tracker.py", line 111, in _send
    self.ensure_running()
  File "/home/serhiy/py/cpython/Lib/multiprocessing/semaphore_tracker.py", line 64, in ensure_running
    warnings.warn('semaphore_tracker: process died unexpectedly, '
UserWarning: semaphore_tracker: process died unexpectedly, relaunching.  Some semaphores might leak.

----------------------------------------------------------------------
...
======================================================================
ERROR: test_semaphore_tracker_sigint (test.test_multiprocessing_spawn.TestSemaphoreTracker)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/serhiy/py/cpython/Lib/test/_test_multiprocessing.py", line 4554, in test_semaphore_tracker_sigint
    self.check_semaphore_tracker_death(signal.SIGINT, False)
  File "/home/serhiy/py/cpython/Lib/test/_test_multiprocessing.py", line 4526, in check_semaphore_tracker_death
    _semaphore_tracker.ensure_running()
  File "/home/serhiy/py/cpython/Lib/multiprocessing/semaphore_tracker.py", line 64, in ensure_running
    warnings.warn('semaphore_tracker: process died unexpectedly, '
UserWarning: semaphore_tracker: process died unexpectedly, relaunching.  Some semaphores might leak.

======================================================================
ERROR: test_semaphore_tracker_sigkill (test.test_multiprocessing_spawn.TestSemaphoreTracker)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/serhiy/py/cpython/Lib/test/_test_multiprocessing.py", line 4562, in test_semaphore_tracker_sigkill
    self.check_semaphore_tracker_death(signal.SIGKILL, True)
  File "/home/serhiy/py/cpython/Lib/test/_test_multiprocessing.py", line 4534, in check_semaphore_tracker_death
    sem = ctx.Semaphore()
  File "/home/serhiy/py/cpython/Lib/multiprocessing/context.py", line 82, in Semaphore
    return Semaphore(value, ctx=self.get_context())
  File "/home/serhiy/py/cpython/Lib/multiprocessing/synchronize.py", line 126, in __init__
    SemLock.__init__(self, SEMAPHORE, value, SEM_VALUE_MAX, ctx=ctx)
  File "/home/serhiy/py/cpython/Lib/multiprocessing/synchronize.py", line 80, in __init__
    register(self._semlock.name)
  File "/home/serhiy/py/cpython/Lib/multiprocessing/semaphore_tracker.py", line 104, in register
    self._send('REGISTER', name)
  File "/home/serhiy/py/cpython/Lib/multiprocessing/semaphore_tracker.py", line 111, in _send
    self.ensure_running()
  File "/home/serhiy/py/cpython/Lib/multiprocessing/semaphore_tracker.py", line 64, in ensure_running
    warnings.warn('semaphore_tracker: process died unexpectedly, '
UserWarning: semaphore_tracker: process died unexpectedly, relaunching.  Some semaphores might leak.

----------------------------------------------------------------------

And I'm puzzled by the line

    issubclass(the_warn.category, UserWarning)
msg327459 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2018-10-10 07:40
New changeset 3058b7d85697f95573fa042d6b9e4d6e2a9e739c by Pablo Galindo in branch 'master':
bpo-33613: Fix test_semaphore_tracker signal tests when using -Werror (GH-9778)
https://github.com/python/cpython/commit/3058b7d85697f95573fa042d6b9e4d6e2a9e739c
History
Date User Action Args
2018-10-10 08:18:02pablogsalsetstatus: open -> closed
stage: patch review -> resolved
2018-10-10 07:40:23pablogsalsetmessages: + msg327459
2018-10-09 20:59:32pablogsalsetstage: needs patch -> patch review
pull_requests: + pull_request9164
2018-10-06 13:22:29serhiy.storchakasetstatus: closed -> open
resolution: fixed ->
messages: + msg327244

stage: resolved -> needs patch
2018-09-19 23:25:25vstinnersetstatus: open -> closed
resolution: fixed
stage: patch review -> resolved
2018-09-04 09:18:49vstinnersetmessages: + msg324574
2018-09-04 09:18:41pitrousetmessages: + msg324573
2018-09-04 09:17:57vstinnersetmessages: + msg324572
2018-09-04 09:17:21pitrousetmessages: + msg324571
2018-09-04 09:16:36vstinnersetmessages: + msg324570
2018-09-04 09:13:22pitrousetmessages: + msg324569
2018-09-04 09:12:33vstinnersetmessages: + msg324568
2018-09-04 09:01:34pitrousetmessages: + msg324567
versions: - Python 3.6, Python 3.7
2018-09-04 08:56:58vstinnersetmessages: + msg324565
versions: + Python 3.6, Python 3.7
2018-09-04 08:56:19miss-islingtonsetpull_requests: + pull_request8515
2018-09-04 08:53:59pitrousetmessages: + msg324564
2018-07-27 11:07:48vstinnersetpull_requests: + pull_request8018
2018-06-27 20:31:48vstinnersetmessages: + msg320614
2018-06-24 20:31:50pablogsalsetmessages: + msg320384
2018-06-24 19:39:57pablogsalsetmessages: + msg320382
2018-06-24 18:49:48pitrousetmessages: + msg320381
2018-06-24 18:28:53pablogsalsetmessages: + msg320379
2018-06-24 17:26:41taleinatsetmessages: + msg320376
2018-06-24 14:06:31pablogsalsetmessages: + msg320374
2018-06-24 14:02:45pablogsalsetmessages: + msg320373
2018-06-24 14:01:11pablogsalsetmessages: + msg320372
2018-06-24 13:57:32pitrousetmessages: + msg320371
2018-06-24 13:55:43pablogsalsetmessages: + msg320370
2018-06-24 07:49:35taleinatsetnosy: + taleinat
messages: + msg320364
2018-06-21 21:40:15pablogsalsetkeywords: + patch
stage: patch review
pull_requests: + pull_request7461
2018-06-20 21:00:30pablogsalsetmessages: + msg320113
2018-06-20 10:42:34vstinnersetmessages: + msg320063
2018-06-20 10:40:13pablogsalsetmessages: - msg320061
2018-06-20 10:40:05pablogsalsetmessages: + msg320062
2018-06-20 10:39:42pablogsalsetmessages: + msg320061
2018-06-20 10:17:36pablogsalsetassignee: pablogsal

nosy: + pablogsal
2018-05-23 10:32:18serhiy.storchakasetnosy: + serhiy.storchaka
messages: + msg317380
2018-05-23 10:19:20vstinnersetmessages: + msg317378
2018-05-23 10:18:51vstinnersetmessages: - msg317375
2018-05-23 10:18:49vstinnersetmessages: - msg317376
2018-05-23 10:18:35vstinnersetfiles: - bug.py
2018-05-23 10:17:13vstinnersetmessages: + msg317376
2018-05-23 10:14:25vstinnersetfiles: + bug.py

messages: + msg317375
2018-05-23 10:08:05vstinnersetmessages: + msg317374
2018-05-23 10:04:09vstinnercreate