classification
Title: test_queue_feeder_donot_stop_onexc() of test_multiprocessing_spawn fails randomly on x86 Windows7 3.x
Type: Stage: resolved
Components: Tests, Windows Versions: Python 3.7, Python 3.6, Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: davin, grzgrzgrz3, paul.moore, steve.dower, tim.golden, vstinner, zach.ware
Priority: normal Keywords:

Created on 2017-06-08 07:40 by vstinner, last changed 2018-01-29 16:16 by vstinner. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 2026 closed vstinner, 2017-06-09 12:35
PR 2027 merged vstinner, 2017-06-09 13:17
PR 2148 merged vstinner, 2017-06-13 07:27
PR 2881 merged vstinner, 2017-07-26 02:48
PR 2883 merged vstinner, 2017-07-26 03:41
PR 5429 merged vstinner, 2018-01-29 15:34
Messages (15)
msg295385 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-06-08 07:40
http://buildbot.python.org/all/builders/x86%20Windows7%203.x/builds/712/steps/test/logs/stdio

======================================================================
ERROR: test_queue_feeder_donot_stop_onexc (test.test_multiprocessing_spawn.WithProcessesTestQueue)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "D:\cygwin\home\db3l\buildarea\3.x.bolen-windows7\build\lib\test\_test_multiprocessing.py", line 767, in test_queue_feeder_donot_stop_onexc
    self.assertTrue(q.get(timeout=0.1))
  File "D:\cygwin\home\db3l\buildarea\3.x.bolen-windows7\build\lib\multiprocessing\queues.py", line 105, in get
    raise Empty
queue.Empty
msg295520 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-06-09 12:28
Hum, it may be a regression a recent change like the commit bc50f03db4f58c869b78e98468e374d7e61f1227:
    bpo-30414: multiprocessing.Queue._feed do not break from main loop on exc (#1683)
msg295525 - (view) Author: Grzegorz Grzywacz (grzgrzgrz3) * Date: 2017-06-09 12:53
Looks like build bot is too slow for timeout=0.1. 

I am guessing `0.1` is too low because we have wrong condition in Queue.get.

It should be.

diff --git a/Lib/multiprocessing/queues.py b/Lib/multiprocessing/queues.py
index dda03dd..42e9884 100644
--- a/Lib/multiprocessing/queues.py
+++ b/Lib/multiprocessing/queues.py
@@ -101,7 +101,7 @@ class Queue(object):
             try:
                 if block:
                     timeout = deadline - time.time()
-                    if timeout < 0 or not self._poll(timeout):
+                    if self._poll(timeout):
                         raise Empty
                 elif not self._poll():
                     raise Empty

If we successfully acquired self._rlock, we should poll no matter how long acquire took.
msg295527 - (view) Author: Grzegorz Grzywacz (grzgrzgrz3) * Date: 2017-06-09 13:04
of course it should be `if not`:

diff --git a/Lib/multiprocessing/queues.py b/Lib/multiprocessing/queues.py
index dda03dd..514f991 100644
--- a/Lib/multiprocessing/queues.py
+++ b/Lib/multiprocessing/queues.py
@@ -101,7 +101,7 @@ class Queue(object):
             try:
                 if block:
                     timeout = deadline - time.time()
-                    if timeout < 0 or not self._poll(timeout):
+                    if not self._poll(timeout):
                         raise Empty
                 elif not self._poll():
                     raise Empty
msg295528 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-06-09 13:18
> I am guessing `0.1` is too low because we have wrong condition in Queue.get. It should be.

I'm not sure that the buildbot took this code path, but I like your suggestion. Checking the reader in non-blocking mode is cheap.

poll() with timeout < 0 works as poll(0), I checked the UNIX and Windows implementations.

So I proposed your change as a PR, and credited you ;-)
https://github.com/python/cpython/pull/2027
msg295631 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-06-10 11:10
New changeset 1b7863c3b6519c6e134c28cab8b8af0dea43e375 by Victor Stinner in branch 'master':
bpo-30595: Fix multiprocessing.Queue.get(timeout) (#2027)
https://github.com/python/cpython/commit/1b7863c3b6519c6e134c28cab8b8af0dea43e375
msg295632 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-06-10 11:15
Thanks Grzegorz Grzywacz! I commited your idea of fix. Let's if it fixes the issue. It will be hard to check since it seems like the bug was random.

Later, I will backport the fix to other branches.
msg295633 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-06-10 12:35
Let's *see* if it fixes the issue. It will be hard to check since it seems
like the bug was random.
msg295855 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-06-13 07:30
> New changeset 1b7863c3b6519c6e134c28cab8b8af0dea43e375 by Victor Stinner in branch 'master':
> bpo-30595: Fix multiprocessing.Queue.get(timeout) (#2027)
> https://github.com/python/cpython/commit/1b7863c3b6519c6e134c28cab8b8af0dea43e375

While this change is a nice fix, it isn't enough to make the test stable on slow buildbot: it failed again on the same buildbot,
http://buildbot.python.org/all/builders/x86%20Windows7%203.x/builds/744

If I modify the timeout from 100 ms to 1 ms, the test also fail on Linux on my laptop. So I proposed https://github.com/python/cpython/pull/2148 to increase the timeout from 100 ms to 1 sec.
msg295961 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-06-13 21:48
New changeset 8f6eeaf21cdf4aea25fdefeec814a1ce07453fe9 by Victor Stinner in branch 'master':
bpo-30595: Increase test_queue_feeder_donot_stop_onexc() timeout (#2148)
https://github.com/python/cpython/commit/8f6eeaf21cdf4aea25fdefeec814a1ce07453fe9
msg299201 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-07-26 03:38
New changeset e42339d3a08a8fde3349722def85d7a8e49899be by Victor Stinner in branch '3.6':
[3.6] bpo-30595: Fix multiprocessing.Queue.get(timeout) (#2027) (#2881)
https://github.com/python/cpython/commit/e42339d3a08a8fde3349722def85d7a8e49899be
msg299243 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-07-26 15:52
New changeset ec9a7127b8196dc5ef27c5ee2772ecea7cbb32df by Victor Stinner in branch '2.7':
[3.6] bpo-30595: Fix multiprocessing.Queue.get(timeout) (#2027) (#2881) (#2883)
https://github.com/python/cpython/commit/ec9a7127b8196dc5ef27c5ee2772ecea7cbb32df
msg299244 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-07-26 15:54
I fixed multiprocessing.Queue.get(timeout) in 2.7, 3.6 and master. I close the issue.
msg311145 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-01-29 15:54
New changeset b60f43a0e6052b29517931eea591b0b29903f382 by Victor Stinner in branch '2.7':
bpo-30595: Increase test_queue_feeder_donot_stop_onexc() timeout (GH-2148) (GH-5429)
https://github.com/python/cpython/commit/b60f43a0e6052b29517931eea591b0b29903f382
msg311146 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-01-29 16:16
The Python 2.7 is to fix this failure:
http://buildbot.python.org/all/#/builders/59/builds/61

test_lock (test.test_multiprocessing.TestForkAwareThreadLock) ... ok
test_ignore (test.test_multiprocessing.TestIgnoreEINTR) ... ok
test test_multiprocessing failed -- Traceback (most recent call last):
  File "/Users/db3l/buildarea/2.7.bolen-tiger/build/Lib/test/test_multiprocessing.py", line 674, in test_queue_feeder_donot_stop_onexc
    self.assertTrue(q.get(timeout=0.1))
  File "/Users/db3l/buildarea/2.7.bolen-tiger/build/Lib/multiprocessing/queues.py", line 132, in get
    raise Empty
Empty
test_ignore_listener (test.test_multiprocessing.TestIgnoreEINTR) ... ok
======================================================================
ERROR: test_queue_feeder_donot_stop_onexc (test.test_multiprocessing.WithProcessesTestQueue)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/db3l/buildarea/2.7.bolen-tiger/build/Lib/test/test_multiprocessing.py", line 674, in test_queue_feeder_donot_stop_onexc
    self.assertTrue(q.get(timeout=0.1))
  File "/Users/db3l/buildarea/2.7.bolen-tiger/build/Lib/multiprocessing/queues.py", line 132, in get
    raise Empty
Empty
History
Date User Action Args
2018-01-29 16:16:17vstinnersetmessages: + msg311146
2018-01-29 15:54:31vstinnersetmessages: + msg311145
2018-01-29 15:34:31vstinnersetpull_requests: + pull_request5264
2017-07-26 15:54:28vstinnersetstatus: open -> closed
versions: + Python 2.7, Python 3.6
messages: + msg299244

resolution: fixed
stage: resolved
2017-07-26 15:52:38vstinnersetmessages: + msg299243
2017-07-26 03:41:27vstinnersetpull_requests: + pull_request2935
2017-07-26 03:38:24vstinnersetmessages: + msg299201
2017-07-26 02:48:17vstinnersetpull_requests: + pull_request2933
2017-06-13 21:48:50vstinnersetmessages: + msg295961
2017-06-13 07:30:33vstinnersetmessages: + msg295855
2017-06-13 07:27:02vstinnersetpull_requests: + pull_request2198
2017-06-10 12:35:20vstinnersetmessages: + msg295633
2017-06-10 11:15:31vstinnersetmessages: + msg295632
2017-06-10 11:10:15vstinnersetmessages: + msg295631
2017-06-09 13:18:57vstinnersetmessages: + msg295528
2017-06-09 13:17:29vstinnersetpull_requests: + pull_request2093
2017-06-09 13:04:46grzgrzgrz3setmessages: + msg295527
2017-06-09 12:53:11grzgrzgrz3setnosy: + grzgrzgrz3
messages: + msg295525
2017-06-09 12:35:14vstinnersetpull_requests: + pull_request2092
2017-06-09 12:28:07vstinnersetmessages: + msg295520
2017-06-08 07:40:26vstinnercreate