msg309244 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2017-12-30 19:20 |
It seems test_asyncio fails sporadically on AppVeyor:
https://ci.appveyor.com/project/python/cpython/build/3.7.0a0.10081#L2650
======================================================================
ERROR: test_start_tls_server_1 (test.test_asyncio.test_sslproto.ProactorStartTLS)
----------------------------------------------------------------------
Traceback (most recent call last):
File "C:\projects\cpython\lib\test\test_asyncio\test_sslproto.py", line 284, in test_start_tls_server_1
asyncio.wait_for(main(), loop=self.loop, timeout=10))
File "C:\projects\cpython\lib\asyncio\base_events.py", line 440, in run_until_complete
return future.result()
File "C:\projects\cpython\lib\asyncio\tasks.py", line 398, in wait_for
raise futures.TimeoutError()
concurrent.futures._base.TimeoutError
======================================================================
FAIL: test_start_tls_server_1 (test.test_asyncio.test_sslproto.ProactorStartTLS)
----------------------------------------------------------------------
Traceback (most recent call last):
File "C:\projects\cpython\lib\test\test_asyncio\functional.py", line 43, in tearDown
self.fail('unexpected calls to loop.call_exception_handler()')
AssertionError: unexpected calls to loop.call_exception_handler()
|
msg309254 - (view) |
Author: Yury Selivanov (yselivanov) * |
Date: 2017-12-30 20:21 |
I'm leaving on a two-weeks vacation today. To avoid risking breaking the workflow, I'll mask this tests on AppVeyor. I'll investigate this when I get back.
|
msg309255 - (view) |
Author: Yury Selivanov (yselivanov) * |
Date: 2017-12-30 20:40 |
New changeset 0c36bed1c46d07ef91d3e02e69e974e4f3ecd31a by Yury Selivanov in branch 'master':
bpo-32458: Temporarily mask start-tls proactor test on Windows (#5054)
https://github.com/python/cpython/commit/0c36bed1c46d07ef91d3e02e69e974e4f3ecd31a
|
msg309256 - (view) |
Author: Yury Selivanov (yselivanov) * |
Date: 2017-12-30 20:42 |
Please keep this issue open.
|
msg309521 - (view) |
Author: Terry J. Reedy (terry.reedy) * |
Date: 2018-01-05 20:41 |
After fresh update and debug rebuild on master(3.7), with Win10 patched last night, I got
======================================================================
FAIL: test_call_later (test.test_asyncio.test_events.ProactorEventLoopTests)
----------------------------------------------------------------------
Traceback (most recent call last):
File "F:\dev\3x\lib\test\test_asyncio\test_events.py", line 290, in test_call_later
self.assertTrue(0.08 <= t1-t0 <= 0.8, t1-t0)
AssertionError: False is not true : 0.07800000000861473
I have not seen this before. Four subsequent runs passed.
|
msg310405 - (view) |
Author: Andrew Svetlov (asvetlov) * |
Date: 2018-01-22 09:08 |
Hmm, the scheduled call was executed sooner than expected.
Actually we had 0.1 sec delay, checked for at least 0.08 but actual call time was 0.078
BTW I recall an issue with Windows when call_later() was executed sooner already.
|
msg313507 - (view) |
Author: Ned Deily (ned.deily) * |
Date: 2018-03-09 21:38 |
Note, I think this failure is still happening occasionally on the windows7 buildbot; for example:
http://buildbot.python.org/all/#/builders/111/builds/68/steps/3/logs/stdio
http://buildbot.python.org/all/#/builders/111/builds/85/steps/3/logs/stdio
|
msg317468 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2018-05-23 22:36 |
test_start_tls_server_1() just failed on my Linux. It likely depends on the system load.
|
msg317638 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2018-05-24 23:09 |
It would be nice to fix this bug before Python 3.7.0 final: either skip the test, or fix it. Flaky tests can be very annoying. For example, the full test suite is run to build a Red Hat package. If a single test fails, the package build fails and should be retried whereas it's slow.
I understood that fixing the root cause might require to rewrite the test, so I don't expect a quick fix on this test. The issue is open since last December... Maybe we can skip the test but fix it in Python 3.7.1?
|
msg317639 - (view) |
Author: Yury Selivanov (yselivanov) * |
Date: 2018-05-24 23:10 |
I'm OK to skip it for now. Writing functional tests is super hard because some buildbots are super slow and unpredictable.
|
msg317640 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2018-05-24 23:12 |
> I'm OK to skip it for now. Writing functional tests is super hard because some buildbots are super slow and unpredictable.
Would you mind to write a PR to skip the PR? So you will feel guilty and will try to remind to fix it!
|
msg317643 - (view) |
Author: Yury Selivanov (yselivanov) * |
Date: 2018-05-24 23:23 |
Not at my computer right now, can do it tomorrow.
|
msg317751 - (view) |
Author: Ned Deily (ned.deily) * |
Date: 2018-05-26 18:16 |
Yury, are you still planning to address this?
|
msg317755 - (view) |
Author: Yury Selivanov (yselivanov) * |
Date: 2018-05-26 18:45 |
Last time I looked into this I couldn't reproduce the failures on my Windows 10 VM, so it seems like an AppVeyor-specific problem. I'll take another look on Monday.
Andrew, if you have a Windows environment, could you please try to run this test?
|
msg317758 - (view) |
Author: Ned Deily (ned.deily) * |
Date: 2018-05-26 18:49 |
> Last time I looked into this I couldn't reproduce the failures on my Windows 10 VM, so it seems like an AppVeyor-specific problem. I'll take another look on Monday.
But in msg317468 Victor asserts that it sometimes fails on Linux, too?
|
msg317759 - (view) |
Author: Yury Selivanov (yselivanov) * |
Date: 2018-05-26 18:54 |
> But in msg317468 Victor asserts that it sometimes fails on Linux, too?
Hm, I missed that. I'll definitely take a look.
|
msg317760 - (view) |
Author: Christian Heimes (christian.heimes) * |
Date: 2018-05-26 19:22 |
It's failing reproducible with OpenSSL 1.1.1 and TLS 1.3 enabled. I haven't seen it failing with TLS 1.2 yet.
|
msg317761 - (view) |
Author: Terry J. Reedy (terry.reedy) * |
Date: 2018-05-26 19:40 |
I have not seen test_asyncio fail on AppVeyor since about Sunday.
https://ci.appveyor.com/project/python/cpython/history
shows only a couple of failures, which could very well be real.
I have seen test_asyncio fail on Travis since Sunday.
With my local repository 32-bit debug build of master branch:
If I run python -m test.test_asyncio -v,
test_sock_sendfile_not_regular_file (test.test_asyncio.test_proactor_events.ProactorEventLoopUnixSockSendfileTests) ... (and the test process) aborts with a failed debug assertion:
minkernel\crts\ucrt\src\appcrt\lowio\osfinfo.cpp, line 257
fh >= 0 && (unsigned)fh < (unsigned)_nhandle
If I run python -m test test_asyncio, I get
F:\dev\3x\lib\asyncio\sslproto.py:320: ResourceWarning: unclosed transport <asyncio.sslproto._SSLProtocolTransport object at 0x052DA4D0>
source=self)
F:\dev\3x\lib\asyncio\sslproto.py:320: ResourceWarning: unclosed transport <asyncio.sslproto._SSLProtocolTransport object at 0x0476A118>
source=self)
Exception in callback _ProactorReadPipeTransport._loop_reading__data_received(<_OverlappedFuture cancelled>)
handle: <Handle _ProactorReadPipeTransport._loop_reading__data_received(<_OverlappedFuture cancelled>)>
Traceback (most recent call last):
File "F:\dev\3x\lib\asyncio\events.py", line 88, in _run
self._context.run(self._callback, *self._args)
File "F:\dev\3x\lib\asyncio\proactor_events.py", line 222, in _loop_reading__data_received
self._closing)
AssertionError
test_asyncio passed in 35 sec
1 test OK.
With -v, above is scattered in verbose output, but test_start_tls_server_1 passed both times.
Specifically, I ran the following 10 times with no failures.
f:\dev\3x>python -m test -v -m test_start_tls_server_1 test_asyncio
...
test_start_tls_server_1 (test.test_asyncio.test_sslproto.ProactorStartTLSTests) ... ok
test_start_tls_server_1 (test.test_asyncio.test_sslproto.SelectorStartTLSTests) ... ok
|
msg317762 - (view) |
Author: Terry J. Reedy (terry.reedy) * |
Date: 2018-05-26 19:52 |
I presume the above is using the month-old openssl-bin-1.1.0h.
|
msg317807 - (view) |
Author: Yury Selivanov (yselivanov) * |
Date: 2018-05-27 19:40 |
Terry, you somehow deleted Christian from the nosy list.
Christian,
> It's failing reproducible with OpenSSL 1.1.1 and TLS 1.3 enabled. I haven't seen it failing with TLS 1.2 yet.
On Linux or Windows?
|
msg317808 - (view) |
Author: Christian Heimes (christian.heimes) * |
Date: 2018-05-27 19:41 |
Linux
|
msg317809 - (view) |
Author: Terry J. Reedy (terry.reedy) * |
Date: 2018-05-27 20:14 |
'deleted somehow': Christian added himself while I had this page open, so when I submitted without refreshing, the nosy list did not include him. When I saw the red warning, I did not notice the nosy list change; I should have.
|
msg317817 - (view) |
Author: Yury Selivanov (yselivanov) * |
Date: 2018-05-27 23:47 |
Christian, I believe https://github.com/python/cpython/pull/7130/commits/bd17a5593f03e3f2d39a64f5afd3ffb77c97e914 fixes the issue with OpenSSL 1.1.1 (socket.shutdown was missing and asyncio's protocol didn't receive EOF, instead the connection was terminated (as expected though))
|
msg317833 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2018-05-28 09:14 |
I just reproduced the issue on Linux. Open 3 terminals and run the commands in parallel:
(1) ./python -m test test_asyncio -m test_start_tls_server_1 -F
(2) ./python -m test -j16 -r
(3) ./python -m test -j16 -r
It's a race condition which doesn't depend on the OS, but on the system load.
|
msg317862 - (view) |
Author: Yury Selivanov (yselivanov) * |
Date: 2018-05-28 15:08 |
> It's a race condition which doesn't depend on the OS, but on the system load.
Yeah, I agree. The current timeout for the test is 5 seconds and it tries to transfer 2mb of data, which isn't that much actually. I'll bump the timeout and reduce the amount of data transferred in https://github.com/python/cpython/pull/7130
Thing is, these are the very first *functional* tests for asyncio. Before it was mostly tested with mocks. So we'll need some adjustment period to learn how slow/loaded the CI servers are and what kind of timeouts are safe to use.
|
msg317871 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2018-05-28 15:39 |
> Yeah, I agree. The current timeout for the test is 5 seconds and it tries to transfer 2mb of data, which isn't that much actually. I'll bump the timeout and reduce the amount of data transferred in https://github.com/python/cpython/pull/7130
What is the purpose of a timeout of 5 seconds? Some buildbots are really slow.
If the purpose is to detect when the test hangs: use a timeout of 5 minutes, or at least 1 minute (60 seconds).
|
msg317872 - (view) |
Author: Yury Selivanov (yselivanov) * |
Date: 2018-05-28 15:40 |
> If the purpose is to detect when the test hangs: use a timeout of 5 minutes, or at least 1 minute (60 seconds).
OK, sounds good, will do that.
|
msg317873 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2018-05-28 15:50 |
> OK, sounds good, will do that.
Race condition! I created https://github.com/python/cpython/pull/7157 to increase the timeout from 5 seconds to 60 seconds.
|
msg317898 - (view) |
Author: Yury Selivanov (yselivanov) * |
Date: 2018-05-28 18:33 |
https://github.com/python/cpython/pull/7130 is in. It includes many tests for TLS tests as well as bugfixes to asyncio code. Hopefully this all will make tests more stable.
|
msg317907 - (view) |
Author: Yury Selivanov (yselivanov) * |
Date: 2018-05-28 19:49 |
New changeset 8267ea2e84d355f00654dec3ad782fc7b1f680f1 by Yury Selivanov in branch 'master':
bpo-32458: Further tune start_tls tests (#7166)
https://github.com/python/cpython/commit/8267ea2e84d355f00654dec3ad782fc7b1f680f1
|
msg317918 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2018-05-28 20:41 |
While Yury seems unable to reproduce the bug, it's easy for me to reproduce it on Linux. After 2 hours of debugging, I found the root issue: a race condition not in the test, but in asyncio itself! => bpo-33674 "asyncio: race condition in SSLProtocol".
|
msg317920 - (view) |
Author: miss-islington (miss-islington) |
Date: 2018-05-28 20:44 |
New changeset 4b828467a3fcec0c1947e8326f67b8db12a4f303 by Miss Islington (bot) in branch '3.7':
bpo-32458: Further tune start_tls tests (GH-7166)
https://github.com/python/cpython/commit/4b828467a3fcec0c1947e8326f67b8db12a4f303
|
msg318055 - (view) |
Author: Yury Selivanov (yselivanov) * |
Date: 2018-05-29 16:10 |
Should we close this issue now?
|
msg318080 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2018-05-29 20:02 |
I'm not 100% sure that all issues are fixed, but the tests seem much more reliable yet. I close the issue. If the bug reoccurs, I will reopen the issue or open a new one.
|
|
Date |
User |
Action |
Args |
2022-04-11 14:58:56 | admin | set | github: 76639 |
2018-05-29 20:02:20 | vstinner | set | status: open -> closed priority: deferred blocker -> versions:
+ Python 3.8 messages:
+ msg318080
resolution: fixed stage: patch review -> resolved |
2018-05-29 16:10:17 | yselivanov | set | messages:
+ msg318055 |
2018-05-28 20:44:16 | miss-islington | set | nosy:
+ miss-islington messages:
+ msg317920
|
2018-05-28 20:41:58 | vstinner | set | messages:
+ msg317918 |
2018-05-28 19:50:13 | miss-islington | set | pull_requests:
+ pull_request6805 |
2018-05-28 19:49:01 | yselivanov | set | messages:
+ msg317907 |
2018-05-28 19:22:06 | yselivanov | set | pull_requests:
+ pull_request6801 |
2018-05-28 18:33:54 | yselivanov | set | messages:
+ msg317898 |
2018-05-28 15:50:31 | giampaolo.rodola | set | nosy:
- giampaolo.rodola
|
2018-05-28 15:50:06 | vstinner | set | messages:
+ msg317873 |
2018-05-28 15:47:48 | vstinner | set | pull_requests:
+ pull_request6793 |
2018-05-28 15:40:43 | yselivanov | set | messages:
+ msg317872 |
2018-05-28 15:39:57 | vstinner | set | messages:
+ msg317871 |
2018-05-28 15:08:48 | yselivanov | set | messages:
+ msg317862 |
2018-05-28 09:14:56 | vstinner | set | messages:
+ msg317833 |
2018-05-27 23:47:46 | yselivanov | set | messages:
+ msg317817 |
2018-05-27 20:14:25 | terry.reedy | set | messages:
+ msg317809 |
2018-05-27 19:41:12 | christian.heimes | set | messages:
+ msg317808 |
2018-05-27 19:40:50 | yselivanov | set | nosy:
+ christian.heimes messages:
+ msg317807
|
2018-05-26 19:52:13 | terry.reedy | set | messages:
+ msg317762 |
2018-05-26 19:40:41 | terry.reedy | set | nosy:
- christian.heimes messages:
+ msg317761
|
2018-05-26 19:22:06 | christian.heimes | set | nosy:
+ christian.heimes messages:
+ msg317760
|
2018-05-26 18:54:49 | yselivanov | set | messages:
+ msg317759 |
2018-05-26 18:49:07 | ned.deily | set | messages:
+ msg317758 |
2018-05-26 18:45:30 | yselivanov | set | messages:
+ msg317755 |
2018-05-26 18:16:12 | ned.deily | set | messages:
+ msg317751 |
2018-05-24 23:23:28 | yselivanov | set | messages:
+ msg317643 |
2018-05-24 23:12:09 | vstinner | set | messages:
+ msg317640 |
2018-05-24 23:10:31 | yselivanov | set | messages:
+ msg317639 |
2018-05-24 23:09:33 | vstinner | set | priority: high -> deferred blocker
messages:
+ msg317638 |
2018-05-23 22:39:21 | pitrou | set | nosy:
- pitrou
|
2018-05-23 22:36:38 | vstinner | set | nosy:
+ vstinner
messages:
+ msg317468 title: test_asyncio failures on Windows -> test_asyncio: test_start_tls_server_1() fails randomly |
2018-03-09 21:38:33 | ned.deily | set | nosy:
+ ned.deily messages:
+ msg313507
|
2018-01-22 09:08:46 | asvetlov | set | messages:
+ msg310405 |
2018-01-05 20:41:54 | terry.reedy | set | nosy:
+ terry.reedy messages:
+ msg309521
|
2017-12-30 20:42:28 | yselivanov | set | messages:
+ msg309256 |
2017-12-30 20:40:22 | yselivanov | set | messages:
+ msg309255 |
2017-12-30 20:21:38 | yselivanov | set | priority: normal -> high assignee: yselivanov messages:
+ msg309254
components:
+ asyncio, - Library (Lib), Tests |
2017-12-30 20:19:43 | yselivanov | set | keywords:
+ patch stage: patch review pull_requests:
+ pull_request4935 |
2017-12-30 19:20:50 | pitrou | create | |