Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test_asyncio: test_start_tls_server_1() TimeoutError on Fedora 29 #80179

Closed
matrixise opened this issue Feb 15, 2019 · 44 comments
Closed

test_asyncio: test_start_tls_server_1() TimeoutError on Fedora 29 #80179

matrixise opened this issue Feb 15, 2019 · 44 comments
Labels
3.7 (EOL) end of life 3.8 only security fixes 3.9 only security fixes deferred-blocker tests Tests in the Lib/test dir topic-asyncio

Comments

@matrixise
Copy link
Member

BPO 35998
Nosy @doko42, @ncoghlan, @vstinner, @tiran, @ned-deily, @asvetlov, @ambv, @1st1, @koobs, @matrixise, @stratakis, @pablogsal, @miss-islington, @tirkarthi, @openandclose
PRs
  • bpo-35998: Avoid TimeoutError in test_asyncio: test_start_tls_server_1() #14080
  • [3.8] bpo-35998: Avoid TimeoutError in test_asyncio: test_start_tls_server_1() (GH-14080) #14084
  • [3.7] bpo-35998: Avoid TimeoutError in test_asyncio: test_start_tls_server_1() (GH-14080) #14086
  • [3.8] bpo-35031, test_asycio: disable TLS 1.3 in test_start_tls_server_1() (GH-14148) #14150
  • [3.7] bpo-35031, test_asycio: disable TLS 1.3 in test_start_tls_server_1() (GH-14148) #14151
  • bpo-35998: Fix test_asyncio.test_start_tls_server_1() #16815
  • [3.8] bpo-35998: Fix test_asyncio.test_start_tls_server_1() (GH-16815) #16817
  • [3.7] bpo-35998: Fix test_asyncio.test_start_tls_server_1() (GH-16815) #16818
  • Files
  • stdout.txt
  • stderr.txt
  • stderr-op368.txt: test_start_tls_server_1 only
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2019-10-16.09:59:14.787>
    created_at = <Date 2019-02-15.10:12:18.033>
    labels = ['deferred-blocker', '3.8', '3.9', '3.7', 'tests', 'expert-asyncio']
    title = 'test_asyncio: test_start_tls_server_1() TimeoutError on Fedora 29'
    updated_at = <Date 2019-10-16.09:59:14.773>
    user = 'https://github.com/matrixise'

    bugs.python.org fields:

    activity = <Date 2019-10-16.09:59:14.773>
    actor = 'vstinner'
    assignee = 'none'
    closed = True
    closed_date = <Date 2019-10-16.09:59:14.787>
    closer = 'vstinner'
    components = ['Tests', 'asyncio']
    creation = <Date 2019-02-15.10:12:18.033>
    creator = 'matrixise'
    dependencies = []
    files = ['48143', '48144', '48185']
    hgrepos = []
    issue_num = 35998
    keywords = ['patch', '3.7regression']
    message_count = 44.0
    messages = ['335596', '335597', '335626', '335629', '335635', '336986', '336991', '337125', '337147', '337239', '337563', '337974', '338129', '338368', '338749', '345597', '345598', '345605', '345607', '345608', '345623', '345640', '345644', '345645', '345683', '345807', '345830', '345833', '345838', '345840', '350294', '350341', '354653', '354655', '354656', '354727', '354758', '354759', '354760', '354761', '354762', '354784', '354793', '354794']
    nosy_count = 15.0
    nosy_names = ['doko', 'ncoghlan', 'vstinner', 'christian.heimes', 'ned.deily', 'asvetlov', 'lukasz.langa', 'yselivanov', 'koobs', 'matrixise', 'cstratak', 'pablogsal', 'miss-islington', 'xtreak', 'op368']
    pr_nums = ['14080', '14084', '14086', '14150', '14151', '16815', '16817', '16818']
    priority = 'deferred blocker'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue35998'
    versions = ['Python 3.7', 'Python 3.8', 'Python 3.9']

    @matrixise
    Copy link
    Member Author

    ======================================================================
    ERROR: test_start_tls_server_1 (test.test_asyncio.test_sslproto.SelectorStartTLSTests)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "/home/stephane/src/github.com/python/cpython-original/Lib/test/test_asyncio/test_sslproto.py", line 510, in test_start_tls_server_1
        self.loop.run_until_complete(run_main())
      File "/home/stephane/src/github.com/python/cpython-original/Lib/asyncio/base_events.py", line 589, in run_until_complete
        return future.result()
      File "/home/stephane/src/github.com/python/cpython-original/Lib/test/test_asyncio/test_sslproto.py", line 503, in run_main
        await asyncio.wait_for(
      File "/home/stephane/src/github.com/python/cpython-original/Lib/asyncio/tasks.py", line 461, in wait_for
        raise exceptions.TimeoutError()
    asyncio.exceptions.TimeoutError

    my current revision: 3e028b2

    system: fedora 29
    compiled with gcc

    gcc (GCC) 8.2.1 20181215 (Red Hat 8.2.1-6)

    No issue on Travis, but this test fails on my computer and I cleaned my repo with git clean -dfqx

    @matrixise matrixise added the 3.8 only security fixes label Feb 15, 2019
    @matrixise
    Copy link
    Member Author

    here is the stderr.txt

    @vstinner vstinner added tests Tests in the Lib/test dir topic-asyncio labels Feb 15, 2019
    @tirkarthi
    Copy link
    Member

    Not sure if it's related but there was a similar report on failure in freebsd bots with tls1.3 : bpo-35031 . Can you please add SSL info with ./python -m test.pythoninfo | grep ssl ?

    Test runs fine on my ubuntu 16.04 machine with below info

    $ ./python -m test.pythoninfo | grep ssl
    ssl.HAS_SNI: True
    ssl.OPENSSL_VERSION: OpenSSL 1.0.2g  1 Mar 2016
    ssl.OPENSSL_VERSION_INFO: (1, 0, 2, 7, 15)
    ssl.OP_ALL: 0x800003ff
    ssl.OP_NO_TLSv1_1: 0x10000000

    @vstinner vstinner changed the title ./python -m test test_asyncio fails test_asyncio: test_start_tls_server_1() TimeoutError on Fedora 29 Feb 15, 2019
    @stratakis
    Copy link
    Mannequin

    stratakis mannequin commented Feb 15, 2019

    Fedora 29 has openssl 1.1.1 which seems to be related.

    @tirkarthi
    Copy link
    Member

    I can confirm that this fails on a fresh fedora 29 VM as of commit 355f16f. The changes in PR 10011 make this test pass.

    SSL version and OS details

    $ cat /etc/fedora-release
    Fedora release 29 (Twenty Nine)
    $ uname -a
    Linux fedora-s-1vcpu-1gb-blr1-01 4.20.7-200.fc29.x86_64 #1 SMP Wed Feb 6 19:16:42 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
    $ ./python -m test.pythoninfo | grep ssl
    ssl.HAS_SNI: True
    ssl.OPENSSL_VERSION: OpenSSL 1.1.1a FIPS  20 Nov 2018
    ssl.OPENSSL_VERSION_INFO: (1, 1, 1, 1, 15)
    ssl.OP_ALL: 0x80000054
    ssl.OP_NO_TLSv1_1: 0x10000000

    # Test failure with tracemalloc

    $ ./python -X tracemalloc -m unittest -v test.test_asyncio.test_sslproto.SelectorStartTLSTests
    test_buf_feed_data (test.test_asyncio.test_sslproto.SelectorStartTLSTests) ... ok
    test_create_connection_ssl_failed_certificate (test.test_asyncio.test_sslproto.SelectorStartTLSTests) ... ok
    test_create_connection_ssl_slow_handshake (test.test_asyncio.test_sslproto.SelectorStartTLSTests) ... ok
    test_handshake_timeout (test.test_asyncio.test_sslproto.SelectorStartTLSTests) ... ok
    test_start_tls_client_buf_proto_1 (test.test_asyncio.test_sslproto.SelectorStartTLSTests) ... ok
    test_start_tls_client_corrupted_ssl (test.test_asyncio.test_sslproto.SelectorStartTLSTests) ... ok
    test_start_tls_client_reg_proto_1 (test.test_asyncio.test_sslproto.SelectorStartTLSTests) ... ok
    test_start_tls_server_1 (test.test_asyncio.test_sslproto.SelectorStartTLSTests) ... /root/cpython/Lib/asyncio/sslproto.py:321: ResourceWarning: unclosed transport <asyncio.sslproto._SSLProtocolTransport object at 0x7f2087642170>
      _warn(f"unclosed transport {self!r}", ResourceWarning, source=self)
    Object allocated at (most recent call last):
      File "/root/cpython/Lib/asyncio/sslproto.py", lineno 446
        self._app_transport = _SSLProtocolTransport(self._loop, self)
    ERROR
    test_start_tls_slow_client_cancel (test.test_asyncio.test_sslproto.SelectorStartTLSTests) ... Unhandled error in exception handler
    context: {'message': 'Future exception was never retrieved', 'exception': ConnectionResetError(104, 'Connection reset by peer'), 'future': <Future finished exception=ConnectionResetError(104, 'Connection reset by peer')>}
    Traceback (most recent call last):
      File "/root/cpython/Lib/asyncio/base_events.py", line 1653, in call_exception_handler
        self._exception_handler(self, context)
      File "/root/cpython/Lib/test/test_asyncio/functional.py", line 22, in loop_exception_handler
        self.loop.default_exception_handler(context)
    AttributeError: 'NoneType' object has no attribute 'default_exception_handler'
    /root/cpython/Lib/asyncio/base_events.py:675: ResourceWarning: unclosed <socket.socket fd=6, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 34497)>
      timer = events.TimerHandle(when, callback, args, self, context)
    Object allocated at (most recent call last):
      File "/root/cpython/Lib/asyncio/base_events.py", lineno 1360
        sock = socket.socket(af, socktype, proto)
    ok
    test_start_tls_wrong_args (test.test_asyncio.test_sslproto.SelectorStartTLSTests) ... ok

    ======================================================================
    ERROR: test_start_tls_server_1 (test.test_asyncio.test_sslproto.SelectorStartTLSTests)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "/root/cpython/Lib/test/test_asyncio/test_sslproto.py", line 510, in test_start_tls_server_1
        self.loop.run_until_complete(run_main())
      File "/root/cpython/Lib/asyncio/base_events.py", line 589, in run_until_complete
        return future.result()
      File "/root/cpython/Lib/test/test_asyncio/test_sslproto.py", line 503, in run_main
        await asyncio.wait_for(
      File "/root/cpython/Lib/asyncio/tasks.py", line 461, in wait_for
        raise exceptions.TimeoutError()
    asyncio.exceptions.TimeoutError

    Ran 10 tests in 63.797s

    FAILED (errors=1)

    # With PR 10011 applied to Fedora

    $ ./python -X tracemalloc -m unittest -v test.test_asyncio.test_sslproto.SelectorStartTLSTests
    test_buf_feed_data (test.test_asyncio.test_sslproto.SelectorStartTLSTests) ... ok
    test_create_connection_ssl_failed_certificate (test.test_asyncio.test_sslproto.SelectorStartTLSTests) ... ok
    test_create_connection_ssl_slow_handshake (test.test_asyncio.test_sslproto.SelectorStartTLSTests) ... ok
    test_handshake_timeout (test.test_asyncio.test_sslproto.SelectorStartTLSTests) ... ok
    test_start_tls_client_buf_proto_1 (test.test_asyncio.test_sslproto.SelectorStartTLSTests) ... ok
    test_start_tls_client_corrupted_ssl (test.test_asyncio.test_sslproto.SelectorStartTLSTests) ... ok
    test_start_tls_client_reg_proto_1 (test.test_asyncio.test_sslproto.SelectorStartTLSTests) ... ok
    test_start_tls_server_1 (test.test_asyncio.test_sslproto.SelectorStartTLSTests) ... ok
    test_start_tls_slow_client_cancel (test.test_asyncio.test_sslproto.SelectorStartTLSTests) ... ok
    test_start_tls_wrong_args (test.test_asyncio.test_sslproto.SelectorStartTLSTests) ... ok

    Ran 10 tests in 3.734s

    OK

    @openandclose
    Copy link
    Mannequin

    openandclose mannequin commented Mar 2, 2019

    I have the same TimeoutError on Arch linux PC (openssl 1.1.1a).

    It is random, but for now running test continuously at most 5 times successfully brings about the fail.

    So I think msg335635 needs re-checking.
    He saids PR 10011 makes this test pass in fedora 29 VM,
    but thinking about that it only adds an conditional to FreeBSD, it is not likely.

        if sys.platform.startswith('freebsd'):
            client_context.options |= ssl.OP_NO_TLSv1_3

    And indded, if commenting out the conditional line,
    the test actually passes.
    So I also think the commit needs re-considering.
    f777fa5 (Opt out of TLS 1.3 only on FreeBSD)

    ---

    $ uname -a
    Linux **** 4.20.4-arch1-1-ARCH #1 SMP PREEMPT Wed Jan 23 00:12:22 UTC 2019 x86_64 GNU/Linux
    (native, not venv nor vm)
    
    $ ./python -m test.pythoninfo | grep ssl
    ssl.HAS_SNI: True
    ssl.OPENSSL_VERSION: OpenSSL 1.1.1a  20 Nov 2018
    ssl.OPENSSL_VERSION_INFO: (1, 1, 1, 1, 15)
    ssl.OP_ALL: 0x80000054
    ssl.OP_NO_TLSv1_1: 0x10000000
    
    $ ./python -X tracemalloc -m unittest -v test.test_asyncio.test_sslproto.SelectorStartTLSTests.test_start_tls_server_1
    ...

    ======================================================================
    ERROR: test_start_tls_server_1 (test.test_asyncio.test_sslproto.SelectorStartTLSTests)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "/home/philo/me/git/cpython/Lib/test/test_asyncio/test_sslproto.py", line 510, in test_start_tls_server_1
        self.loop.run_until_complete(run_main())
      File "/home/philo/me/git/cpython/Lib/asyncio/base_events.py", line 589, in run_until_complete
        return future.result()
      File "/home/philo/me/git/cpython/Lib/test/test_asyncio/test_sslproto.py", line 503, in run_main
        await asyncio.wait_for(
      File "/home/philo/me/git/cpython/Lib/asyncio/tasks.py", line 461, in wait_for
        raise exceptions.TimeoutError()
    asyncio.exceptions.TimeoutError

    Ran 1 test in 60.076s

    @openandclose
    Copy link
    Mannequin

    openandclose mannequin commented Mar 2, 2019

    It may be openssl 1.1.1a specific.
    I don't understand most of what they say,
    but the changelog (to 1.1.1b) 'looks' suspicious.
    https://www.openssl.org/news/changelog.html#x1

    ...Actually Archlinux updated openssl to 1.1.1b about 4 days ago,
    so I can 'test' 1.1.1b.
    But which means I may not be able to reproduce the fail after that.

    @vstinner
    Copy link
    Member

    vstinner commented Mar 4, 2019

    Interesting code in test_ssl.py:

                except (ConnectionResetError, BrokenPipeError) as e:
                    # We treat ConnectionResetError as though it were an
                    # SSLError - OpenSSL on Ubuntu abruptly closes the
                    # connection when asked to use an unsupported protocol.
                    #
                    # BrokenPipeError is raised in TLS 1.3 mode, when OpenSSL
                    # tries to send session tickets after handshake.
                    # https://github.com/openssl/openssl/issues/6342
                    self.server.conn_errors.append(str(e))
                    if self.server.chatty:
                        handle_error("\n server:  bad connection attempt from " + repr(self.addr) + ":\n")
                    self.running = False
                    self.close()
                    return False

    and

                    except ConnectionResetError:
                        # XXX: OpenSSL 1.1.1 sometimes raises ConnectionResetError
                        # when connection is not shut down gracefully.
                        if self.server.chatty and support.verbose:
                            sys.stdout.write(
                                " Connection reset by peer: {}\n".format(
                                    self.addr)
                            )
                        self.close()
                        self.running = False

    Interesting commit:

    commit 529525f
    Author: Christian Heimes <christian@python.org>
    Date: Wed May 23 22:24:45 2018 +0200

    bpo-33618: Enable TLS 1.3 in tests (GH-7079)
    
    TLS 1.3 behaves slightly different than TLS 1.2. Session tickets and TLS
    client cert auth are now handled after the initialy handshake. Tests now
    either send/recv data to trigger session and client certs. Or tests
    ignore ConnectionResetError / BrokenPipeError on the server side to
    handle clients that force-close the socket fd.
    
    To test TLS 1.3, OpenSSL 1.1.1-pre7-dev (git master + OpenSSL PR
    https://github.com/openssl/openssl/pull/6340) is required.
    
    Signed-off-by: Christian Heimes <christian@python.org>
    

    @openandclose
    Copy link
    Mannequin

    openandclose mannequin commented Mar 4, 2019

    uploading stderr-op368.txt.
    (basically the same as Stéphane and Karthikeyan.)

    @matrixise
    Copy link
    Member Author

    ./python -m test -W -u network -j0 -r -F test_asyncio

    ERROR: test_start_tls_server_1 (test.test_asyncio.test_sslproto.SelectorStartTLSTests)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "/home/stephane/src/github.com/python/cpython/Lib/test/test_asyncio/test_sslproto.py", line 510, in test_start_tls_server_1
        self.loop.run_until_complete(run_main())
      File "/home/stephane/src/github.com/python/cpython/Lib/asyncio/base_events.py", line 589, in run_until_complete
        return future.result()
      File "/home/stephane/src/github.com/python/cpython/Lib/test/test_asyncio/test_sslproto.py", line 503, in run_main
        await asyncio.wait_for(
      File "/home/stephane/src/github.com/python/cpython/Lib/asyncio/tasks.py", line 461, in wait_for
        raise exceptions.TimeoutError()
    asyncio.exceptions.TimeoutError

    Ran 2044 tests in 113.604s

    openssl 1.1.1b

    revision: d8b3a98

    @openandclose
    Copy link
    Mannequin

    openandclose mannequin commented Mar 9, 2019

    I updated my PC (OpenSSL 1.1.1b).
    the same TimeoutError as before.

    $ ./python -m test.pythoninfo | grep ssl
    ssl.HAS_SNI: True
    ssl.OPENSSL_VERSION: OpenSSL 1.1.1b  26 Feb 2019
    ssl.OPENSSL_VERSION_INFO: (1, 1, 1, 2, 15)
    ssl.OP_ALL: 0x80000054
    ssl.OP_NO_TLSv1_1: 0x10000000

    @doko42
    Copy link
    Member

    doko42 commented Mar 15, 2019

    seen in Ubuntu 19.04 as well. Started after the openssl update form 1.1.1 to 1.1.1b.

    @doko42
    Copy link
    Member

    doko42 commented Mar 17, 2019

    setting to release-blocker for evaluation. the freebsd solution seems to be wrong, just avoiding the error. Is the test correct, or do we have an issue that TLSv1.3 isn't properly handled?

    @stratakis
    Copy link
    Mannequin

    stratakis mannequin commented Mar 19, 2019

    On my system with openssl 1.1.1b, by reducing the PAYLOAD_SIZE the test passes successfully.

    It starts failing when it's bigger than 1024 * 95

    @ned-deily
    Copy link
    Member

    I am changing the priority of this to "deferred blocker" as our current strategy outlined in bpo-34631 is to target full support of OpenSSL 1.1.1 in 3.7.4 and prior to 3.8.0b1.

    @vstinner
    Copy link
    Member

    New changeset f0749da by Victor Stinner (Andrew Svetlov) in branch 'master':
    bpo-35998: Avoid TimeoutError in test_asyncio: test_start_tls_server_1() (GH-14080)
    f0749da

    @asvetlov
    Copy link
    Contributor

    Finally fixed

    @vstinner
    Copy link
    Member

    Finally fixed

    Thank you very much! I looked at this issue 2 or 3 times but I failed to fix it. This bug was super annoying: it failed multiple times per week on Fedora buildbots.

    @miss-islington
    Copy link
    Contributor

    New changeset 0c2eb6d by Miss Islington (bot) in branch '3.8':
    bpo-35998: Avoid TimeoutError in test_asyncio: test_start_tls_server_1() (GH-14080)
    0c2eb6d

    @vstinner
    Copy link
    Member

    New changeset 33feb2e by Victor Stinner in branch '3.7':
    bpo-35998: Avoid TimeoutError in test_asyncio: test_start_tls_server_1() (GH-14080) (GH-14086)
    33feb2e

    @asvetlov
    Copy link
    Contributor

    Thanks for the review, Victor!
    I have had to fix it much earlier :(

    @vstinner
    Copy link
    Member

    I have had to fix it much earlier :(

    That's fine. If this bug was more annoying, I would have it fixed myself
    earlier. But I had other more critical bugs to fix before that one. The
    important part is that it is fixed now.

    @vstinner
    Copy link
    Member

    Sadly, the fix is not perfect, the test failed on the very "x86 Windows7 3.x" buildbot:
    https://buildbot.python.org/all/#/builders/58/builds/2627

    ERROR: test_start_tls_server_1 (test.test_asyncio.test_sslproto.SelectorStartTLSTests)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "D:\cygwin\home\db3l\buildarea\3.x.bolen-windows7\build\lib\test\test_asyncio\test_sslproto.py", line 578, in test_start_tls_server_1
        self.loop.run_until_complete(run_main())
      File "D:\cygwin\home\db3l\buildarea\3.x.bolen-windows7\build\lib\asyncio\base_events.py", line 606, in run_until_complete
        raise RuntimeError('Event loop stopped before Future completed.')
    RuntimeError: Event loop stopped before Future completed.

    @vstinner vstinner reopened this Jun 14, 2019
    @vstinner
    Copy link
    Member

    x86 Windows7 3.7:
    https://buildbot.python.org/all/#/builders/111/builds/1123

    test_start_tls_server_1 (test.test_asyncio.test_sslproto.SelectorStartTLSTests) ... Exception in thread test-client:
    Traceback (most recent call last):
      File "D:\cygwin\home\db3l\buildarea\3.7.bolen-windows7\build\lib\test\test_asyncio\functional.py", line 198, in run
        self._prog(TestSocketWrapper(self._sock))
      File "D:\cygwin\home\db3l\buildarea\3.7.bolen-windows7\build\lib\test\test_asyncio\test_sslproto.py", line 565, in <lambda>
        with self.tcp_client(lambda sock: client(sock, addr),
      File "D:\cygwin\home\db3l\buildarea\3.7.bolen-windows7\build\lib\test\test_asyncio\test_sslproto.py", line 510, in client
        answer = sock.recv_all(len(ANSWER))
      File "D:\cygwin\home\db3l\buildarea\3.7.bolen-windows7\build\lib\test\test_asyncio\functional.py", line 138, in recv_all
        data = self.recv(n - len(buf))
      File "D:\cygwin\home\db3l\buildarea\3.7.bolen-windows7\build\lib\ssl.py", line 1045, in recv
        return self.read(buflen)
      File "D:\cygwin\home\db3l\buildarea\3.7.bolen-windows7\build\lib\ssl.py", line 920, in read
        return self._sslobj.read(len)
    ssl.SSLError: [SSL: WRONG_VERSION_NUMBER] wrong version number (_ssl.c:2488)
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "D:\cygwin\home\db3l\buildarea\3.7.bolen-windows7\build\lib\threading.py", line 926, in _bootstrap_inner
        self.run()
      File "D:\cygwin\home\db3l\buildarea\3.7.bolen-windows7\build\lib\test\test_asyncio\functional.py", line 200, in run
        self._test._abort_socket_test(ex)
      File "D:\cygwin\home\db3l\buildarea\3.7.bolen-windows7\build\lib\test\test_asyncio\functional.py", line 122, in _abort_socket_test
        self.fail(ex)
      File "D:\cygwin\home\db3l\buildarea\3.7.bolen-windows7\build\lib\unittest\case.py", line 693, in fail
        raise self.failureException(msg)
    AssertionError: [SSL: WRONG_VERSION_NUMBER] wrong version number (_ssl.c:2488)

    D:\cygwin\home\db3l\buildarea\3.7.bolen-windows7\build\lib\threading.py:960: ResourceWarning: unclosed <ssl.SSLSocket fd=772, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=0, laddr=('127.0.0.1', 64078), raddr=('127.0.0.1', 64077)>
    del exc_type, exc_value, exc_tb
    ResourceWarning: Enable tracemalloc to get the object allocation traceback
    ERROR

    @asvetlov
    Copy link
    Contributor

    Ooops. Looks like a real problem, not test-only :(

    @ned-deily
    Copy link
    Member

    What to do here? This was originally filed as a "release blocker" and I allowed 3.7.3 to go out without a resolution for it since we had agreed to @christian.heimes proposal to defer full support of OpenSSL 1.1.1 until 3.7.4. Now we are approaching code cutoff for 3.7.4.

    @asvetlov
    Copy link
    Contributor

    No idea what to do.
    SSL support is obviously broken.
    Current code is too complex to maintain.

    Yuri has a full reimplementation of ssl_proto.py in uvloop.
    It *seem* to work and can be backported.

    I'm sick now, can be ready to work back in a day or two.

    @vstinner
    Copy link
    Member

    New changeset c034b78 by Victor Stinner in branch 'master':
    bpo-35031, test_asycio: disable TLS 1.3 in test_start_tls_server_1() (GH-14148)
    c034b78

    @miss-islington
    Copy link
    Contributor

    New changeset a5ddbfb by Miss Islington (bot) in branch '3.8':
    bpo-35031, test_asycio: disable TLS 1.3 in test_start_tls_server_1() (GH-14148)
    a5ddbfb

    @miss-islington
    Copy link
    Contributor

    New changeset 0040903 by Miss Islington (bot) in branch '3.7':
    bpo-35031, test_asycio: disable TLS 1.3 in test_start_tls_server_1() (GH-14148)
    0040903

    @ambv
    Copy link
    Contributor

    ambv commented Aug 23, 2019

    Should this be closed?

    @openandclose
    Copy link
    Mannequin

    openandclose mannequin commented Aug 24, 2019

    For me, since #58288, It's been fixed. (arch linux)

    @vstinner
    Copy link
    Member

    I closed bpo-37431 as a duplicate of this issue.

    @vstinner
    Copy link
    Member

    I didn't check if it's the same issue than previous failures, but test_start_tls_server_1() just failed on x86 Gentoo Installed with X 3.8 buildbot:

    pythoninfo:

    ssl.HAS_SNI: True
    ssl.OPENSSL_VERSION: OpenSSL 1.1.1d 10 Sep 2019
    ssl.OPENSSL_VERSION_INFO: (1, 1, 1, 4, 15)
    ssl.OP_ALL: -0x7fffffac
    ssl.OP_NO_TLSv1_1: 0x10000000
    ssl.SSLContext.maximum_version: TLSVersion.MAXIMUM_SUPPORTED
    ssl.SSLContext.minimum_version: TLSVersion.MINIMUM_SUPPORTED
    ssl.SSLContext.options: Options.OP_NO_SSLv3|OP_CIPHER_SERVER_PREFERENCE|OP_ENABLE_MIDDLEBOX_COMPAT|OP_NO_COMPRESSION|OP_ALL
    ssl.SSLContext.protocol: _SSLMethod.PROTOCOL_TLS
    ssl.SSLContext.verify_mode: VerifyMode.CERT_NONE
    ssl.default_https_context.maximum_version: TLSVersion.MAXIMUM_SUPPORTED
    ssl.default_https_context.minimum_version: TLSVersion.MINIMUM_SUPPORTED
    ssl.default_https_context.options: Options.OP_NO_SSLv3|OP_CIPHER_SERVER_PREFERENCE|OP_ENABLE_MIDDLEBOX_COMPAT|OP_NO_COMPRESSION|OP_ALL
    ssl.default_https_context.protocol: _SSLMethod.PROTOCOL_TLS
    ssl.default_https_context.verify_mode: VerifyMode.CERT_REQUIRED
    ssl.stdlib_context.maximum_version: TLSVersion.MAXIMUM_SUPPORTED
    ssl.stdlib_context.minimum_version: TLSVersion.MINIMUM_SUPPORTED
    ssl.stdlib_context.options: Options.OP_NO_SSLv3|OP_CIPHER_SERVER_PREFERENCE|OP_ENABLE_MIDDLEBOX_COMPAT|OP_NO_COMPRESSION|OP_ALL
    ssl.stdlib_context.protocol: _SSLMethod.PROTOCOL_TLS
    ssl.stdlib_context.verify_mode: VerifyMode.CERT_NONE

    https://buildbot.python.org/all/#/builders/230/builds/435

    test_start_tls_server_1 (test.test_asyncio.test_sslproto.SelectorStartTLSTests) ... Exception in thread test-client:
    Traceback (most recent call last):
      File "/buildbot/buildarea/cpython/3.8.ware-gentoo-x86.installed/build/target/lib/python3.8/test/test_asyncio/functional.py", line 190, in run
        self._prog(TestSocketWrapper(self._sock))
      File "/buildbot/buildarea/cpython/3.8.ware-gentoo-x86.installed/build/target/lib/python3.8/test/test_asyncio/test_sslproto.py", line 575, in <lambda>
        with self.tcp_client(lambda sock: client(sock, addr),
      File "/buildbot/buildarea/cpython/3.8.ware-gentoo-x86.installed/build/target/lib/python3.8/test/test_asyncio/test_sslproto.py", line 520, in client
        answer = sock.recv_all(len(ANSWER))
      File "/buildbot/buildarea/cpython/3.8.ware-gentoo-x86.installed/build/target/lib/python3.8/test/test_asyncio/functional.py", line 130, in recv_all
        data = self.recv(n - len(buf))
      File "/buildbot/buildarea/cpython/3.8.ware-gentoo-x86.installed/build/target/lib/python3.8/ssl.py", line 1226, in recv
        return self.read(buflen)
      File "/buildbot/buildarea/cpython/3.8.ware-gentoo-x86.installed/build/target/lib/python3.8/ssl.py", line 1101, in read
        return self._sslobj.read(len)
    ssl.SSLError: [SSL: WRONG_VERSION_NUMBER] wrong version number (_ssl.c:2560)
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "/buildbot/buildarea/cpython/3.8.ware-gentoo-x86.installed/build/target/lib/python3.8/threading.py", line 932, in _bootstrap_inner
        self.run()
      File "/buildbot/buildarea/cpython/3.8.ware-gentoo-x86.installed/build/target/lib/python3.8/test/test_asyncio/functional.py", line 192, in run
        self._test._abort_socket_test(ex)
      File "/buildbot/buildarea/cpython/3.8.ware-gentoo-x86.installed/build/target/lib/python3.8/test/test_asyncio/functional.py", line 114, in _abort_socket_test
        self.fail(ex)
      File "/buildbot/buildarea/cpython/3.8.ware-gentoo-x86.installed/build/target/lib/python3.8/unittest/case.py", line 753, in fail
        raise self.failureException(msg)
    AssertionError: [SSL: WRONG_VERSION_NUMBER] wrong version number (_ssl.c:2560)
    /buildbot/buildarea/cpython/3.8.ware-gentoo-x86.installed/build/target/lib/python3.8/threading.py:934: ResourceWarning: unclosed <ssl.SSLSocket fd=7, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=0, laddr=('127.0.0.1', 59752), raddr=('127.0.0.1', 45469)>
      self._invoke_excepthook(self)
    ResourceWarning: Enable tracemalloc to get the object allocation traceback
    ERROR
    
    
    Unhandled error in exception handler
    context: {'message': 'Task was destroyed but it is pending!', 'task': <Task pending name='Task-1699' coro=<BaseStartTLS.test_start_tls_server_1.<locals>.main() running at /buildbot/buildarea/cpython/3.8.ware-gentoo-x86.installed/build/target/lib/python3.8/test/test_asyncio/test_sslproto.py:555> wait_for=<Future finished result=None> cb=[_release_waiter(<Future pendi...xb6a27118>()]>)() at /buildbot/buildarea/cpython/3.8.ware-gentoo-x86.installed/build/target/lib/python3.8/asyncio/tasks.py:429]>}
    Traceback (most recent call last):
      File "/buildbot/buildarea/cpython/3.8.ware-gentoo-x86.installed/build/target/lib/python3.8/asyncio/base_events.py", line 1729, in call_exception_handler
        self._exception_handler(self, context)
      File "/buildbot/buildarea/cpython/3.8.ware-gentoo-x86.installed/build/target/lib/python3.8/test/test_asyncio/functional.py", line 22, in loop_exception_handler
        self.loop.default_exception_handler(context)
    AttributeError: 'NoneType' object has no attribute 'default_exception_handler'
    /buildbot/buildarea/cpython/3.8.ware-gentoo-x86.installed/build/target/lib/python3.8/test/test_asyncio/test_streams.py:40: ResourceWarning: unclosed <socket.socket fd=8, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 45469)>
      gc.collect()
    ResourceWarning: Enable tracemalloc to get the object allocation traceback
    /buildbot/buildarea/cpython/3.8.ware-gentoo-x86.installed/build/target/lib/python3.8/asyncio/selector_events.py:684: ResourceWarning: unclosed transport <_SelectorSocketTransport fd=8>
      _warn(f"unclosed transport {self!r}", ResourceWarning, source=self)
    ResourceWarning: Enable tracemalloc to get the object allocation traceback
    /buildbot/buildarea/cpython/3.8.ware-gentoo-x86.installed/build/target/lib/python3.8/asyncio/sslproto.py:321: ResourceWarning: unclosed transport <asyncio.sslproto._SSLProtocolTransport object at 0xb5c82b38>
      _warn(f"unclosed transport {self!r}", ResourceWarning, source=self)
    ResourceWarning: Enable tracemalloc to get the object allocation traceback
    Unhandled error in exception handler
    context: {'message': 'Fatal error on SSL transport', 'exception': RuntimeError('Event loop is closed'), 'transport': <_SelectorSocketTransport closing fd=8>, 'protocol': <asyncio.sslproto.SSLProtocol object at 0xb6a27448>}
    Traceback (most recent call last):
      File "/buildbot/buildarea/cpython/3.8.ware-gentoo-x86.installed/build/target/lib/python3.8/asyncio/selector_events.py", line 898, in write
        n = self._sock.send(data)
    OSError: [Errno 9] Bad file descriptor
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "/buildbot/buildarea/cpython/3.8.ware-gentoo-x86.installed/build/target/lib/python3.8/asyncio/sslproto.py", line 685, in _process_write_backlog
        self._transport.write(chunk)
      File "/buildbot/buildarea/cpython/3.8.ware-gentoo-x86.installed/build/target/lib/python3.8/asyncio/selector_events.py", line 904, in write
        self._fatal_error(exc, 'Fatal write error on socket transport')
      File "/buildbot/buildarea/cpython/3.8.ware-gentoo-x86.installed/build/target/lib/python3.8/asyncio/selector_events.py", line 699, in _fatal_error
        self._force_close(exc)
      File "/buildbot/buildarea/cpython/3.8.ware-gentoo-x86.installed/build/target/lib/python3.8/asyncio/selector_events.py", line 711, in _force_close
        self._loop.call_soon(self._call_connection_lost, exc)
      File "/buildbot/buildarea/cpython/3.8.ware-gentoo-x86.installed/build/target/lib/python3.8/asyncio/base_events.py", line 711, in call_soon
        self._check_closed()
      File "/buildbot/buildarea/cpython/3.8.ware-gentoo-x86.installed/build/target/lib/python3.8/asyncio/base_events.py", line 504, in _check_closed
        raise RuntimeError('Event loop is closed')
    RuntimeError: Event loop is closed
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "/buildbot/buildarea/cpython/3.8.ware-gentoo-x86.installed/build/target/lib/python3.8/asyncio/base_events.py", line 1729, in call_exception_handler
        self._exception_handler(self, context)
      File "/buildbot/buildarea/cpython/3.8.ware-gentoo-x86.installed/build/target/lib/python3.8/test/test_asyncio/functional.py", line 22, in loop_exception_handler
        self.loop.default_exception_handler(context)
    AttributeError: 'NoneType' object has no attribute 'default_exception_handler'
    /buildbot/buildarea/cpython/3.8.ware-gentoo-x86.installed/build/target/lib/python3.8/test/test_asyncio/test_streams.py:40: ResourceWarning: unclosed <socket.socket fd=6, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 45469)>
      gc.collect()
    ResourceWarning: Enable tracemalloc to get the object allocation traceback

    ERROR: test_start_tls_server_1 (test.test_asyncio.test_sslproto.SelectorStartTLSTests)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "/buildbot/buildarea/cpython/3.8.ware-gentoo-x86.installed/build/target/lib/python3.8/test/test_asyncio/test_sslproto.py", line 585, in test_start_tls_server_1
        self.loop.run_until_complete(run_main())
      File "/buildbot/buildarea/cpython/3.8.ware-gentoo-x86.installed/build/target/lib/python3.8/asyncio/base_events.py", line 606, in run_until_complete
        raise RuntimeError('Event loop stopped before Future completed.')
    RuntimeError: Event loop stopped before Future completed.

    @vstinner
    Copy link
    Member

    Similar failure on x86 Gentoo Non-Debug with X 3.x:
    https://buildbot.python.org/all/#/builders/99/builds/3472

    ERROR: test_start_tls_server_1 (test.test_asyncio.test_sslproto.SelectorStartTLSTests)

    @vstinner
    Copy link
    Member

    Zachary Ware commented the new failures on the x86 Gentoo worker:

    "My worker was updated to OpenSSL v1.1.1d with sslv3 disabled today, which seems like a likely source of these failures."

    @vstinner
    Copy link
    Member

    I can reproduce the bug on my laptop with the command:

    ./python -u -m test test_asyncio -m test.test_asyncio.test_sslproto.SelectorStartTLSTests.test_start_tls_server_1 -F -j100

    It's a race condition in the test itself, not in asyncio. PR 16815 fix it.

    I also got access to Zachary Ware's Gentoo buildbot worker where I can also easily reproduce the bug using "... -F -j2".

    @vstinner
    Copy link
    Member

    I also got access to Zachary Ware's Gentoo buildbot worker where I can also easily reproduce the bug using "... -F -j2".

    FYI disabling TLSv1.3 works around the issue on this worker. (It doesn't fix the bug, only hides it.)

    @vstinner
    Copy link
    Member

    FYI "test_start_tls_server_1()" is failing since at least 2017-12-30:
    https://bugs.python.org/issue32458#msg309244

    I hope that my latest fix will be the last to finally make the test stable.

    But I always write that, and the bug always strike me in my back. So I'm now hopeless :-)

    @vstinner
    Copy link
    Member

    New changeset fab4ef2 by Victor Stinner in branch 'master':
    bpo-35998: Fix test_asyncio.test_start_tls_server_1() (GH-16815)
    fab4ef2

    @vstinner
    Copy link
    Member

    This change needs to be backported to 3.7 and 3.8, but I prefer to wait to see if buildbots like this new change, before doing the backport.

    @miss-islington
    Copy link
    Contributor

    New changeset afbcd9f by Miss Islington (bot) in branch '3.8':
    bpo-35998: Fix test_asyncio.test_start_tls_server_1() (GH-16815)
    afbcd9f

    @vstinner
    Copy link
    Member

    New changeset 1bceb0e by Victor Stinner in branch '3.7':
    bpo-35998: Fix test_asyncio.test_start_tls_server_1() (GH-16815) (GH-16818)
    1bceb0e

    @vstinner
    Copy link
    Member

    Buildbots are green on the 3.x branch. I backported my fix to 3.7 and 3.8.

    I close the issue. I hope that I will not have to reopen it again.

    @vstinner vstinner added the 3.9 only security fixes label Oct 16, 2019
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 (EOL) end of life 3.8 only security fixes 3.9 only security fixes deferred-blocker tests Tests in the Lib/test dir topic-asyncio
    Projects
    None yet
    Development

    No branches or pull requests

    8 participants