asyncio ssl memory leak #78926

Closed
thehesiod mannequin opened this issue Sep 19, 2018 · 8 comments
Labels
3.7 (EOL) end of life performance Performance or resource usage topic-asyncio

Comments

@thehesiod
Mannequin

thehesiod mannequin commented Sep 19, 2018

BPO 34745
Nosy @fantix, @asvetlov, @1st1, @thehesiod, @miss-islington, @cnpeterson
PRs
  • bpo-34745: Fix asyncio sslproto memory issues #12386
  • [3.7] bpo-34745: Fix asyncio sslproto memory issues (GH-12386) #12387
  • [3.6] bpo-34745: Fix asyncio sslproto memory issues (GH-12386) #12393
Files
  • memory_usage.png: Memory Usage
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = None
    closed_at = <Date 2019-03-25.06:10:45.957>
    created_at = <Date 2018-09-19.22:41:00.569>
    labels = ['expert-asyncio', '3.7', 'performance']
    title = 'asyncio ssl memory leak'
    updated_at = <Date 2019-03-25.06:10:45.956>
    user = 'https://github.com/thehesiod'

    bugs.python.org fields:

    activity = <Date 2019-03-25.06:10:45.956>
    actor = 'thehesiod'
    assignee = 'none'
    closed = True
    closed_date = <Date 2019-03-25.06:10:45.957>
    closer = 'thehesiod'
    components = ['asyncio']
    creation = <Date 2018-09-19.22:41:00.569>
    creator = 'thehesiod'
    dependencies = []
    files = ['48012']
    hgrepos = []
    issue_num = 34745
    keywords = ['patch']
    message_count = 8.0
    messages = ['325811', '325813', '325817', '325819', '332273', '338145', '338146', '338784']
    nosy_count = 6.0
    nosy_names = ['fantix', 'asvetlov', 'yselivanov', 'thehesiod', 'miss-islington', 'cnpeterson']
    pr_nums = ['12386', '12387', '12393']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'resource usage'
    url = 'https://bugs.python.org/issue34745'
    versions = ['Python 3.6', 'Python 3.7']

    @thehesiod
    Mannequin Author

    thehesiod mannequin commented Sep 19, 2018

    I've been trying to track down a leak in aiohttp: aio-libs/aiohttp#3010

    it seems like this leak now occurs with raw asyncio SSL sockets.

    when the gist script is run like so: python3.7 `which mprof` run --interval=1 ~/dev/test_leak.py -test asyncio_test

    it slowly leaks memory. This is effectively doing the following:

    import asyncio
    import logging
    import socket
    import ssl
    import sys
    from urllib.parse import urlparse

    # Redacted AWS credential; left empty here, as in the later repro in this thread.
    CREDENTIAL = ''

    URLS = {
        'https://s3.us-west-2.amazonaws.com/archpi.dabase.com/style.css': {
            'method': 'get',
            'headers': {'User-Agent': 'Botocore/1.8.21 Python/3.6.4 Darwin/17.5.0', 'X-Amz-Date': '20180518T025044Z', 'X-Amz-Content-SHA256': 'e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855', 'Authorization': f'AWS4-HMAC-SHA256 Credential={CREDENTIAL}/20180518/us-west-2/s3/aws4_request, SignedHeaders=host;x-amz-content-sha256;x-amz-date, Signature=ae552641b9aa9a7a267fcb4e36960cd5863e55d91c9b45fd39b30fdcd2e81489', 'Accept-Encoding': 'identity'}
        },
        'https://s3.ap-southeast-1.amazonaws.com/archpi.dabase.com/doesnotexist': {
            'method': 'GET' if sys.argv[1] == 'get_object' else 'HEAD',
            'headers': {'User-Agent': 'Botocore/1.8.21 Python/3.6.4 Darwin/17.5.0', 'X-Amz-Date': '20180518T025221Z', 'X-Amz-Content-SHA256': 'e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855', 'Authorization': f'AWS4-HMAC-SHA256 Credential={CREDENTIAL}/20180518/ap-southeast-1/s3/aws4_request, SignedHeaders=host;x-amz-content-sha256;x-amz-date, Signature=7a7675ef6d70cb647ed59e02d532ffa80d437fb03976d8246ea9ef102d118794', 'Accept-Encoding': 'identity'}
        }
    }

    class HttpClient(asyncio.streams.FlowControlMixin):
        transport = None
    
        def __init__(self, *args, **kwargs):
            self.__url = kwargs.pop('url')
            self.__logger = logging.getLogger()
            super().__init__()
    
        def connection_made(self, transport):
            self.transport = transport
    
            url_parts = urlparse(self.__url)
            entry = URLS[self.__url]
    
            body = f'{entry["method"]} {url_parts.path} HTTP/1.1\r\nAccept: */*\r\nHost: {url_parts.hostname}\r\n'
            for name, value in entry['headers'].items():
                body += f'{name}: {value}\r\n'
            body += '\r\n'
            self.transport.write(body.encode('ascii'))
            self.__logger.info(f'data sent: {body}')
    
        def data_received(self, data):
            self.__logger.info(f'data received: {data}')
    
            self.transport.close()
            # asyncio.get_event_loop().call_later(1.0, )
    
        def eof_received(self):
            self.__logger.info('eof_received')
    
        def connection_lost(self, exc):
            self.__logger.info(f'connection lost: {exc}')
            super().connection_lost(exc)
    
        @classmethod
        def create_factory(cls, url: str):
            def factory(*args, **kwargs):
                return cls(*args, url=url, **kwargs)
    
            return factory
    
    
    async def test_asyncio(ssl_context):
        loop = asyncio.get_event_loop()
    
        url = 'https://s3.ap-southeast-1.amazonaws.com/archpi.dabase.com/doesnotexist'
        url_parts = urlparse(url)
        port = url_parts.port or (80 if url_parts.scheme == 'http' else 443)
        infos = await loop.getaddrinfo(url_parts.hostname, port, family=socket.AF_INET)
        family, type, proto, canonname, sockaddr = infos[0]
        await loop.create_connection(HttpClient.create_factory(url), sockaddr[0], port, ssl=ssl_context, family=family, proto=proto, flags=socket.AI_NUMERICHOST, server_hostname=url_parts.hostname, local_addr=None)
    
    
    async def asyncio_test():
        ssl_context = ssl.create_default_context()
    
        while True:
            await test_asyncio(ssl_context)

    asyncio.run(asyncio_test())  # top-level await is not valid in a script

    @thehesiod thehesiod mannequin added 3.7 (EOL) end of life topic-asyncio performance Performance or resource usage labels Sep 19, 2018
    @1st1
    Member

    1st1 commented Sep 19, 2018

    What is "raw asyncio SSL sockets"? We don't use SSL sockets in asyncio, we use SSL Memory BIO. Do you think that some SSL context objects aren't being properly released?

    BTW, can you see the leak when run under uvloop?
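For readers unfamiliar with the memory-BIO approach mentioned above, here is a minimal sketch, using only the standard `ssl` module, of how TLS can be driven through in-memory buffers rather than an SSL socket. The function name `start_handshake` and the hostname `example.com` are illustrative assumptions, not anything taken from asyncio's source, and asyncio's actual sslproto layer is considerably more involved than this.

```python
import ssl


def start_handshake(hostname):
    """Begin a client TLS handshake entirely in memory and return the
    bytes that a transport would have to send to the peer."""
    ctx = ssl.create_default_context()
    incoming = ssl.MemoryBIO()  # bytes received from the network are written here
    outgoing = ssl.MemoryBIO()  # bytes destined for the network are read from here
    sslobj = ctx.wrap_bio(incoming, outgoing, server_hostname=hostname)

    try:
        sslobj.do_handshake()
    except ssl.SSLWantReadError:
        # Expected: the handshake cannot finish until the peer's reply
        # is fed into `incoming`. No socket is involved at any point.
        pass

    return outgoing.read()


if __name__ == '__main__':
    pending = start_handshake('example.com')
    print(len(pending), 'bytes of handshake data queued for the transport')
```

In this model the event loop owns the plain TCP transport and shuttles bytes between it and the two BIO objects, which is why "SSL sockets" never appear.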

    @thehesiod
    Mannequin Author

    thehesiod mannequin commented Sep 19, 2018

    sorry, by "raw" I mean in the context of aiohttp, so just using the normal python ssl context and asyncio sockets. I don't think it's an object not getting GC'd because I didn't see any increase on object counts, nor leaks per tracemalloc. I think it's some low-level native memory leaked by openssl.

    I've updated the gist w/ uvloop and ran with it and still get a leak, see gist + aiohttp issue for plot
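The reasoning above (no growth in tracemalloc, yet the process keeps growing) can be checked with a small harness. This is a sketch under stated assumptions, not the script used in this thread: `measure` and `peak_rss_bytes` are hypothetical helper names, the `resource` module is Unix-only, and `ru_maxrss` reports peak rather than current resident size.

```python
import resource
import sys
import tracemalloc


def peak_rss_bytes():
    """Peak resident set size of this process (Unix only)."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # Linux reports kilobytes; macOS reports bytes.
    return peak if sys.platform == 'darwin' else peak * 1024


def measure(workload, iterations=100):
    """Call workload() repeatedly and report memory growth as seen by
    tracemalloc (Python allocator) and by the OS (whole process)."""
    tracemalloc.start()
    traced_before, _ = tracemalloc.get_traced_memory()
    rss_before = peak_rss_bytes()
    for _ in range(iterations):
        workload()
    traced_after, _ = tracemalloc.get_traced_memory()
    rss_after = peak_rss_bytes()
    tracemalloc.stop()
    return {
        'traced_growth': traced_after - traced_before,
        'rss_growth': rss_after - rss_before,
    }


if __name__ == '__main__':
    retained = []  # deliberately keeps every allocation alive
    print(measure(lambda: retained.append(bytearray(100_000)), iterations=50))
```

If `rss_growth` keeps climbing across runs while `traced_growth` stays flat, the memory is probably being allocated outside Python's allocator, for example inside a C library. That is only a hint about where to look, not proof of which component is at fault.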

    @1st1
    Member

    1st1 commented Sep 19, 2018

    Would you be able to test uvloop/master branch? Current uvloop 0.11.x uses pretty much the asyncio implementation; the master branch has a completely rewritten SSL layer. If the master branch has the leak it might mean that the root cause is indeed in either the ssl module or openssl itself.

    @cnpeterson
    Mannequin

    cnpeterson mannequin commented Dec 20, 2018

    I ran the code snippet below using uvloop/master in a docker container. As it ran, the container continually leaked memory. I included a graph with the memory usage.

    Environment:
    # cat /etc/*-release
    PRETTY_NAME="Debian GNU/Linux 9 (stretch)"
    NAME="Debian GNU/Linux"
    VERSION_ID="9"
    VERSION="9 (stretch)"
    ID=debian
    HOME_URL="https://www.debian.org/"
    SUPPORT_URL="https://www.debian.org/support"
    BUG_REPORT_URL="https://bugs.debian.org/"

    # uname -r
    4.18.16-300.fc29.x86_64

    # python -V
    Python 3.7.1

    # pip freeze
    asyncio==3.4.3
    Cython==0.29.2
    idna==2.8
    multidict==4.5.2
    uvloop==0.12.0rc2
    yarl==1.3.0

    I had to tweak the code a bit to run in a docker container successfully, but here is the code I used:

    import asyncio
    import logging
    import ssl
    import socket
    import sys
    import yarl
    import uvloop

    CREDENTIAL = ''

    URLS = {
        'https://s3.us-west-2.amazonaws.com/archpi.dabase.com/style.css': {
            'method': 'get',
            'headers': {'User-Agent': 'Botocore/1.8.21 Python/3.6.4 Darwin/17.5.0', 'X-Amz-Date': '20180518T025044Z', 'X-Amz-Content-SHA256': 'e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855', 'Authorization': f'AWS4-HMAC-SHA256 Credential={CREDENTIAL}/20180518/us-west-2/s3/aws4_request, SignedHeaders=host;x-amz-content-sha256;x-amz-date, Signature=ae552641b9aa9a7a267fcb4e36960cd5863e55d91c9b45fd39b30fdcd2e81489', 'Accept-Encoding': 'identity'}
        },
        'https://s3.ap-southeast-1.amazonaws.com/archpi.dabase.com/doesnotexist': {
            'method': 'GET' if sys.argv[1] == 'get_object' else 'HEAD',
            'headers': {'User-Agent': 'Botocore/1.8.21 Python/3.6.4 Darwin/17.5.0', 'X-Amz-Date': '20180518T025221Z', 'X-Amz-Content-SHA256': 'e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855', 'Authorization': f'AWS4-HMAC-SHA256 Credential={CREDENTIAL}/20180518/ap-southeast-1/s3/aws4_request, SignedHeaders=host;x-amz-content-sha256;x-amz-date, Signature=7a7675ef6d70cb647ed59e02d532ffa80d437fb03976d8246ea9ef102d118794', 'Accept-Encoding': 'identity'}
        }
    }

    asyncio.set_event_loop_policy(uvloop.EventLoopPolicy())
    
    class HttpClient(asyncio.streams.FlowControlMixin):
        transport = None
    
        def __init__(self, *args, **kwargs):
            self.__url = kwargs.pop('url')
            self.__logger = logging.getLogger()
            super().__init__()
    
        def connection_made(self, transport):
            self.transport = transport
    
            url_parts = yarl.URL(self.__url)
            entry = URLS[self.__url]
    
            body = f'{entry["method"]} {url_parts.path} HTTP/1.1\r\nAccept: */*\r\nHost: {url_parts.host}\r\n'
            for name, value in entry['headers'].items():
                body += f'{name}: {value}\r\n'
            body += '\r\n'
            self.transport.write(body.encode('ascii'))
            self.__logger.info(f'data sent: {body}')
    
        def data_received(self, data):
            self.__logger.info(f'data received: {data}')
    
            self.transport.close()
            # asyncio.get_event_loop().call_later(1.0, )
    
        def eof_received(self):
            self.__logger.info('eof_received')
    
        def connection_lost(self, exc):
            self.__logger.info(f'connection lost: {exc}')
            super().connection_lost(exc)
    
        @classmethod
        def create_factory(cls, url: str):
            def factory(*args, **kwargs):
                return cls(*args, url=url, **kwargs)
    
            return factory
    
    
    async def test_asyncio(ssl_context):
        loop = asyncio.get_event_loop()
    
        url = 'https://s3.ap-southeast-1.amazonaws.com/archpi.dabase.com/doesnotexist'
        url_parts = yarl.URL(url)
        port = url_parts.port or (80 if url_parts.scheme == 'http' else 443)
        infos = await loop.getaddrinfo(url_parts.host, port, family=socket.AF_INET)
        family, type, proto, canonname, sockaddr = infos[0]
        await loop.create_connection(HttpClient.create_factory(url), sockaddr[0], port, ssl=ssl_context, family=family, proto=proto, flags=socket.AI_NUMERICHOST, server_hostname=url_parts.host, local_addr=None)
    
    
    async def asyncio_test():
        ssl_context = ssl.create_default_context()
    
        while True:
            await test_asyncio(ssl_context)
    
    
    def main():
        print('running')
        loop = asyncio.get_event_loop()
        loop.run_until_complete(asyncio_test())
    
    
    main()

    @1st1
    Member

    1st1 commented Mar 17, 2019

    New changeset f683f46 by Yury Selivanov (Fantix King) in branch 'master':
    bpo-34745: Fix asyncio sslproto memory issues (GH-12386)
    f683f46

    @miss-islington
    Contributor

    New changeset 7f7485c by Miss Islington (bot) in branch '3.7':
    bpo-34745: Fix asyncio sslproto memory issues (GH-12386)
    7f7485c

    @thehesiod
    Mannequin Author

    thehesiod mannequin commented Mar 25, 2019

    going to close, I've verified that it fixes my original issue, ty!!

    @thehesiod thehesiod mannequin closed this as completed Mar 25, 2019
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022