Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Severe open file leakage running asyncio SSL server #74156

Closed
kyuupichan mannequin opened this issue Apr 3, 2017 · 14 comments
Closed

Severe open file leakage running asyncio SSL server #74156

kyuupichan mannequin opened this issue Apr 3, 2017 · 14 comments
Assignees
Labels
3.7 (EOL) end of life performance Performance or resource usage topic-asyncio

Comments

@kyuupichan
Copy link
Mannequin

kyuupichan mannequin commented Apr 3, 2017

BPO 29970
Nosy @giampaolo, @asvetlov, @mocmocamoc, @1st1, @fafhrd91, @kyuupichan
PRs
  • bpo-29970: Time out SSL handshake if not complete after 10 seconds #4825
  • bpo-29970: Make ssh_handshake_timeout None by default #4939
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/1st1'
    closed_at = <Date 2017-12-19.20:02:01.780>
    created_at = <Date 2017-04-03.14:10:15.014>
    labels = ['expert-asyncio', '3.7', 'performance']
    title = 'Severe open file leakage running asyncio SSL server'
    updated_at = <Date 2017-12-20.19:27:29.217>
    user = 'https://github.com/kyuupichan'

    bugs.python.org fields:

    activity = <Date 2017-12-20.19:27:29.217>
    actor = 'asvetlov'
    assignee = 'yselivanov'
    closed = True
    closed_date = <Date 2017-12-19.20:02:01.780>
    closer = 'asvetlov'
    components = ['asyncio']
    creation = <Date 2017-04-03.14:10:15.014>
    creator = 'kyuupichan'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 29970
    keywords = ['patch']
    message_count = 14.0
    messages = ['291071', '291135', '296294', '296297', '296298', '308084', '308146', '308673', '308676', '308765', '308769', '308784', '308785', '308786']
    nosy_count = 6.0
    nosy_names = ['giampaolo.rodola', 'asvetlov', 'mocmocamoc', 'yselivanov', 'fafhrd91', 'kyuupichan']
    pr_nums = ['4825', '4939']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'resource usage'
    url = 'https://bugs.python.org/issue29970'
    versions = ['Python 3.7']

    @kyuupichan
    Copy link
    Mannequin Author

    kyuupichan mannequin commented Apr 3, 2017

    Original report at old repo here: python/asyncio#483

    There this is reported fixed by #480

    I wish to report that whilst the above patch might have a small positive effect, it is far from solving the actual issue. Several users report eventual exhaustion of the open file resource running SSL asyncio servers.

    Here are graphs provided by a friend running my ElectrumX server software, first accepting SSL connections and the second accepting TCP connections only. Both of the servers were monkey-patched with the pull-480 fix above, so this is evidence it isn't solving the issue.

    http://imgur.com/a/cWnSu

    As you can see, the TCP server (which has far less connections; most users use SSL) has no leaked file handles, whereas the SSL server has over 300.

    This becomes an easy denial of service vector against asyncio servers. One way to trigger this (though I doubt it explains the numbers above) is simply to connect to the SSL server from telnet, and do nothing. asyncio doesn't time you out, the telnet session seems to sit there forever, and the open file resources are lost in the SSL handshake stage until the remote host kindly decides to disconnect.

    I suspect these resource issues all revolve around the SSL handshake process, certainly at the opening of a connection, but also perhaps when closing.

    As the application author I am not informed by asyncio of a potential connection until the initial handshake is complete, so I cannot do anything to close these phantom socket connections. I have to rely on asyncio to be properly handling DoS issues and it is not currently doing so robustly.

    @kyuupichan kyuupichan mannequin added topic-asyncio performance Performance or resource usage labels Apr 3, 2017
    @1st1
    Copy link
    Member

    1st1 commented Apr 4, 2017

    I'm assigning this to myself to make sure I don't forget about this. If someone wants to tackle this please feel free to reassign.

    @1st1 1st1 added the 3.7 (EOL) end of life label Apr 4, 2017
    @1st1 1st1 self-assigned this Apr 4, 2017
    @fafhrd91
    Copy link
    Contributor

    question is, should asyncio handle timeouts or leave it to caller?

    #480 fixes leak during handshake.

    @kyuupichan
    Copy link
    Mannequin Author

    kyuupichan mannequin commented Jun 18, 2017

    @nikolay Kim

    As I note in the original submission, 480 was tested and does NOT solve this issue. Thanks.

    @fafhrd91
    Copy link
    Contributor

    I see. this is server specific problem. as a temp solution I'd use proxy for ssl termination.

    @kyuupichan
    Copy link
    Mannequin Author

    kyuupichan mannequin commented Dec 12, 2017

    I'm not sure what you mean about this being a server-specific problem. It's clearly a bug in the asyncio SSL wrapper as using TCP instead of SSL with otherwise identical code doesn't leak open files.

    @mocmocamoc
    Copy link
    Mannequin

    mocmocamoc mannequin commented Dec 12, 2017

    I think there's been some confusion about what PR 480 was meant to fix - it helps in cases where connections are closed during handshake, but if a server connection is waiting for a handshake but never receives any data at all then it stays in that state forever.

    As for a fix, how about giving SSLProtocol a method like:

        def checkHandshakeDone(self):
            if self._in_handshake == True:
                self._abort()

    and then at the end of _start_handshake() adding:

        self._loop.call_later(10, self.checkHandshakeDone)

    Then if the handshake is not complete within ten seconds of starting, the connection will be aborted.

    @asvetlov
    Copy link
    Contributor

    New changeset f7686c1 by Andrew Svetlov (Neil Aspinall) in branch 'master':
    bpo-29970: Add timeout for SSL handshake in asyncio
    f7686c1

    @asvetlov
    Copy link
    Contributor

    Fixed in 3.7

    @asvetlov
    Copy link
    Contributor

    New changeset 51eb1c6 by Andrew Svetlov in branch 'master':
    bpo-29970: Make ssh_handshake_timeout None by default (bpo-4939)
    51eb1c6

    @1st1
    Copy link
    Member

    1st1 commented Dec 20, 2017

    Should we backport this to 3.6? This is a security issue.

    @asvetlov
    Copy link
    Contributor

    The fix introduces a new parameter in public API.

    That's why I think we shouldn't backport it.

    @1st1
    Copy link
    Member

    1st1 commented Dec 20, 2017

    The fix introduces a new parameter in public API.

    Maybe we can get away with this if we do not document it in 3.6 and add a comment to the source code that using this new parameter will make the code incompatible with earlier 3.6.x versions?

    @asvetlov
    Copy link
    Contributor

    Don't know.
    Ask other coredevs maybe?

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 (EOL) end of life performance Performance or resource usage topic-asyncio
    Projects
    None yet
    Development

    No branches or pull requests

    3 participants