Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CVE-2013-2099 ssl.match_hostname() trips over crafted wildcard names #62180

Closed
fweimer mannequin opened this issue May 15, 2013 · 35 comments
Closed

CVE-2013-2099 ssl.match_hostname() trips over crafted wildcard names #62180

fweimer mannequin opened this issue May 15, 2013 · 35 comments
Labels
stdlib Python modules in the Lib dir type-security A security issue

Comments

@fweimer
Copy link
Mannequin

fweimer mannequin commented May 15, 2013

BPO 17980
Nosy @malemburg, @tim-one, @birkenfeld, @gpshead, @pitrou, @vstinner, @tiran
Files
  • ssl_wildcard_dos.patch
  • ssl_wildcard_dos2.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2013-05-18.16:00:16.484>
    created_at = <Date 2013-05-15.10:25:06.692>
    labels = ['type-security', 'library']
    title = 'CVE-2013-2099 ssl.match_hostname() trips over crafted\twildcard names'
    updated_at = <Date 2013-05-18.16:00:16.483>
    user = 'https://bugs.python.org/fweimer'

    bugs.python.org fields:

    activity = <Date 2013-05-18.16:00:16.483>
    actor = 'pitrou'
    assignee = 'none'
    closed = True
    closed_date = <Date 2013-05-18.16:00:16.484>
    closer = 'pitrou'
    components = ['Library (Lib)']
    creation = <Date 2013-05-15.10:25:06.692>
    creator = 'fweimer'
    dependencies = []
    files = ['30288', '30292']
    hgrepos = []
    issue_num = 17980
    keywords = ['patch']
    message_count = 35.0
    messages = ['189280', '189291', '189348', '189353', '189354', '189357', '189361', '189366', '189368', '189369', '189373', '189380', '189391', '189396', '189398', '189399', '189402', '189407', '189419', '189430', '189432', '189433', '189434', '189436', '189437', '189438', '189439', '189444', '189451', '189452', '189453', '189455', '189517', '189525', '189526']
    nosy_count = 14.0
    nosy_names = ['lemburg', 'tim.peters', 'georg.brandl', 'gregory.p.smith', 'pitrou', 'vstinner', 'christian.heimes', 'timehorse', 'Arfrever', 'iankko', 'python-dev', 'bkabrda', 'fweimer', 'mpessas']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'security'
    url = 'https://bugs.python.org/issue17980'
    versions = ['Python 3.2', 'Python 3.3', 'Python 3.4']

    @fweimer
    Copy link
    Mannequin Author

    fweimer mannequin commented May 15, 2013

    If the name in the certificate contains many "*" characters, matching the compiled regular expression against the host name can take a very long time. Certificate validation happens before host name checking, so I think this is a minor issue only because it can only be triggered in cooperation with a CA (which seems unlikely).

    The fix is to limit the number of "*" wildcards to a reasonable maximum (perhaps even 1).

    @fweimer fweimer mannequin added the stdlib Python modules in the Lib dir label May 15, 2013
    @pitrou
    Copy link
    Member

    pitrou commented May 15, 2013

    Does the RFC say anything about this? How much wildcards are necessary to take up a significant amount of CPU time?

    @iankko
    Copy link
    Mannequin

    iankko mannequin commented May 16, 2013

    The CVE identifier of CVE-2013-2099 has been assigned:
    http://www.openwall.com/lists/oss-security/2013/05/16/6

    to this issue.

    @iankko iankko mannequin changed the title ssl.match_hostname() trips over crafted wildcard names CVE-2013-2099 ssl.match_hostname() trips over crafted wildcard names May 16, 2013
    @pitrou pitrou added the type-security A security issue label May 16, 2013
    @pitrou
    Copy link
    Member

    pitrou commented May 16, 2013

    This is caused by the regex engine's performance behaviour:
    http://bugs.python.org/issue1662581
    http://bugs.python.org/issue1515829
    http://bugs.python.org/issue212521

    @pitrou
    Copy link
    Member

    pitrou commented May 16, 2013

    I would like to know what is the expected scenario:

    • does the attacker only control the certificate?
    • or does the attacker control both the certificate and the hostname being validated?

    The reason is that the matching cost for a domain name fragment seems to be O(n**k), where n is the fragment length and k is the number of wildcards. Therefore, if the attacker controls both n and k, even limiting k to 2 already allows a quadratic complexity attack.

    @tiran
    Copy link
    Member

    tiran commented May 16, 2013

    RFC 2818 doesn't say anything about the maximum amount of wildcards. I'm going to check OpenSSL's implementation now.

    @fweimer
    Copy link
    Mannequin Author

    fweimer mannequin commented May 16, 2013

    OpenSSL supports only a single wildcard character.

    In my tests, I used a host name like aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa.example.org, and a dNSName like a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*.example.org. Quadratic behavior wouldn't be too bad because the host name is necessarily rather short (more than 255 characters will not pass through DNS).

    @pitrou
    Copy link
    Member

    pitrou commented May 16, 2013

    Indeed, two wildcards seem to be ok with a 255-character domain name:

    $ ./python -m timeit -s "import ssl; cert = {'subject': ((('commonName', '*a*a.com'),),)}" "try: ssl.match_hostname(cert, 'a' * 250 +'z.com')" "except ssl.CertificateError: pass"
    1000 loops, best of 3: 797 usec per loop

    Three wildcards already start producing some load:

    $ ./python -m timeit -s "import ssl; cert = {'subject': ((('commonName', '*a*a*a.com'),),)}" "try: ssl.match_hostname(cert, 'a' * 250 +'z.com')" "except ssl.CertificateError: pass"
    10 loops, best of 3: 66.2 msec per loop

    Four wildcards are more than enough for a DoS:

    $ ./python -m timeit -s "import ssl; cert = {'subject': ((('commonName', '*a*a*a*a.com'),),)}" "try: ssl.match_hostname(cert, 'a' * 250 +'z.com')" "except ssl.CertificateError: pass"
    10 loops, best of 3: 4.12 sec per loop

    @pitrou
    Copy link
    Member

    pitrou commented May 16, 2013

    In my tests, I used a host name like
    aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa.example.org, and a dNSName
    like a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*.example.org.
    Quadratic behavior wouldn't be too bad because the host name is
    necessarily rather short (more than 255 characters will not pass
    through DNS).

    Hmm, but the host name doesn't necessarily come from DNS, does it?

    @pitrou pitrou changed the title CVE-2013-2099 ssl.match_hostname() trips over crafted wildcard names CVE-2013-2099 ssl.match_hostname() trips over crafted wildcard names May 16, 2013
    @fweimer
    Copy link
    Mannequin Author

    fweimer mannequin commented May 16, 2013

    The host name is looked up to get the IP address to connect to. The lookup will fail if the host name is longer than 255 characters, and the crafted certificate is never retrieved.

    @tiran
    Copy link
    Member

    tiran commented May 16, 2013

    I think a malicious user could abuse SNI to craft a longer host name and trigger the pathological case.

    @pitrou
    Copy link
    Member

    pitrou commented May 16, 2013

    In GnuTLS, _gnutls_hostname_compare() (lib/gnutls_str.c) uses a trivial recursive approach with a maximum number of 5 wildcards.

    @tim-one
    Copy link
    Member

    tim-one commented May 16, 2013

    Wildcard matching can easily be done in worst-case linear time, but not with regexps. doctest.py's internal _ellipsis_match() shows one way to do it (doctest can use "..." as a wildcard marker).

    @tiran
    Copy link
    Member

    tiran commented May 16, 2013

    We could use an algorithm that doesn't need regexp for most cases.

    pseudo code:

    value = value.lower()
    hostname = hostname.lower()
    
    if '*' not in value:
       return value == hostname
    
    vparts = valuesplit(".")
    hparts = hostname.split(".")
    if len(vparts) != len(hparts):
        # * doesn't match a dot
        return False
    
    for v, h in zip(vparts, hparts):
        if v == "*":
            # match any host part
            continue
        asterisk = v.count("*")
        if asterisk == 0:
            if v != h:
                return False
        elif asterisk == 1:
            # match with simple re
        else:
            # don't support more than one * in a FQDN part
            raise TooManyAsterisk

    @pitrou
    Copy link
    Member

    pitrou commented May 16, 2013

    Wildcard matching can easily be done in worst-case linear time, but
    not with regexps. doctest.py's internal _ellipsis_match() shows one
    way to do it (doctest can use "..." as a wildcard marker).

    Thanks, this may be a nice enhancement for 3.4.

    For 3.2 and 3.3, I'd prefer to go the safe way of simply limiting the
    number of wildcards. If OpenSSL only accepts one per fragment, accepting
    one or two is certainly fine for Python as well :-)

    @pitrou
    Copy link
    Member

    pitrou commented May 16, 2013

    Here is a patch allowing at most 2 wildcards per domain fragment. Georg, do you think this should go into 3.2?

    @birkenfeld
    Copy link
    Member

    It's certainly a security fix, but probably not one that warrants an immediate release.

    If you commit it to the 3.2 branch, that's fine, it will get picked up by coming releases.

    @gpshead
    Copy link
    Member

    gpshead commented May 16, 2013

    Indeed, doing this _without a regexp_ is preferred. :)

    @tiran
    Copy link
    Member

    tiran commented May 17, 2013

    Are multiple wildcards per fragment even specified? I'm unable to find information if the wildcard is supposed to be a greedy or a non-greedy match.

    By the way Chromium does more fancy checks. For example it requires * to match at least on character and it does special handling of IDN. X509Certificate::VerifyHostname() around line 500.

    http://src.chromium.org/viewvc/chrome/trunk/src/net/cert/x509_certificate.cc

    @vstinner
    Copy link
    Member

    Are multiple wildcards per fragment even specified?

    I don't know the standard, but it sounds strange to have more than one wildcard per part of an URL. "*.*.*.google.com" looks valid to me, whereas "*a*a*a*.google.com" looks very suspicious.

    Said differently, I expect:

    assert max(part.count("*") for part in url.split(".")) <= 1

    "*" pattern is replace with '[^.]+' regex, so I may not cause the exponential complexity issue. (I didn't check.)

    @fweimer
    Copy link
    Mannequin Author

    fweimer mannequin commented May 17, 2013

    "*" pattern is replace with '[^.]+' regex, so I may not cause the exponential complexity issue. (I didn't check.)

    A possessive quantifier might also help, that is [^.]+?.

    @malemburg
    Copy link
    Member

    SSL certificate hostname matching is defined in RFC 2818:

    It's not very verbose on how exactly matching should be done:

    """
    Names may contain the wildcard
    character * which is considered to match any single domain name
    component or component fragment. E.g., *.a.com matches foo.a.com but
    not bar.foo.a.com. f*.com matches foo.com but not bar.com.
    """

    Given that it's underspecified, I doubt that anyone using wildcards in certificates for valid purposes would risk using anything but very simply prefix/suffix matching - most certainly not any matching that would require backtracking to succeed.

    There are several variants out there of how the matching is done.
    See e.g. http://search-hadoop.com/c/Hadoop:hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/ssl/SSLHostnameVerifier.java||dns

    @pitrou
    Copy link
    Member

    pitrou commented May 17, 2013

    Non-greedy matching actually makes things worse :-)

    $ ./python -m timeit -s "import re; pat = re.compile('\A*a*a*a\Z'.replace('*', '[^.]+'), re.IGNORECASE)" "pat.match('a' * 100 +'z')"
    100 loops, best of 3: 3.31 msec per loop
    
    $ ./python -m timeit -s "import re; pat = re.compile('\A*a*a*a\Z'.replace('*', '[^.]+?'), re.IGNORECASE)" "pat.match('a' * 100 +'z')"
    100 loops, best of 3: 6.91 msec per loop

    @pitrou
    Copy link
    Member

    pitrou commented May 17, 2013

    Florian, I'm actually surprised by your assertion that OpenSSL supports a single wildcard character. Last I looked, I couldn't find any hostname matching function in OpenSSL (which is why I had to write our own). Could you point me to the relevant piece of code?

    @fweimer
    Copy link
    Mannequin Author

    fweimer mannequin commented May 17, 2013

    Antoine, support for OpenSSL host name matching is quite new: <http://www.openssl.org/docs/crypto/X509_check_host.html\>

    @pitrou
    Copy link
    Member

    pitrou commented May 17, 2013

    libcurl supports a single wildcard for the whole domain name pattern (not even one per fragment), as per lib/hostcheck.c

    (this is when linked against OpenSSL; when linked against GnuTLS, curl will use the GnuTLS-provided matching function)

    Based on all the evidence, I think allowing one wildcard per fragment is sufficient.

    @pitrou
    Copy link
    Member

    pitrou commented May 17, 2013

    Antoine, support for OpenSSL host name matching is quite new

    Ah, thanks. I was looking in 1.0.1e.

    @malemburg
    Copy link
    Member

    Here's another long discussions about SSL hostname matching that may provide some useful insights:

    Note how RFC 2595 doesn't even allow sub-string matching. It only allows '*' to be used as component.

    @pitrou
    Copy link
    Member

    pitrou commented May 17, 2013

    Attached patch forbidding more than one wildcard per fragment.

    @tiran
    Copy link
    Member

    tiran commented May 17, 2013

    I still think that sub string wildcard should not match the IDN "xn--" prefix. With current code the rules "x*.example.de" gives a positive match for "götter.example.de".

    >>> u"götter.example.de".encode("idna")
    'xn--gtter-jua.example.de'

    @pitrou
    Copy link
    Member

    pitrou commented May 17, 2013

    I still think that sub string wildcard should not match the IDN
    "xn--" prefix. With current code the rules "x*.example.de" gives a
    positive match for "götter.example.de".

    You should open a separate issue for this (possibly with a patch).

    @tiran
    Copy link
    Member

    tiran commented May 17, 2013

    bpo-17997

    @tiran
    Copy link
    Member

    tiran commented May 18, 2013

    The IDNA RFC contains additional rules for wildcard matching ... very well hidden indead!

    http://tools.ietf.org/html/rfc6125#section-6.4.3

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented May 18, 2013

    New changeset b9b521efeba3 by Antoine Pitrou in branch '3.2':
    Issue bpo-17980: Fix possible abuse of ssl.match_hostname() for denial of service using certificates with many wildcards (CVE-2013-2099).
    http://hg.python.org/cpython/rev/b9b521efeba3

    New changeset c627638753e2 by Antoine Pitrou in branch '3.3':
    Issue bpo-17980: Fix possible abuse of ssl.match_hostname() for denial of service using certificates with many wildcards (CVE-2013-2099).
    http://hg.python.org/cpython/rev/c627638753e2

    New changeset fafd33db6ff6 by Antoine Pitrou in branch 'default':
    Issue bpo-17980: Fix possible abuse of ssl.match_hostname() for denial of service using certificates with many wildcards (CVE-2013-2099).
    http://hg.python.org/cpython/rev/fafd33db6ff6

    @pitrou
    Copy link
    Member

    pitrou commented May 18, 2013

    Ok, this should be fixed now. Thanks a lot for reporting!

    @pitrou pitrou closed this as completed May 18, 2013
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    stdlib Python modules in the Lib dir type-security A security issue
    Projects
    None yet
    Development

    No branches or pull requests

    7 participants