ssl.match_hostname(): sub string wildcard should not match IDNA prefix #62197
Python's ssl.match_hostname() does substring matching as specified in RFC 2818: Names may contain the wildcard The RFC doesn't specify how internationalized domain names should be handled because it predates RFC 5890 for IDNA by many years. IDNA labels are prefixed with "xn--", e.g. u"götter.example.de".encode("idna") == Chrome has special handling for the IDN prefix in X509Certificate::VerifyHostname() Also see bpo-17980 |
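To make the report concrete, here is a minimal sketch of the problem, assuming a simplified RFC 2818-style matcher in which `*` matches any run of non-dot characters. The helper `naive_wildcard_match` is hypothetical and written only for illustration; it is not the stdlib implementation.

```python
import re


def naive_wildcard_match(pattern: str, hostname: str) -> bool:
    """Hypothetical RFC 2818-style matcher: '*' matches any run of non-dot characters."""
    parts = [re.escape(label).replace(r"\*", "[^.]*") for label in pattern.split(".")]
    regex = r"\A" + r"\.".join(parts) + r"\Z"
    return re.match(regex, hostname, re.IGNORECASE) is not None


# "g\u00f6tter" is "götter"; the idna codec turns it into an ASCII A-label.
a_label_host = "g\u00f6tter.example.de".encode("idna").decode("ascii")

# The substring wildcard "x*" matches the "xn--" IDNA prefix of the A-label,
# although the human-readable name does not start with "x".
print(a_label_host)
print(naive_wildcard_match("x*.example.de", a_label_host))
```

Under these assumptions, a certificate for `x*.example.de` would be accepted for a host whose displayed name is `götter.example.de`, which is the behavior this issue objects to.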
Actually, I don't think this is a bug: match_hostname() expects str data, and therefore IDNA-decoded domain names: >>> b"xn--gtter-jua.example.de".decode("idna")
'götter.example.de' Doing the matching on the decoded domain name should be safe. |
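The argument in this comment can be illustrated with a short sketch, assuming the caller decodes the A-label before any matching takes place. The variable names are illustrative only.

```python
# Decode the ASCII A-label form into its Unicode (U-label) form.
a_label_host = "xn--gtter-jua.example.de"
u_label_host = a_label_host.encode("ascii").decode("idna")

# After decoding, the leftmost label no longer carries the "xn--" prefix,
# so a substring wildcard such as "x*" has nothing to latch onto.
leftmost = u_label_host.split(".")[0]
print(u_label_host)
print(leftmost.startswith("x"))
```

This only shows why matching on decoded names sidesteps the prefix problem; whether callers actually pass decoded names is the policy question raised in the following comments.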
It's called "internationalized domain name for APPLICATIONS". ;) It's up to the application to interpret the ASCII text as IDNA encoded FQDNs. As far as I know DNS, SSL's CNAME and OS interfaces etc. always use ASCII labels. It's an elegant solution. Just the UI part of an application needs to understand IDNA. http://tools.ietf.org/html/rfc6125#section-6.4.2 If the DNS domain name portion of a reference identifier is an Coincidentally the same RFC contains matching rules for wild card certs If a client matches the reference identifier against a presented
|
The socket module already decodes to/encodes from IDNA in places (e.g. gethostname()). We need a consistent policy in the stdlib; I would like Martin's advice on this. |
I finally found the correct RFC for wildcard matching. I think our implementation violates some recommendations. http://tools.ietf.org/html/rfc6125#section-6.4.2 |
As a policy, the standard library should accept non-ASCII host names ("U-labels") wherever possible. I.e., the hostname parameter of match_hostname should allow for U-labels (as well as A-labels). When returning names, it should always return the data "as-is", which typically means A-labels. Anybody wanting to display U-labels will need to decode them explicitly. I believe that the matching of IDNA names doesn't currently happen according to 6.4.2 of RFC 6125; however, this is not actually the issue that Christian reported (which was only about wildcard matching). I suggest creating a separate issue for that. As for 6.4.3: I find the text to be quite ill-formulated. Specifically, I'm referring to the sentence
First, in the context of X.509, a wildcard *cannot* be embedded "with an ... U-label"; the certificate can only possibly contain A-labels (because the datatype of dNSName is IA5String). Second, as written, it *does* allow matching 'götter.example.de' against "x*.example.de", since "x*.example.de" is not an A-label. An A-label is defined as
Since an A-label is required to conform to the LDH label syntax, it cannot possibly contain the asterisk (LDH labels can only contain letters, digits, and the hyphen). Hence, the entire requirement is irrelevant (as literally written). They might mean something else, but I cannot guess what it is that they mean. I disagree with the classification of this issue as critical. It does not involve a crash, a serious regression, or a breakage of a very important API. |
Ryan Sleevi of the Google Chrome Security Team has informed us about another issue that is caused by our failure to implement RFC 6125 wildcard matching rules. RFC 6125 allows only one wildcard in the left-most fragment of a hostname. For security reasons, matching rules like *.*.com should not be supported. For wildcards in internationalized domain names I have followed the piece of advice "In the face of ambiguity, refuse the temptation to guess.". A substring wildcard no longer matches an IDN A-label fragment. '*' still matches a full punycode fragment but 'x*' no longer matches 'xn--foo'. I copied the idea from Chrome's matching code: http://src.chromium.org/viewvc/chrome/trunk/src/net/cert/x509_certificate.cc?revision=212341#l640
The relevant RFC section for the patch is http://tools.ietf.org/html/rfc6125#section-6.4.3 |
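The rules described in this comment can be sketched roughly as follows. This is a hypothetical illustration, not the attached patch: the function name `wildcard_match_sketch` and the choice to return False (rather than raise an error) for unsupported patterns are assumptions made for the example.

```python
def wildcard_match_sketch(pattern: str, hostname: str) -> bool:
    """Hypothetical matcher following the rules described in the comment above."""
    pattern_labels = pattern.lower().split(".")
    host_labels = hostname.lower().split(".")
    leftmost, rest = pattern_labels[0], pattern_labels[1:]

    # At most one wildcard, and only in the left-most label ("*.*.com" is refused).
    if leftmost.count("*") > 1 or any("*" in label for label in rest):
        return False
    # A wildcard never spans dots, so the label counts must agree,
    # and all labels after the first must match literally.
    if len(pattern_labels) != len(host_labels) or rest != host_labels[1:]:
        return False

    host_left = host_labels[0]
    if "*" not in leftmost:
        return leftmost == host_left
    if leftmost == "*":
        # A bare "*" still matches a whole label, including a punycode one.
        return bool(host_left)
    # Substring wildcard: refuse to guess when an IDN A-label is involved.
    if leftmost.startswith("xn--") or host_left.startswith("xn--"):
        return False
    prefix, suffix = leftmost.split("*")
    return (
        len(host_left) >= len(prefix) + len(suffix)
        and host_left.startswith(prefix)
        and host_left.endswith(suffix)
    )


print(wildcard_match_sketch("*.example.de", "xn--gtter-jua.example.de"))
print(wildcard_match_sketch("x*.example.de", "xn--gtter-jua.example.de"))
print(wildcard_match_sketch("*.*.com", "a.b.com"))
```

In this sketch the first call succeeds because `*` covers the full punycode label, while the second and third are rejected, mirroring the "'x*' no longer matches 'xn--foo'" and `*.*.com` cases mentioned above.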
Affected versions:
|
So, is this a security issue? I've been wondering if I should apply the attached patch to the backports-ssl_match_hostname module on pypi. I was hoping there'd be some information here as to whether this will be going into the stdlib in the future. Thus far, ssl_match_hostname has just been a backport of the match_hostname function but if this is a security problem, I could press for us to diverge from the python3 stdlib. It would be easier to make the case if this is seen as a critical problem that will need to be fixed even if the current patch might not be the eventual fix. |
Yes, it's a security issue. But the patch would change the behavior of the function. The current function conforms to RFC 2818. The patch implements RFC 6125, which is more restrictive. |
New changeset 10d0edadbcdd by Georg Brandl in branch '3.3': |
Also merged to default. |
Python 3.2 hasn't been fixed yet. Should we acquire a CVE for the issue? |
Just to clarify the status of this issue: it *only* blocks 3.2. |
For future reference, how do I find out if this has been applied to 3.2? |
Since it's been out in 3.2.x for so long, I won't apply this for 3.2 since at this point a behavior change might do more harm than good. |