This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: New SSL module doesn't seem to verify hostname against commonName in certificate
Type: security Stage: resolved
Components: Library (Lib) Versions: Python 3.2
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: Ryan.Tucker, Thomas.Leonard, ahasenack, asdfasdfasdfasdfasdfasdfasdf, debatem1, devin, eric.araujo, giampaolo.rodola, heikki, janssen, jcea, jsamuel, kiilerix, orsenthil, pitrou, techtonik, vila, zooko
Priority: normal Keywords: patch

Created on 2007-12-11 15:41 by ahasenack, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
verisign-inc-class-3-public-primary.pem ahasenack, 2007-12-11 15:41
sslcheck.patch pitrou, 2010-10-04 16:37
sslcheck2.patch pitrou, 2010-10-06 20:32
Messages (47)
msg58434 - (view) Author: Andreas Hasenack (ahasenack) Date: 2007-12-11 15:41
(I hope I used the correct component for this report)

http://pypi.python.org/pypi/ssl/

I used the client example shown at
http://docs.python.org/dev/library/ssl.html#client-side-operation to
connect to a bank site called www.realsecureweb.com.br at
200.208.16.101. Its certificate signed by verisign. My OpenSSL has this
CA at /etc/pki/tls/rootcerts/verisign-inc-class-3-public-primary.pem.
The verification works.

If I make up a hostname called something else, like "wwws", and place it
in /etc/hosts pointing to that IP address, the SSL connection should not
be established because that name doesn't match the common name field in
the server certificate. But the SSL module happily connects to it
(excerpt below):

cert = verisign-inc-class-3-public-primary.pem
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
ssl_sock = ssl.wrap_socket(s,
           ca_certs="/etc/pki/tls/rootcerts/%s" % cert,
           cert_reqs=ssl.CERT_REQUIRED)
ssl_sock.connect(('wwws', 443))
print repr(ssl_sock.getpeername())

output:
('200.208.16.101', 443)
('RC4-MD5', 'TLSv1/SSLv3', 128)
{'notAfter': 'Sep 10 23:59:59 2008 GMT',
 'subject': ((('countryName', u'BR'),),
             (('stateOrProvinceName', u'Sao Paulo'),),
             (('localityName', u'Sao Paulo'),),
             (('organizationName', u'Banco ABN AMRO Real SA'),),
             (('organizationalUnitName', u'TI Internet PF e PJ'),),
             (('commonName', u'www.realsecureweb.com.br'),))}

If I now open, say, a firefox window and point it to "https://wwws", it
gives me the expected warning that the hostname doesn't match the
certificate.

I'll attach the verisign CA certificate to make it easier to reproduce
the error.
msg58444 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2007-12-11 17:41
Bill, can you respond?
msg58472 - (view) Author: Bill Janssen (janssen) * (Python committer) Date: 2007-12-11 22:40
Unfortunately, hostname matching is one of those ideas that seemed
better when it was thought up than it actually proved to be in practice.
 I've had extensive experience with this, and have found it to almost
always an application-specific decision.  I thought about this when
designing the client-side verification, and couldn't see any automatic
solution that doesn't get in the way.  So the right way to do this with
Python is to write some application code that looks at the verified
identity and makes decisions based on whatever authentication algorithm
you need.
msg58491 - (view) Author: Andreas Hasenack (ahasenack) Date: 2007-12-12 12:48
At the least it should be made clear in the documentation that the
hostname is not checked against the commonName nor the subjectAltName
fields of the server certificate. And add some sample code to the
documentation for doing a simple check. Something like this, to illustrate:

def get_subjectAltName(cert):
        if not cert.has_key('subjectAltName'):
                return []
        ret = []
        for rdn in cert['subjectAltName']:
                if rdn[0].lower() == 'dns' or rdn[0][:2].lower() == 'ip':
                        ret.append(rdn[1])
        return ret

def get_commonName(cert):
        if not cert.has_key('subject'):
                return []
        ret = []
        for rdn in cert['subject']:
                if rdn[0][0].lower() == 'commonname':
                        ret.append(rdn[0][1])
        return ret


def verify_hostname(cert, host):
        cn = get_commonName(cert)
        san = get_subjectAltName(cert)
        return (host in cn) or (host in san)
msg58508 - (view) Author: Bill Janssen (janssen) * (Python committer) Date: 2007-12-12 20:36
Yes, I think that's reasonable.  And for pseudo-standards like https, which
calls for this, the implementation in the standard library should attempt to
do it automatically.  Unfortunately, that means that client-side certificate
verification has to be done (it's pointless to look at the data in
unverified certificates), and that means that the client software has to
have an appropriate collection of root certificates to verify against.  I
think there's an argument for adding a registry of root certificates to the
SSL module, just a module-level variable that the application can bind to a
filename of a file containing their collection of certificates.  If it's
non-None, the https code would use it to verify the certificate, then use
the commonName in the subject field to check against the hostname in the
URL.  If it's None, the check would be skipped.

Bill

On Dec 12, 2007 4:48 AM, Andreas Hasenack <report@bugs.python.org> wrote:

>
> Andreas Hasenack added the comment:
>
> At the least it should be made clear in the documentation that the
> hostname is not checked against the commonName nor the subjectAltName
> fields of the server certificate. And add some sample code to the
> documentation for doing a simple check. Something like this, to
> illustrate:
>
> def get_subjectAltName(cert):
>        if not cert.has_key('subjectAltName'):
>                return []
>        ret = []
>        for rdn in cert['subjectAltName']:
>                if rdn[0].lower() == 'dns' or rdn[0][:2].lower() == 'ip':
>                        ret.append(rdn[1])
>        return ret
>
> def get_commonName(cert):
>        if not cert.has_key('subject'):
>                return []
>        ret = []
>        for rdn in cert['subject']:
>                if rdn[0][0].lower() == 'commonname':
>                        ret.append(rdn[0][1])
>        return ret
>
>
> def verify_hostname(cert, host):
>        cn = get_commonName(cert)
>        san = get_subjectAltName(cert)
>        return (host in cn) or (host in san)
>
> __________________________________
> Tracker <report@bugs.python.org>
> <http://bugs.python.org/issue1589>
> __________________________________
>
msg58535 - (view) Author: Andreas Hasenack (ahasenack) Date: 2007-12-13 15:14
> do it automatically.  Unfortunately, that means that client-side
certificate
> verification has to be done (it's pointless to look at the data in
> unverified certificates), and that means that the client software has to
> have an appropriate collection of root certificates to verify against.  I

But the current API already has this feature:
ssl_sock = ssl.wrap_socket(s, ca_certs="/etc/pki/tls/rootcerts/%s" % cert,
                      cert_reqs=ssl.CERT_REQUIRED)

So this is already taken care of with ca_certs and cert_reqs, right?
msg58547 - (view) Author: Bill Janssen (janssen) * (Python committer) Date: 2007-12-13 18:10
The mechanism is there for direct use of the SSL module, yes.  But the
question is, what should indirect usage, like the httplib or urllib modules,
do?  If they are going to check hostnames on use of an https: URL, they need
some way to pass a ca_certs file through to the SSL code they use.

Bill

On Dec 13, 2007 7:14 AM, Andreas Hasenack <report@bugs.python.org> wrote:

>
> Andreas Hasenack added the comment:
>
> > do it automatically.  Unfortunately, that means that client-side
> certificate
> > verification has to be done (it's pointless to look at the data in
> > unverified certificates), and that means that the client software has to
> > have an appropriate collection of root certificates to verify against.
>  I
>
> But the current API already has this feature:
> ssl_sock = ssl.wrap_socket(s, ca_certs="/etc/pki/tls/rootcerts/%s" % cert,
>                      cert_reqs=ssl.CERT_REQUIRED)
>
> So this is already taken care of with ca_certs and cert_reqs, right?
>
> __________________________________
> Tracker <report@bugs.python.org>
> <http://bugs.python.org/issue1589>
> __________________________________
>
msg71405 - (view) Author: Heikki Toivonen (heikki) Date: 2008-08-19 03:21
I would definitely recommend providing as strict as possible hostname
verification in the stdlib, but provide application developers a way to
override that.

M2Crypto (and TLS Lite, from which I copied the approach to M2Crypto),
provide a default post connection checker. See
http://svn.osafoundation.org/m2crypto/trunk/M2Crypto/SSL/Connection.py
and the set_post_connection_check_callback() as well as
http://svn.osafoundation.org/m2crypto/trunk/M2Crypto/SSL/Checker.py.
msg71443 - (view) Author: Bill Janssen (janssen) * (Python committer) Date: 2008-08-19 16:45
Nope.  Hostname verification was never a good idea -- the "hostname" is
just a vague notion, at best -- lots of hostnames can map to one or more
IP addresses of the server.  It's exposed to the application code, so if
a client application wants to do it, it can.  But I recommend against
it.  It's a complication that doesn't belong in the basic support, that
is, the SSL module.  I'll add a note to this effect in the documentation.
msg71586 - (view) Author: Heikki Toivonen (heikki) Date: 2008-08-20 22:52
I would think most people/applications want to know to which host they
are talking to. The reason I am advocating adding a default check to the
stdlib is because this is IMO important for security, and it is easy to
get it wrong (I don't think I have it 100% correct in M2Crypto either,
although I believe it errs on the side of caution). I believe it would
be a disservice to ship something that effectively teaches developers to
ignore security (like the old socket.ssl does).

A TLS extension also allows SSL vhosts, so static IPs are no longer
strictly necessary (this is not universally supported yet, though).
msg71682 - (view) Author: Bill Janssen (janssen) * (Python committer) Date: 2008-08-21 21:12
checking hostnames is false security, not real security.

On 8/20/08, Heikki Toivonen <report@bugs.python.org> wrote:
>
>  Heikki Toivonen <hjtoi-bugzilla@comcast.net> added the comment:
>
>
> I would think most people/applications want to know to which host they
>  are talking to. The reason I am advocating adding a default check to the
>  stdlib is because this is IMO important for security, and it is easy to
>  get it wrong (I don't think I have it 100% correct in M2Crypto either,
>  although I believe it errs on the side of caution). I believe it would
>  be a disservice to ship something that effectively teaches developers to
>  ignore security (like the old socket.ssl does).
>
>  A TLS extension also allows SSL vhosts, so static IPs are no longer
>  strictly necessary (this is not universally supported yet, though).
>
>
>  _______________________________________
>  Python tracker <report@bugs.python.org>
>  <http://bugs.python.org/issue1589>
>  _______________________________________
>
msg72574 - (view) Author: Heikki Toivonen (heikki) Date: 2008-09-05 07:11
Could you clarify your comment regarding hostname check being false
security?

Just about all SSL texts I have read say you must do that, and that is
what your web browser and email client does to ensure it is talking to
the right host, for example. Without that check you are subject to a man
in the middle attack. Or is there some other check you perform that is
better?
msg72935 - (view) Author: Bill Janssen (janssen) * (Python committer) Date: 2008-09-10 02:21
Sorry to be so brief there -- I was off on vacation.

Verifying hostnames is a prescription that someone (well, OK, Eric
Rescorla, who knows what he's talking about) put in the https IETF RFC
(which, by the way, is only an informational RFC, not standards-track).
 It's a good idea if you're a customer trying to talk to Wells-Fargo,
say, over an https connection, but isn't suitable for all https traffic.
 I support putting it in the httplib Https class by default, but there
should be a way to override it, as there is with the Java APIs for https
connections.  (Take a look at javax.net.ssl.HostnameVerifier; one of the
more popular Java classes at PARC is a version of this that verifies any
hostname).

So what's wrong with it?  There are two problems.  The first is that
certificates for services are all about the hostname, and that's just
wrong.  You should verify the specific service, not just the hostname. 
    So a client that really cares about what they are talking to should
have the certificate for that service, and verify that it is the service
it's talking to, and ignore the hostname in the URL.

But the larger problem is that hostnames are a DNS construct for humans,
and not really well supported on computers, or by the services that run
on those computers.  Most computers have only the haziest notion of what
their hostname is, and many have lots of different hostnames (my laptop
has at least five hostnames that I know of, all meaning the same
computer, but with five different PARC IP addresses).  So the services
running on that computer aren't real clear about their hostnames,
either.  If I run a service on that computer that I secure with SSL, so
that packets going over my WiFi are encrypted, which hostname should
that service declare itself to be in the certificate?  And the services
on that computer keep running, even when it switches its IP address (and
thus its set of hostnames).  So doing hostname matching provokes lots of
false negatives, especially when it's not needed.  I think it by and
large isn't a good idea, though I support having it (in an overrideable
form) for the client-side https class in httplib.

This is all exacerbated by the fact that HTTP isn't what it was when
Eric wrote that RFC eight years ago.  The growth of Web 2.0 and
"RESTful" services means that lots of new things are using https in a
much less formal way, more to get encrypted packets than to verify
endpoints.  So false negatives caused by mindless hostname verification
cause real damage.
msg73006 - (view) Author: Heikki Toivonen (heikki) Date: 2008-09-11 06:24
Ok, thank you for clarifications. Now I understand why the hostname
checking isn't the solution that fits every problem. I am still not
completely clear how you'd do the checking otherwise, for example to
verify the service you are talking to is what you think it is.

But still, I think dealing with email servers is another common use case
where hostname check is adequate most of the time. I am sure there are
other cases like this. Therefore I am still of the opinion that the
default should be to do the hostname check. Yes, make it overridable,
but doing the check is safer than not doing any checking IMO because
even if the check is incorrect for a certain purpose the developer is
likely to notice an error quickly and inclined to do some other security
check instead of not doing anything and thinking they have a secure system.

If you want to continue the discussion, we should maybe take this to
some other forum, like comp.lang.python.
msg73036 - (view) Author: Bill Janssen (janssen) * (Python committer) Date: 2008-09-11 16:04
I think that, where it's appropriate, you can do that.  Just don't put it in
the SSL module.

Bill

On Wed, Sep 10, 2008 at 11:24 PM, Heikki Toivonen <report@bugs.python.org>wrote:

>
> Heikki Toivonen <hjtoi-bugzilla@comcast.net> added the comment:
>
> Ok, thank you for clarifications. Now I understand why the hostname
> checking isn't the solution that fits every problem. I am still not
> completely clear how you'd do the checking otherwise, for example to
> verify the service you are talking to is what you think it is.
>
> But still, I think dealing with email servers is another common use case
> where hostname check is adequate most of the time. I am sure there are
> other cases like this. Therefore I am still of the opinion that the
> default should be to do the hostname check. Yes, make it overridable,
> but doing the check is safer than not doing any checking IMO because
> even if the check is incorrect for a certain purpose the developer is
> likely to notice an error quickly and inclined to do some other security
> check instead of not doing anything and thinking they have a secure system.
>
> If you want to continue the discussion, we should maybe take this to
> some other forum, like comp.lang.python.
>
> _______________________________________
> Python tracker <report@bugs.python.org>
> <http://bugs.python.org/issue1589>
> _______________________________________
>
msg107895 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-06-15 21:17
Reopening. I think it would be nice to provide the appropriate convenience function(s) as part of the ssl module, even if the user has to call them explicitly.
msg117606 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2010-09-29 11:52
Removed this message by mistake.

Author	ahasenack
Date	2007-12-11.21:11:53

Ups, typo in the script:
cert = "verisign-inc-class-3-public-primary.pem"
msg117613 - (view) Author: david (asdfasdfasdfasdfasdfasdfasdf) Date: 2010-09-29 14:39
Welcome to 2010.
SSL shouldn't be difficult to use anymore or support in python applications. But yet, until the changes in http://bugs.python.org/issue9983 was fixed python devs were using modules without any warning of the security implications. pycurl works ... but a *LOT* of coders are not using pycurl. 

Today they are still getting it wrong and are still vulnerable to mitm attacks against https on the client side.

I have an example in fairly large open source project:
bzr --> (by default due to a dependency failure ... on not depending on pycurl).  
https://bugs.edge.launchpad.net/ubuntu/+source/checkbox/+bug/625076


Less large:
libcloud http://github.com/apache/libcloud/issues/issue/2
linode-python http://github.com/tjfontaine/linode-python/issues/issue/1

I would *very* much like to see these methods fixed by default.
You can talk about how the ssl protocol is not secure because of ca's handling certificates poorly, but until you *actually* perform proper validation you cannot say these things imho. 

I can keep on looking at python projects and reporting these issues but it is really easy, just look at anything that says and is important that mitm isn't possible against it -> then check the deps. in ubuntu /debian and pick the ones that don't use pycurl, check they don't validate the common name etc. and then you have a bunch of mitm'able apps probably ;)
msg117635 - (view) Author: Zooko O'Whielacronx (zooko) Date: 2010-09-29 18:03
This appears to be a concern for some people. Maybe the builtin ssl module should be deprecated if there isn't a lot of manpower to maintain it and instead the well-maintained pyOpenSSL package should become the recommended tool?

Here is a letter that I just received, in my role as a developer of Tahoe-LAFS, from a concerned coder who doesn't know much about Python:

> An FYI on Python.
> 
> I'm not sure how businesses handle this (I've always worked in Windows
> shops), but I imagine some might consider pulling Python until it is
> properly secured. Pulling Python might affect Tahoe, which I would
> like to see do well.

Here is my reply to him:

> Thanks for the note warning me about this issue! I appreciate it.
> 
> The Tahoe-LAFS project doesn't use the builtin "ssl" module that comes
> with the Python Standard Library and instead uses the separate
> pyOpenSSL package (and uses the separate Twisted package for HTTP and
> other networking protocols). Therefore this isn't an issue for
> Tahoe-LAFS. I agree that it is potentially a "marketing" issue in that
> people might mistakenly think that Tahoe-LAFS is vulnerable or might,
> as you suggest, blacklist Python as such and thus hit Tahoe-LAFS as
> collateral damage. There's not much I can do about that from the
> perspective of a Tahoe-LAFS developer. From the perspective of 
> contributor to Python, I'm also not sure what to do, except perhaps to
> complain. :-) I guess I'll try to stir the waters a bit by suggesting
> that Python should deprecate the builtin "ssl" module and recommend
> the pyOpenSSL package instead.
msg117636 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-09-29 18:34
> Here is a letter that I just received, in my role as a developer of
> Tahoe-LAFS, from a concerned coder who doesn't know much about Python:
> 
> > An FYI on Python.
> > 
> > I'm not sure how businesses handle this (I've always worked in
> Windows
> > shops), but I imagine some might consider pulling Python until it is
> > properly secured. Pulling Python might affect Tahoe, which I would
> > like to see do well.

That sounds like an inventively outrageous kind of FUD. It's the first
time I hear of someone writing to third-party library authors in order
to pressure them to pressure the maintainers of a programming language
implementation to make some "decisions".

By the way, if "businesses" are really concerned about the security
problems induced by this issue, they can sponsor the effort to get the
bug fixed. It shouldn't be a lot of work.

> This appears to be a concern for some people. Maybe the builtin ssl
> module should be deprecated if there isn't a lot of manpower to
> maintain it and instead the well-maintained pyOpenSSL package should
> become the recommended tool?

Correct me if I'm wrong, but the "well-maintained pyOpenSSL package"
doesn't have the missing functionality (hostname checking in server
certificates), either. M2Crypto has it, though.
msg117639 - (view) Author: Devin Cook (devin) Date: 2010-09-29 18:42
> Correct me if I'm wrong, but the "well-maintained pyOpenSSL
> package" doesn't have the missing functionality (hostname
> checking in server certificates), either.

I'm pretty sure it's just a wrapper around the openssl library, which does not include it. That was Bill Janssen's argument for why the ssl module shouldn't do that verification. Well, that and the fact that there's no finalized standard for it yet. I believe this is the latest draft:
http://tools.ietf.org/html/draft-saintandre-tls-server-id-check-09
msg117640 - (view) Author: geremy condra (debatem1) Date: 2010-09-29 18:45
On Wed, Sep 29, 2010 at 11:34 AM, Antoine Pitrou <report@bugs.python.org> wrote:
>
> Antoine Pitrou <pitrou@free.fr> added the comment:
>
>> Here is a letter that I just received, in my role as a developer of
>> Tahoe-LAFS, from a concerned coder who doesn't know much about Python:
>>
>> > An FYI on Python.
>> >
>> > I'm not sure how businesses handle this (I've always worked in
>> Windows
>> > shops), but I imagine some might consider pulling Python until it is
>> > properly secured. Pulling Python might affect Tahoe, which I would
>> > like to see do well.
>
> That sounds like an inventively outrageous kind of FUD. It's the first
> time I hear of someone writing to third-party library authors in order
> to pressure them to pressure the maintainers of a programming language
> implementation to make some "decisions".

Not to add fuel to the fire, but I've had a user report this behavior
as a bug as well, so this isn't entirely outside the scope of
plausibility to me.

> By the way, if "businesses" are really concerned about the security
> problems induced by this issue, they can sponsor the effort to get the
> bug fixed. It shouldn't be a lot of work.

What would the approximate cost on that be, do you think? My
understanding was that the code was pretty much written given John
Nagle's patch and M2Crypto.

Geremy Condra
msg117641 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-09-29 18:46
> > Correct me if I'm wrong, but the "well-maintained pyOpenSSL
> > package" doesn't have the missing functionality (hostname
> > checking in server certificates), either.
> 
> I'm pretty sure it's just a wrapper around the openssl library, which
> does not include it. That was Bill Janssen's argument for why the ssl
> module shouldn't do that verification. Well, that and the fact that
> there's no finalized standard for it yet. I believe this is the latest
> draft:
> http://tools.ietf.org/html/draft-saintandre-tls-server-id-check-09

Well, to be clear, it shouldn't be done *automatically*. But providing a
helper function that implements the feature and lets higher layers like
http.client and urllib.request call it if desired would be more than
reasonable.

(openssl may not provide such a function, but gnutls does, by the way)
msg117643 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-09-29 19:00
> What would the approximate cost on that be, do you think? My
> understanding was that the code was pretty much written given John
> Nagle's patch and M2Crypto.

To err on the safe side and account for integration work (unit tests,
coding style, and use in http.client / urllib), I would say a couple of
days. Also because it's rather boring code :-)

(but, don't assume that urllib will then be secure by default; Python
doesn't ship with CA certificates, so existing code will still need a
bit of work to activate cert validation and pass the location of the
system's CA certs)
msg117647 - (view) Author: david (asdfasdfasdfasdfasdfasdfasdf) Date: 2010-09-29 20:12
imho it would be nice to be 'secure by default' in say the next python stable releases... (or perhaps only 3.X ? ).
msg117936 - (view) Author: Mads Kiilerich (kiilerix) * Date: 2010-10-04 00:52
I added some extra verification to Mercurial (http://www.selenic.com/hg/rev/f2937d6492c5). Feel free to use the following under the Python license in Python or elsewhere. It could be a separate method/function or it could integrated in wrap_socket and controlled by a keyword. I would appreciate if you find the implementation insufficient or incorrect.

The purpose with this function is to verify if the received and validated certificate matches the host we intended to connect to.

I try to keep it simple and to fail to the safe side. 

"Correct" subjectAltName handling seems not to be feasible.

Are CRLs checked by the SSL module? Otherwise it deserves a big fat warning.

(I now assume that notBefore is handled by the SSL module and shouldn't be checked here.)

def _verifycert(cert, hostname):
    '''Verify that cert (in socket.getpeercert() format) matches
    hostname and is valid at this time. CRLs and subjectAltName are
    not handled.
    
    Returns error message if any problems are found and None on success.
    '''
    if not cert:
        return _('no certificate received')
    notafter = cert.get('notAfter')
    if notafter and time.time() > ssl.cert_time_to_seconds(notafter):
        return _('certificate expired %s') % notafter
    dnsname = hostname.lower()
    for s in cert.get('subject', []):
        key, value = s[0]
        if key == 'commonName':
            certname = value.lower()
            if (certname == dnsname or
                '.' in dnsname and
                certname == '*.' + dnsname.split('.', 1)[1]):
                return None
            return _('certificate is for %s') % certname
    return _('no commonName found in certificate')


def check(a, b):
    if a != b:
        print (a, b)

# Test non-wildcard certificates        
check(_verifycert({'subject': ((('commonName', 'example.com'),),)}, 'example.com'),
    None)
check(_verifycert({'subject': ((('commonName', 'example.com'),),)}, 'www.example.com'),
    'certificate is for example.com')
check(_verifycert({'subject': ((('commonName', 'www.example.com'),),)}, 'example.com'),
    'certificate is for www.example.com')

# Test wildcard certificates
check(_verifycert({'subject': ((('commonName', '*.example.com'),),)}, 'www.example.com'),
    None)
check(_verifycert({'subject': ((('commonName', '*.example.com'),),)}, 'example.com'),
    'certificate is for *.example.com')
check(_verifycert({'subject': ((('commonName', '*.example.com'),),)}, 'w.w.example.com'),
    'certificate is for *.example.com')

# Avoid some pitfalls
check(_verifycert({'subject': ((('commonName', '*.foo'),),)}, 'foo'),
    'certificate is for *.foo')
check(_verifycert({'subject': ((('commonName', '*o'),),)}, 'foo'),
    'certificate is for *o')

import time
lastyear = time.gmtime().tm_year - 1
nextyear = time.gmtime().tm_year + 1
check(_verifycert({'notAfter': 'May  9 00:00:00 %s GMT' % lastyear}, 'example.com'),
    'certificate expired May  9 00:00:00 %s GMT' % lastyear)
check(_verifycert({'notAfter': 'Sep 29 15:29:48 %s GMT' % nextyear, 'subject': ()}, 'example.com'),
    'no commonName found in certificate')
check(_verifycert(None, 'example.com'),
    'no certificate received')
msg117942 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-10-04 10:37
Hello,

> I added some extra verification to Mercurial
> (http://www.selenic.com/hg/rev/f2937d6492c5). Feel free to use the
> following under the Python license in Python or elsewhere. It could be
> a separate method/function or it could integrated in wrap_socket and
> controlled by a keyword. I would appreciate if you find the
> implementation insufficient or incorrect.

Thank you, I'll take a look!

> Are CRLs checked by the SSL module? Otherwise it deserves a big fat
> warning.

They are not, but AFAIK most browsers don't check CRLs either...
(or, rather they don't download updated CRLs)

> (I now assume that notBefore is handled by the SSL module and
> shouldn't be checked here.)

I can't say for sure, but OpenSSL seems to handle both notBefore and
notAfter as part of its cert verification routine (see interval_verify()
and cert_check_time() in crypto/x509/x509_vfy.c).
msg117963 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-10-04 16:37
Here is a patch against py3k. It adds a single ssl.match_hostname method, with rules from RFC 2818 (that is, tailored for HTTPS). Review welcome.
msg117968 - (view) Author: Devin Cook (devin) Date: 2010-10-04 17:08
I think it looks good except for the wildcard checking. According to the latest draft of that TLS id-checking RFC, you aren't supposed to allow the wildcard as part of a fragment. Of course this contradicts RFC 2818.

http://tools.ietf.org/html/draft-saintandre-tls-server-id-check-09#section-4.4.3

If this gets accepted, I'll submit a patch to http.client and urllib that makes use of it.
msg117970 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-10-04 17:19
> I think it looks good except for the wildcard checking. According to
> the latest draft of that TLS id-checking RFC, you aren't supposed to
> allow the wildcard as part of a fragment. Of course this contradicts
> RFC 2818.

Well, since it is then an "error" (according to the id-checking draft)
in the certificate itself rather than the hostname we are trying to
match, it seems there would be no real issue in accepting the match
anyway. It's up to CAs to make sure that certificates conform to
whatever standard is currently in effect.

I'm also assuming RFC 2818 is in wider use than the id-checking draft;
am I wrong?
msg117972 - (view) Author: Devin Cook (devin) Date: 2010-10-04 17:39
> I'm also assuming RFC 2818 is in wider use than the id-checking draft;
> am I wrong?

Yeah, since RFC 2818 has been accepted since 2000 and the id-checking draft was started in 2009, I'd say it's a safe bet. I'm in no way authoritative though.
msg118068 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-10-06 15:29
If nobody objects, I will commit this (with docs) soon. Then I will open a separate issue for the http.client / urllib.request integration, since the discussion is already quite long here.
msg118073 - (view) Author: Mads Kiilerich (kiilerix) * Date: 2010-10-06 18:32
I'm sorry to make the discussion longer ...


From a Python user/programmers point of view it would be nice if http://docs.python.org/library/ssl.html also clarified what "validation" means (apparently that the cert chain all the way from one of ca_certs is valid and with valid dates, except that CRLs not are checked?). It could perhaps be mentioned next to the ca_certs description. It would also be nice to see an example with subjectAltName, both with DNS and IP entries.

Has it been tested that the way Python uses OpenSSL really checks both notBefore and notAfter?


Some comments to the patch. Some of them are just thoughts that can be ignored, but I think some of them are relevant.

_dnsname_to_pat:

AFAICS * shouldn't match the empty string. I would expect "fail(cert, '.a.com')".

I would prefer to fail to the safe side and only allow a left-most wildcard and thus not allow multiple or f* wildcards, just like draft-saintandre-tls-server-id-check-09 suggests.

I would prefer to not use re for such an important task where clarity and correctness is so important. If we only allow left-most wildcards it shouldn't be necessary.

match_hostname:

I don't understand "IP addresses are not accepted for hostname". I assume that if commonName specifies an IP address then a hostname with this address is valid. So isn't it more that "subjectAltName iPAddress isn't supported"? But wouldn't it be better and simpler to simply support iPAddress - either as the only check iff hostname "looks" like an IP address, alternatively in all cases check against both DNS and IP entries?

"dnsnames" doesn't say much about what it is. Perhaps "unmatched"?

"if san: ... else: ..." would perhaps be a bit clearer.

"doesn't match with either of (%s)" ... isn't the paranthesis around the list elements too pythonic for a message intended for end users?

Separate error messages for subjectAltName and commonName could be helpful.

I assume it should be "no appropriate _commonName_" to match "subjectAltName".

test:

cert for example.com is defined twice.


Finally:

How about unicode and/or IDN hostnames?
msg118076 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-10-06 19:09
> From a Python user/programmers point of view it would be nice if
> http://docs.python.org/library/ssl.html also clarified what
> "validation" means (apparently that the cert chain all the way from
> one of ca_certs is valid and with valid dates, except that CRLs not
> are checked?). It could perhaps be mentioned next to the ca_certs
> description. It would also be nice to see an example with
> subjectAltName, both with DNS and IP entries.

As mentioned in the patch, IP entries are not supported.
I'm planning to do a couple of doc updates as part of the commit, but
any doc suggestions should go in a separate issue IMO. This will make
things more manageable.

> Has it been tested that the way Python uses OpenSSL really checks both
> notBefore and notAfter?

I just checked and, yes, it does (but only if you specify CERT_OPTIONAL
or CERT_REQUIRED, of course).

> AFAICS * shouldn't match the empty string. I would expect "fail(cert,
> '.a.com')".

Good point.

> I would prefer to fail to the safe side and only allow a left-most
> wildcard and thus not allow multiple or f* wildcards, just like
> draft-saintandre-tls-server-id-check-09 suggests.

Well, RFC 2818 allows them, and I see no point in being stricter.

> I would prefer to not use re for such an important task where clarity
> and correctness is so important. If we only allow left-most wildcards
> it shouldn't be necessary.

I'm not convinced that manual parsing is really more readable than
regular expressions, and wildcards are a pretty obvious use case for
regular expressions. Perhaps it's a matter of habit.

> match_hostname:
> 
> I don't understand "IP addresses are not accepted for hostname". I
> assume that if commonName specifies an IP address then a hostname with
> this address is valid. So isn't it more that "subjectAltName iPAddress
> isn't supported"?

Indeed. But, strictly speaking, there are no tests for IPs, so it
shouldn't be taken for granted that it works, even for commonName.
The rationale is that there isn't really any point in using an IP rather
a host name.

> But wouldn't it be better and simpler to simply support iPAddress -
> either as the only check iff hostname "looks" like an IP address,
> alternatively in all cases check against both DNS and IP entries?

Well, that's additional logic to code. I'm not sure it's worth it,
especially given that the function is called match_hostname in the first
place.

> "doesn't match with either of (%s)" ... isn't the paranthesis around
> the list elements too pythonic for a message intended for end users?

Hmm, perhaps.

> Separate error messages for subjectAltName and commonName could be
> helpful.

That depends if you're an end user or an SSL expert, I guess. end users
don't know what subjectAltNames and commonNames are.

> I assume it should be "no appropriate _commonName_" to match
> "subjectAltName".

Ah, yes.

> cert for example.com is defined twice.

Right.

> How about unicode and/or IDN hostnames?

I haven't looked how these work in the context of certificate checking.
I would prefer to tackle that separately if needed, but someone can
provide a patch for this if they want.
msg118080 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-10-06 20:32
Here is a new patch with doc updates and the corrections mentioned above.
msg118084 - (view) Author: Mads Kiilerich (kiilerix) * Date: 2010-10-06 21:45
> Indeed. But, strictly speaking, there are no tests for IPs, so it
> shouldn't be taken for granted that it works, even for commonName.
> The rationale is that there isn't really any point in using an IP rather
> a host name.

I don't know if there is a point or not, but some hosts are for some 
reason intended to be connected to using IP address and their 
certificates thus contains IP addresses. I think we should support that 
too, and I find it a bit confusing to only have partial support for 
subjectAltName.

> Well, that's additional logic to code. I'm not sure it's worth it,
> especially given that the function is called match_hostname in the first
> place.

"hostname" in Python usually refers to both IP addresses and DNS 
hostnames (just like in URLs), so I think it is a fair assumption that 
IP addresses also works in this hostname function.

Perhaps it should be noted that CertificateError only is raised by 
match_hostname so a paranoid programmer don't start catching it 
everywhere - and also that match_hostname won't raise SSLError.
msg118086 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-10-06 21:57
> I don't know if there is a point or not, but some hosts are for some 
> reason intended to be connected to using IP address and their 
> certificates thus contains IP addresses. I think we should support that 
> too, and I find it a bit confusing to only have partial support for 
> subjectAltName.

Do you have examples? Otherwise it is difficult to implement.
msg118176 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-10-08 10:38
I wanted to go forward with this and so I've committed the patch in r85321. If you're concerned about the lack of support for IP addresses, you can open a new issue (and even provide a patch!). Thank you.
msg118841 - (view) Author: Mads Kiilerich (kiilerix) * Date: 2010-10-15 22:50
Can you confirm that the exception raised both on "too early" and "too late" is something like "...SSL3_GET_SERVER_CERTIFICATE:certificate verify failed"?

(If so: It would be nice if a slightly more helpful message could be given. I don't know if that is possible.)
msg118844 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-10-15 23:00
Le vendredi 15 octobre 2010 à 22:51 +0000, Mads Kiilerich a écrit :
> Mads Kiilerich <mads@kiilerich.com> added the comment:
> 
> Can you confirm that the exception raised both on "too early" and "too
> late" is something like "...SSL3_GET_SERVER_CERTIFICATE:certificate
> verify failed"?

Yes.

> (If so: It would be nice if a slightly more helpful message could be
> given. I don't know if that is possible.)

Agreed. I don't know how to do that, though.
msg120107 - (view) Author: david (asdfasdfasdfasdfasdfasdfasdf) Date: 2010-11-01 03:45
So I know the current patch doesn't support IP addresses but I thought I would link to what mozilla considered a security problem(just for future reference):

CVE-2010-3170: http://www.mozilla.org/security/announce/2010/mfsa2010-70.html

"Security researcher Richard Moore reported that when an SSL certificate was created with a common name containing a wildcard followed by a partial IP address a valid SSL connection could be established with a server whose IP address matched the wildcard range by browsing directly to the IP address. It is extremely unlikely that such a certificate would be issued by a Certificate Authority."
msg120122 - (view) Author: Mads Kiilerich (kiilerix) * Date: 2010-11-01 13:07
> So I know the current patch doesn't support IP addresses

Not exactly. The committed patch do not consider IP addresses - 
especially not iPAddress entries in subjectAltName. But Python only 
distinguishes resolvable names from IP addresses at a very low level. At 
the ssl module level the name and IP is considered the same, so we 
actually do support IP addresses if specified in commonName or 
subjectAltName DNS. We are thus "vulnerable" to this issue. (AFAIK AFAICS)

(It seems like IP in commonName isn't permitted by the RFCs, but I think 
it is quite common, especially for self-signed certificates.)

> CVE-2010-3170: http://www.mozilla.org/security/announce/2010/mfsa2010-70.html

For reference, the actual report can be found on 
http://www.securityfocus.com/archive/1/513396

FWIW, I don't think it is critical at all. Granted, it is a deviation 
from the specification, and that is not good in a security critical 
part. But we do not claim to implement the full specification, so I 
don't think this deviation makes any difference.

Further, this issue will only have relevance if one the trusted CAs 
create invalid certificates. But if the trusted CAs create invalid 
certificates the user has lost anyway and things can't get much worse.
msg120920 - (view) Author: anatoly techtonik (techtonik) Date: 2010-11-10 15:14
Should we escalate this issue to CVA for Python 2.x?
msg120945 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-11-11 12:31
> Should we escalate this issue to CVA for Python 2.x?

It's more of a missing feature than a security issue in itself, although
the missing feature has to do with security.
msg120946 - (view) Author: david (asdfasdfasdfasdfasdfasdfasdf) Date: 2010-11-11 12:52
On 11 November 2010 23:31, Antoine Pitrou <report@bugs.python.org> wrote:
>
> Antoine Pitrou <pitrou@free.fr> added the comment:
>
>> Should we escalate this issue to CVA for Python 2.x?
>
> It's more of a missing feature than a security issue in itself, although
> the missing feature has to do with security.

Still it would be nice to see in python 2.x at some point don't you think?
msg120947 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-11-11 13:10
> >
> > It's more of a missing feature than a security issue in itself, although
> > the missing feature has to do with security.
> 
> Still it would be nice to see in python 2.x at some point don't you think?

Well, lots of things would be nice to see in python 2.x, but that's not
how things work :)
That said, someone else is maintaining a backport (thank him):
http://pypi.python.org/pypi/backports.ssl_match_hostname/
msg154230 - (view) Author: Thomas Leonard (Thomas.Leonard) Date: 2012-02-25 10:20
Just to add a couple of data points to argue in favour of a secure-by-default behaviour:

0install.net:

http://secunia.com/advisories/47935 (spoofing attack due to certificate names not being validated)

Mozilla is recommending people avoid using Python's built-in SSL:

https://github.com/mozilla/browserid/wiki/Security-Considerations-when-Implementing-BrowserID

I find it hard to believe that anyone would be able to write an SSL client in Python currently without introducing some vulnerability. There are too many traps to fall into. Here are the three I know about:

1. Not specifying any trusted CAs means trust everyone (where for most software it would mean either trust no-one or trust only well-known CAs).

2. Specifiying a single trusted CA means also trust all known CAs (on MacOS X at least).

3. Unless you manually enable hostname checking, the attacker only needs a valid SSL certificate for their own site, not for the site they're spoofing.
History
Date User Action Args
2022-04-11 14:56:28adminsetgithub: 45930
2012-02-25 10:20:25Thomas.Leonardsetnosy: + Thomas.Leonard
messages: + msg154230
2010-11-17 09:47:51eric.araujosetnosy: + eric.araujo
2010-11-12 19:00:06jceasetnosy: + jcea
2010-11-11 13:10:24pitrousetmessages: + msg120947
2010-11-11 12:52:38asdfasdfasdfasdfasdfasdfasdfsetmessages: + msg120946
2010-11-11 12:31:30pitrousetmessages: + msg120945
2010-11-10 15:14:29techtoniksetnosy: + techtonik
messages: + msg120920
2010-11-01 13:07:04kiilerixsetmessages: + msg120122
2010-11-01 03:45:43asdfasdfasdfasdfasdfasdfasdfsetmessages: + msg120107
2010-10-15 23:00:30pitrousetmessages: + msg118844
2010-10-15 22:50:58kiilerixsetmessages: + msg118841
2010-10-08 10:38:19pitrousetstatus: open -> closed
resolution: fixed
messages: + msg118176

stage: patch review -> resolved
2010-10-06 21:57:19pitrousetmessages: + msg118086
2010-10-06 21:46:01kiilerixsetmessages: + msg118084
2010-10-06 20:52:27pitroulinkissue9003 dependencies
2010-10-06 20:32:39pitrousetfiles: + sslcheck2.patch

messages: + msg118080
2010-10-06 19:14:54pitrousetmessages: + msg118076
2010-10-06 18:32:49kiilerixsetmessages: + msg118073
2010-10-06 15:29:14pitrousetmessages: + msg118068
2010-10-04 17:39:43devinsetmessages: + msg117972
2010-10-04 17:19:57pitrousetmessages: + msg117970
2010-10-04 17:08:45devinsetmessages: + msg117968
2010-10-04 16:37:11pitrousetfiles: + sslcheck.patch
keywords: + patch
messages: + msg117963

stage: patch review
2010-10-04 10:37:08pitrousetmessages: + msg117942
2010-10-04 00:52:50kiilerixsetnosy: + kiilerix
messages: + msg117936
2010-09-29 20:12:14asdfasdfasdfasdfasdfasdfasdfsetmessages: + msg117647
2010-09-29 19:00:48pitrousetmessages: + msg117643
2010-09-29 18:46:26pitrousetmessages: + msg117641
2010-09-29 18:45:33debatem1setmessages: + msg117640
2010-09-29 18:42:53devinsetmessages: + msg117639
2010-09-29 18:34:25pitrousetmessages: + msg117636
2010-09-29 18:12:06Ryan.Tuckersetnosy: + Ryan.Tucker
2010-09-29 18:03:49zookosetnosy: + zooko
messages: + msg117635
2010-09-29 14:39:55asdfasdfasdfasdfasdfasdfasdfsetnosy: + asdfasdfasdfasdfasdfasdfasdf
messages: + msg117613
2010-09-29 11:52:11orsenthilsetnosy: + orsenthil
messages: + msg117606
2010-09-29 11:41:46orsenthilsetmessages: - msg58435
2010-06-20 21:32:33pitrousetfiles: - unnamed
2010-06-20 21:32:19pitrousetfiles: - unnamed
2010-06-20 21:32:15pitrousetfiles: - unnamed
2010-06-19 18:54:48devinsetnosy: + devin
2010-06-16 16:54:48jsamuelsetnosy: + jsamuel
2010-06-16 14:32:39giampaolo.rodolasetnosy: + giampaolo.rodola
2010-06-16 03:58:38debatem1setnosy: + debatem1
2010-06-15 21:17:26pitrousetstatus: closed -> open

assignee: janssen ->
versions: + Python 3.2, - Python 2.6
nosy: + pitrou

messages: + msg107895
resolution: rejected -> (no value)
2008-09-11 16:04:03janssensetfiles: + unnamed
messages: + msg73036
2008-09-11 06:24:29heikkisetmessages: + msg73006
2008-09-10 02:21:48janssensetmessages: + msg72935
2008-09-05 07:11:57heikkisetmessages: + msg72574
2008-08-21 21:12:30janssensetmessages: + msg71682
2008-08-20 22:52:15heikkisetmessages: + msg71586
2008-08-19 16:45:20janssensetstatus: open -> closed
resolution: rejected
messages: + msg71443
2008-08-19 03:21:11heikkisetnosy: + heikki
messages: + msg71405
2008-01-05 11:33:04vilasetnosy: + vila
2007-12-13 18:10:26janssensetfiles: + unnamed
messages: + msg58547
2007-12-13 15:51:41gvanrossumsetnosy: - gvanrossum
2007-12-13 15:14:30ahasenacksetmessages: + msg58535
2007-12-12 20:36:58janssensetfiles: + unnamed
messages: + msg58508
2007-12-12 12:48:24ahasenacksetmessages: + msg58491
2007-12-11 22:40:17janssensetmessages: + msg58472
2007-12-11 17:41:48gvanrossumsetassignee: janssen
messages: + msg58444
nosy: + gvanrossum, janssen
2007-12-11 15:41:53ahasenacksetmessages: + msg58435
2007-12-11 15:41:03ahasenackcreate