Title: New SSL module doesn't seem to verify hostname against commonName in certificate
Type: security
Components: Library (Lib)
Versions: Python 2.6
Status: closed
Resolution: rejected
Assigned To: janssen
Nosy List: ahasenack, heikki, janssen, vila

Created on 2007-12-11 15:41 by ahasenack, last changed 2008-09-11 16:04 by janssen.

Files
File name                                  Uploaded
verisign-inc-class-3-public-primary.pem    ahasenack, 2007-12-11 15:41
unnamed                                    janssen, 2007-12-12 20:36
unnamed                                    janssen, 2007-12-13 18:10
unnamed                                    janssen, 2008-09-11 16:04
Messages (16)
msg58434 - (view) Author: Andreas Hasenack (ahasenack) Date: 2007-12-11 15:41
(I hope I used the correct component for this report)

http://pypi.python.org/pypi/ssl/

I used the client example shown at
http://docs.python.org/dev/library/ssl.html#client-side-operation to
connect to a bank site called www.realsecureweb.com.br at
200.208.16.101. Its certificate is signed by VeriSign. My OpenSSL has this
CA at /etc/pki/tls/rootcerts/verisign-inc-class-3-public-primary.pem.
The verification works.

If I make up a hostname called something else, like "wwws", and place it
in /etc/hosts pointing to that IP address, the SSL connection should not
be established because that name doesn't match the common name field in
the server certificate. But the SSL module happily connects to it
(excerpt below):

cert = verisign-inc-class-3-public-primary.pem
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
ssl_sock = ssl.wrap_socket(s,
           ca_certs="/etc/pki/tls/rootcerts/%s" % cert,
           cert_reqs=ssl.CERT_REQUIRED)
ssl_sock.connect(('wwws', 443))
print repr(ssl_sock.getpeername())

output:
('200.208.16.101', 443)
('RC4-MD5', 'TLSv1/SSLv3', 128)
{'notAfter': 'Sep 10 23:59:59 2008 GMT',
 'subject': ((('countryName', u'BR'),),
             (('stateOrProvinceName', u'Sao Paulo'),),
             (('localityName', u'Sao Paulo'),),
             (('organizationName', u'Banco ABN AMRO Real SA'),),
             (('organizationalUnitName', u'TI Internet PF e PJ'),),
             (('commonName', u'www.realsecureweb.com.br'),))}

If I now open, say, a firefox window and point it to "https://wwws", it
gives me the expected warning that the hostname doesn't match the
certificate.

I'll attach the verisign CA certificate to make it easier to reproduce
the error.
msg58435 - (view) Author: Andreas Hasenack (ahasenack) Date: 2007-12-11 15:41
Oops, typo in the script:
cert = "verisign-inc-class-3-public-primary.pem"
msg58444 - (view) Author: Guido van Rossum (gvanrossum) Date: 2007-12-11 17:41
Bill, can you respond?
msg58472 - (view) Author: Bill Janssen (janssen) Date: 2007-12-11 22:40
Unfortunately, hostname matching is one of those ideas that seemed
better when it was thought up than it actually proved to be in practice.
I've had extensive experience with this, and have found it to almost
always be an application-specific decision.  I thought about this when
designing the client-side verification, and couldn't see any automatic
solution that doesn't get in the way.  So the right way to do this with
Python is to write some application code that looks at the verified
identity and makes decisions based on whatever authentication algorithm
you need.
msg58491 - (view) Author: Andreas Hasenack (ahasenack) Date: 2007-12-12 12:48
At the least it should be made clear in the documentation that the
hostname is not checked against the commonName nor the subjectAltName
fields of the server certificate. And add some sample code to the
documentation for doing a simple check. Something like this, to illustrate:

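def get_subjectAltName(cert):
    # Collect DNS and IP entries from the certificate's subjectAltName.
    if 'subjectAltName' not in cert:
        return []
    ret = []
    for rdn in cert['subjectAltName']:
        # Each entry is a (type, value) pair, e.g. ('DNS', 'example.com').
        if rdn[0].lower() == 'dns' or rdn[0][:2].lower() == 'ip':
            ret.append(rdn[1])
    return ret

def get_commonName(cert):
    # Collect commonName values from the certificate's subject.
    if 'subject' not in cert:
        return []
    ret = []
    for rdn in cert['subject']:
        # Each RDN is a tuple of (key, value) pairs; look at the first pair.
        if rdn[0][0].lower() == 'commonname':
            ret.append(rdn[0][1])
    return ret


def verify_hostname(cert, host):
    # True if host exactly matches a commonName or subjectAltName entry.
    cn = get_commonName(cert)
    san = get_subjectAltName(cert)
    return (host in cn) or (host in san)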
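
As a usage illustration, the same idea can be exercised against the certificate dictionary printed in the original report. This is only a sketch: the name `hostname_matches` is hypothetical, the comparison is case-insensitive because DNS names are, and it follows the RFC 2818 rule that the commonName is consulted only when the certificate carries no dNSName entries. Wildcards and IP addresses are not handled.

```python
def hostname_matches(cert, host):
    """Return True if host appears among the certificate's names.

    cert is a dict in the shape returned by SSLSocket.getpeercert().
    """
    host = host.lower()
    # subjectAltName is a sequence of (type, value) pairs.
    dns_names = [value.lower()
                 for (kind, value) in cert.get('subjectAltName', ())
                 if kind == 'DNS']
    if dns_names:
        return host in dns_names
    # subject is a sequence of RDNs, each a sequence of (key, value) pairs.
    common_names = [value.lower()
                    for rdn in cert.get('subject', ())
                    for (key, value) in rdn
                    if key == 'commonName']
    return host in common_names


# Subject taken from the output in the original report (abridged).
cert = {
    'notAfter': 'Sep 10 23:59:59 2008 GMT',
    'subject': ((('countryName', 'BR'),),
                (('organizationName', 'Banco ABN AMRO Real SA'),),
                (('commonName', 'www.realsecureweb.com.br'),)),
}

print(hostname_matches(cert, 'www.realsecureweb.com.br'))  # True
print(hostname_matches(cert, 'wwws'))                      # False
```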
msg58508 - (view) Author: Bill Janssen (janssen) Date: 2007-12-12 20:36
Yes, I think that's reasonable.  And for pseudo-standards like https, which
calls for this, the implementation in the standard library should attempt to
do it automatically.  Unfortunately, that means that client-side certificate
verification has to be done (it's pointless to look at the data in
unverified certificates), and that means that the client software has to
have an appropriate collection of root certificates to verify against.  I
think there's an argument for adding a registry of root certificates to the
SSL module, just a module-level variable that the application can bind to a
filename of a file containing their collection of certificates.  If it's
non-None, the https code would use it to verify the certificate, then use
the commonName in the subject field to check against the hostname in the
URL.  If it's None, the check would be skipped.

Bill
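
A rough sketch of the registry idea described in this message. Everything here is hypothetical: `ca_certs_registry` stands in for the proposed module-level variable, and `https_verification_plan` is an illustrative helper, not an existing ssl or httplib function.

```python
import ssl

# Hypothetical module-level registry: None, or the path of a file that
# contains the application's collection of root certificates.
ca_certs_registry = None


def https_verification_plan(registry):
    """Return (cert_reqs, ca_certs, check_hostname) for an https connection."""
    if registry is None:
        # No roots configured: the certificate cannot be verified, so a
        # hostname check on its contents would be pointless and is skipped.
        return (ssl.CERT_NONE, None, False)
    # Roots configured: verify the chain, then compare the commonName in
    # the subject against the hostname in the URL.
    return (ssl.CERT_REQUIRED, registry, True)
```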

msg58535 - (view) Author: Andreas Hasenack (ahasenack) Date: 2007-12-13 15:14
> do it automatically.  Unfortunately, that means that client-side certificate
> verification has to be done (it's pointless to look at the data in
> unverified certificates), and that means that the client software has to
> have an appropriate collection of root certificates to verify against.  I

But the current API already has this feature:
ssl_sock = ssl.wrap_socket(s, ca_certs="/etc/pki/tls/rootcerts/%s" % cert,
                      cert_reqs=ssl.CERT_REQUIRED)

So this is already taken care of with ca_certs and cert_reqs, right?
msg58547 - (view) Author: Bill Janssen (janssen) Date: 2007-12-13 18:10
The mechanism is there for direct use of the SSL module, yes.  But the
question is, what should indirect usage, like the httplib or urllib modules,
do?  If they are going to check hostnames on use of an https: URL, they need
some way to pass a ca_certs file through to the SSL code they use.

Bill
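
For reference, one way such a pass-through can be expressed in later Python 3 releases is an `ssl.SSLContext` handed to `http.client.HTTPSConnection`. The sketch below assumes Python 3.4 or newer and only builds the objects without opening a connection; `www.example.com` is a placeholder host.

```python
import http.client
import ssl

# A context that loads the platform's default root certificates; a specific
# bundle could instead be given with create_default_context(cafile=...).
context = ssl.create_default_context()

# The connection object carries the context down to the SSL layer.
# Nothing is sent over the network until connect() or request() is called.
conn = http.client.HTTPSConnection('www.example.com', context=context)
```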

msg71405 - (view) Author: Heikki Toivonen (heikki) Date: 2008-08-19 03:21
I would definitely recommend providing as strict as possible hostname
verification in the stdlib, but provide application developers a way to
override that.

M2Crypto (and TLS Lite, from which I copied the approach to M2Crypto)
provides a default post-connection checker. See
http://svn.osafoundation.org/m2crypto/trunk/M2Crypto/SSL/Connection.py
and the set_post_connection_check_callback() as well as
http://svn.osafoundation.org/m2crypto/trunk/M2Crypto/SSL/Checker.py.
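
The "strict by default, but overridable" idea can be sketched as a connection wrapper with a replaceable post-connection check. The class and function names below are hypothetical and are not the actual M2Crypto or TLS Lite API; the certificate is a plain dict in the shape returned by `getpeercert()`.

```python
def default_check(cert, host):
    """Strict default: host must equal a commonName in the subject."""
    names = [value
             for rdn in cert.get('subject', ())
             for (key, value) in rdn
             if key == 'commonName']
    return host in names


class CheckedConnection(object):
    """Holds a check that runs after the handshake (hypothetical class)."""

    def __init__(self, host):
        self.host = host
        self.post_connection_check = default_check

    def set_post_connection_check(self, callback):
        # Applications may replace the default; None disables the check.
        self.post_connection_check = callback

    def verify_peer(self, cert):
        if self.post_connection_check is None:
            return True
        if not self.post_connection_check(cert, self.host):
            raise ValueError('certificate does not match %r' % self.host)
        return True
```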
msg71443 - (view) Author: Bill Janssen (janssen) Date: 2008-08-19 16:45
Nope.  Hostname verification was never a good idea -- the "hostname" is
just a vague notion, at best -- lots of hostnames can map to one or more
IP addresses of the server.  It's exposed to the application code, so if
a client application wants to do it, it can.  But I recommend against
it.  It's a complication that doesn't belong in the basic support, that
is, the SSL module.  I'll add a note to this effect in the documentation.
msg71586 - (view) Author: Heikki Toivonen (heikki) Date: 2008-08-20 22:52
I would think most people/applications want to know which host they
are talking to. The reason I am advocating adding a default check to the
stdlib is because this is IMO important for security, and it is easy to
get it wrong (I don't think I have it 100% correct in M2Crypto either,
although I believe it errs on the side of caution). I believe it would
be a disservice to ship something that effectively teaches developers to
ignore security (like the old socket.ssl does).

A TLS extension also allows SSL vhosts, so static IPs are no longer
strictly necessary (this is not universally supported yet, though).
msg71682 - (view) Author: Bill Janssen (janssen) Date: 2008-08-21 21:12
Checking hostnames is false security, not real security.

msg72574 - (view) Author: Heikki Toivonen (heikki) Date: 2008-09-05 07:11
Could you clarify your comment regarding hostname check being false
security?

Just about all SSL texts I have read say you must do that, and that is
what your web browser and email client do to ensure they are talking to
the right host, for example. Without that check you are subject to a
man-in-the-middle attack. Or is there some other check you perform that is
better?
msg72935 - (view) Author: Bill Janssen (janssen) Date: 2008-09-10 02:21
Sorry to be so brief there -- I was off on vacation.

Verifying hostnames is a prescription that someone (well, OK, Eric
Rescorla, who knows what he's talking about) put in the https IETF RFC
(which, by the way, is only an informational RFC, not standards-track).
It's a good idea if you're a customer trying to talk to Wells Fargo,
say, over an https connection, but isn't suitable for all https traffic.
I support putting it in the httplib Https class by default, but there
should be a way to override it, as there is with the Java APIs for https
connections.  (Take a look at javax.net.ssl.HostnameVerifier; one of the
more popular Java classes at PARC is a version of this that verifies any
hostname).
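
A Python analogue of a default check that the application can switch off, sketched with the `ssl.SSLContext.check_hostname` attribute available in Python 3.4 and later. It is shown only to illustrate the override pattern; no connection is made.

```python
import ssl

# A client-side context with certificate verification turned on.
context = ssl.create_default_context()

# Hostname checking is on by default for this kind of context.
default_setting = context.check_hostname
print(default_setting)  # True

# An application that does not want the check can override it, while
# still requiring a certificate that chains to a trusted root.
context.check_hostname = False
print(context.verify_mode == ssl.CERT_REQUIRED)  # True
```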

So what's wrong with it?  There are two problems.  The first is that
certificates for services are all about the hostname, and that's just
wrong.  You should verify the specific service, not just the hostname.
So a client that really cares about what it is talking to should
have the certificate for that service, and verify that it is the service
it's talking to, and ignore the hostname in the URL.
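
One way to read "have the certificate for that service" is to compare a fingerprint of the certificate the peer presents with one the client already holds. The following is a minimal sketch under that assumption: the function name is hypothetical, and `der_cert` is taken to be the bytes returned by `getpeercert(binary_form=True)` on a connected SSL socket.

```python
import hashlib
import hmac


def matches_pinned_certificate(der_cert, pinned_sha256_hex):
    """Compare the peer's DER-encoded certificate with a known fingerprint."""
    actual = hashlib.sha256(der_cert).hexdigest()
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(actual, pinned_sha256_hex.lower())
```

Because the comparison is against the certificate itself, the hostname used to reach the service plays no part in the decision.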

But the larger problem is that hostnames are a DNS construct for humans,
and not really well supported on computers, or by the services that run
on those computers.  Most computers have only the haziest notion of what
their hostname is, and many have lots of different hostnames (my laptop
has at least five hostnames that I know of, all meaning the same
computer, but with five different PARC IP addresses).  So the services
running on that computer aren't real clear about their hostnames,
either.  If I run a service on that computer that I secure with SSL, so
that packets going over my WiFi are encrypted, which hostname should
that service declare itself to be in the certificate?  And the services
on that computer keep running, even when it switches its IP address (and
thus its set of hostnames).  So doing hostname matching provokes lots of
false negatives, especially when it's not needed.  I think it by and
large isn't a good idea, though I support having it (in an overrideable
form) for the client-side https class in httplib.

This is all exacerbated by the fact that HTTP isn't what it was when
Eric wrote that RFC eight years ago.  The growth of Web 2.0 and
"RESTful" services means that lots of new things are using https in a
much less formal way, more to get encrypted packets than to verify
endpoints.  So false negatives caused by mindless hostname verification
cause real damage.
msg73006 - (view) Author: Heikki Toivonen (heikki) Date: 2008-09-11 06:24
OK, thank you for the clarifications. Now I understand why hostname
checking isn't a solution that fits every problem. I am still not
completely clear how you'd do the checking otherwise, for example to
verify the service you are talking to is what you think it is.

But still, I think dealing with email servers is another common use case
where a hostname check is adequate most of the time. I am sure there are
other cases like this. Therefore I am still of the opinion that the
default should be to do the hostname check. Yes, make it overridable,
but doing the check is safer than not doing any checking IMO because
even if the check is incorrect for a certain purpose the developer is
likely to notice an error quickly and inclined to do some other security
check instead of not doing anything and thinking they have a secure system.

If you want to continue the discussion, we should maybe take this to
some other forum, like comp.lang.python.
msg73036 - (view) Author: Bill Janssen (janssen) Date: 2008-09-11 16:04
I think that, where it's appropriate, you can do that.  Just don't put it in
the SSL module.

Bill

History
Date                 User        Action  Args
2008-09-11 16:04:03  janssen     set     files: + unnamed; messages: + msg73036
2008-09-11 06:24:29  heikki      set     messages: + msg73006
2008-09-10 02:21:48  janssen     set     messages: + msg72935
2008-09-05 07:11:57  heikki      set     messages: + msg72574
2008-08-21 21:12:30  janssen     set     messages: + msg71682
2008-08-20 22:52:15  heikki      set     messages: + msg71586
2008-08-19 16:45:20  janssen     set     status: open -> closed; resolution: rejected; messages: + msg71443
2008-08-19 03:21:11  heikki      set     nosy: + heikki; messages: + msg71405
2008-01-05 11:33:04  vila        set     nosy: + vila
2007-12-13 18:10:26  janssen     set     files: + unnamed; messages: + msg58547
2007-12-13 15:51:41  gvanrossum  set     nosy: - gvanrossum
2007-12-13 15:14:30  ahasenack   set     messages: + msg58535
2007-12-12 20:36:58  janssen     set     files: + unnamed; messages: + msg58508
2007-12-12 12:48:24  ahasenack   set     messages: + msg58491
2007-12-11 22:40:17  janssen     set     messages: + msg58472
2007-12-11 17:41:48  gvanrossum  set     assignee: janssen; messages: + msg58444; nosy: + gvanrossum, janssen
2007-12-11 15:41:53  ahasenack   set     messages: + msg58435
2007-12-11 15:41:03  ahasenack   create