classification
Title: FIPS support for hashlib
Type: enhancement Stage: patch review
Components: Library (Lib) Versions: Python 3.3
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: dmalcolm, gregory.p.smith
Priority: normal Keywords: patch

Created on 2010-07-10 00:22 by dmalcolm, last changed 2011-09-16 19:57 by dmalcolm.

Files
File name Uploaded Description Edit
py3k-hashlib-fips-issue9216.patch dmalcolm, 2010-07-12 18:48 Patch against py3k branch review
0001-Rework-_hashlib-caching-moving-per-hash-cached-data-.patch dmalcolm, 2011-09-16 19:45
0002-Add-error-handling-to-initialization-of-_hashlib.patch dmalcolm, 2011-09-16 19:46
0003-Add-optional-usedforsecurity-argument-in-various-pla.patch dmalcolm, 2011-09-16 19:46
0004-_hashlib-Add-selftest-for-FIPS-mode-and-usedforsecur.patch dmalcolm, 2011-09-16 19:46
Messages (6)
msg109808 - (view) Author: Dave Malcolm (dmalcolm) (Python committer) Date: 2010-07-10 00:22
(taking the liberty of adding gregory.p.smith to the "nosy" list; hope that's OK)

This is a higher-level take on issue 9146.

Some versions of OpenSSL have a FIPS mode that can refuse the use of non-certified hashes.

The idea is that FIPS mode should prevent the use of non-certified hashes for security uses.  For example, MD5 shouldn't be used for signatures these days (see e.g. http://www.kb.cert.org/vuls/id/836068).

However, there are legitimate non-security uses of these hashes.  For example, one might use MD5 hashes of objects to places them in bins for later retrieval, purely as a speed optimization (e.g. files in directories on a filesystem).

I'm working on a patch to hashlib which would better support this, but it involves an API expansion, and I wanted to sound things out first.

The API idea is to introduce a new keyword argument, say "usedforsecurity" to hashlib.new() and to the named hashlib constructors, such as hashlib.md5().  This would default to True.  If code is using these hashes in FIPS mode, the developer needs to override this: usedforsecurity=False to mark the callsite as a non-security-sensitive location.  Internally, this would lead to the EVP_MD_CTX being initialized with EVP_MD_CTX_FLAG_NON_FIPS_ALLOW.

This way, if you run unaudited code in an environment that cares about FIPS, the code will raise exceptions if it uses a non-valid hash, but during code audit the callsites can be marked clearly as "usedforsecurity=False", and be used as before.

In non-FIPS environments, the flag would be ignored.

Am I right in thinking that the _hashlib module should be treated as an implementation detail here?  The entry points within _hashlib are likely to double, with a pair of pre-initialized contexts, one with the flag, one without.

Does this sound reasonable?  Thanks.
msg109891 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2010-07-10 17:00
That sounds fine to me and I do like the usedforsecurity annotation on the API.  I'll gladly review any patches.
msg110124 - (view) Author: Dave Malcolm (dmalcolm) (Python committer) Date: 2010-07-12 18:48
Attached is a patch against the py3k branch which implements this.

I've checked that it builds against openssl-0.9.8o.tar.gz, openssl-1.0.0a.tar.gz, and against Fedora 12 and 13's heavily-patched openssl-1.0.0. The bulk of my testing has been against Fedora's openssl.

I've added selftests to try to verify the new API.  I try to detect if the OpenSSL enforces FIPS, via trying to run "openssl md5" as a subprocess, and seeing if I can trigger an error.

With FIPS enforcement off, all tests pass when built against 0.9.8o and 1.0.0a and F13's 1.0.0, other than those for FIPS enforcement itself, which skip.

With FIPS enforcement on, all tests pass when built against F13's openssl.  (I haven't yet figured out how to get the fips selftest to pass for the other builds, it's testing checksums against the wrong libcrypto for some reason; see caveat below):
$ ./python Lib/test/test_hashlib.py
$ OPENSSL_FORCE_FIPS_MODE=1 ./python Lib/test/test_hashlib.py

For all of the various contexts stored in _hashopenssl.c, we now store two: one with the override flag, one without.  This required some reworking of the various preprocessor magic in that file, so I've gathered everything related to an algorithm into a structure, and moved most of the logic into functions, rather than macros.  I'm assuming that these will get inlined under optimization, and that the bulk of the time that you're trying to optimize out are the EVP lookups and initializations, rather than function call overhead.

How's this looking?

Do I need to add a dummy "usedforsecurity" arg to all of the non-openssl message digest implementations within the tree?


Unfortunately, if fips mode is on, and the fips selftest fails for the openssl library, every hash use will fail, both with and without the flag:
  ValueError: error:2D07D06A:FIPS routines:EVP_DigestInit_ex:fips selftest failed
and this leads to a crippled hashlib module.  It's not clear to me if there's a good way to handle this.  (Having said that, a site that has the technical expertise to opt-in to FIPS mode is hopefully able to diagnose this, and fix their openssl library)
msg144152 - (view) Author: Dave Malcolm (dmalcolm) (Python committer) Date: 2011-09-16 19:45
I've refreshed this patch against the latest version of the code in hg.

In an attempt to make it easier to review, I've split it up into four (so far) thematic patches, which apply in sequence.
msg144153 - (view) Author: Dave Malcolm (dmalcolm) (Python committer) Date: 2011-09-16 19:48
[and yes, I used git to generate the 4 patches; sorry ]
msg144154 - (view) Author: Dave Malcolm (dmalcolm) (Python committer) Date: 2011-09-16 19:57
The cumulative effect of the above patches (to _hashlib) are equivalent to what I've applied downstream to python 2 in RHEL 6.0 and Fedora 17 onwards, and python 3 in Fedora 17 onwards.

In those environments I've additionally patched hashlib to only use _hashlib, rather than falling back on _md5 etc, since otherwise you get confusing error messages from hashlib.md5() when it defers to _md5 due to FIPS enforcement.  In my downstream builds we can be sure of building against OpenSSL, but this other part of the patch seems less appropriate for upstream python, given that upstream python tries to be flexible in terms of its dependencies.

Hope this makes sense.
History
Date User Action Args
2011-09-16 19:57:49dmalcolmsetmessages: + msg144154
2011-09-16 19:48:02dmalcolmsetmessages: + msg144153
2011-09-16 19:46:48dmalcolmsetfiles: + 0004-_hashlib-Add-selftest-for-FIPS-mode-and-usedforsecur.patch
2011-09-16 19:46:39dmalcolmsetfiles: + 0003-Add-optional-usedforsecurity-argument-in-various-pla.patch
2011-09-16 19:46:24dmalcolmsetfiles: + 0002-Add-error-handling-to-initialization-of-_hashlib.patch
2011-09-16 19:45:46dmalcolmsetfiles: + 0001-Rework-_hashlib-caching-moving-per-hash-cached-data-.patch

messages: + msg144152
2011-01-03 19:53:51pitrousetversions: + Python 3.3, - Python 3.2
2010-12-14 19:07:37r.david.murraysettype: enhancement
2010-07-12 18:48:15dmalcolmsetfiles: + py3k-hashlib-fips-issue9216.patch
keywords: + patch
messages: + msg110124

stage: needs patch -> patch review
2010-07-10 17:00:04gregory.p.smithsetmessages: + msg109891
2010-07-10 00:22:13dmalcolmcreate