classification
Title: add BLAKE2 to hashlib
Type: enhancement
Stage: resolved
Components: Library (Lib)
Versions: Python 3.6
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: Zooko.Wilcox-O'Hearn, alex, christian.heimes, dchest, dstufft, gregory.p.smith, martin.panter, python-dev
Priority: normal
Keywords: patch

Created on 2016-04-18 18:40 by Zooko.Wilcox-O'Hearn, last changed 2016-09-08 11:41 by python-dev. This issue is now closed.

Files
File name: Uploaded by, date
BLAKE2-hash-algorithm-for-CPython.patch: christian.heimes, 2016-05-08 13:10
BLAKE2-hash-algorithm-for-CPython-2.patch: christian.heimes, 2016-05-08 18:04
BLAKE2-hash-algorithm-for-CPython-3.patch: christian.heimes, 2016-06-02 18:46
BLAKE2-hash-algorithm-for-CPython-4.patch: christian.heimes, 2016-08-20 22:37
BLAKE2-hash-algorithm-for-CPython-5.patch: christian.heimes, 2016-09-04 15:11
Messages (27)
msg263679 - (view) Author: Zooko Wilcox-O'Hearn (Zooko.Wilcox-O'Hearn) Date: 2016-04-18 18:40
(Disclosure: I'm one of the authors of BLAKE2.)

Please include BLAKE2 in hashlib. It is well-suited for hashing long inputs (e.g. files), because it is substantially faster than SHA-3, SHA-2, SHA-1, or MD5 while also being safer than SHA-2, SHA-1, or MD5.
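
As a rough illustration of the long-input use case, here is a minimal sketch that hashes a file in fixed-size chunks. It assumes the hashlib.blake2b constructor as it later shipped in Python 3.6; the helper name blake2b_file and the chunk size are hypothetical choices, not part of any patch on this issue.

```python
import hashlib


def blake2b_file(path, chunk_size=1 << 20):
    """Return the hex BLAKE2b digest of a file, reading it in chunks.

    Hypothetical helper: the name and the 1 MiB default chunk size are
    illustrative only.
    """
    h = hashlib.blake2b()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read() until it returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()
```

Feeding the data through update() in pieces yields the same digest as hashing the whole content at once, so memory use stays bounded by the chunk size.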

BLAKE2 and/or its relatives, BLAKE, ChaCha20, and Salsa20, have gotten extensive cryptographic peer review.

It is widely used in modern applications and widely supported in modern crypto libraries (https://en.wikipedia.org/wiki/BLAKE_%28hash_function%29#BLAKE2_uses).

Here is the official reference implementation: https://github.com/BLAKE2

Here are some Python modules (wrappers or implementations):
 * https://github.com/buggywhip/blake2_py
 * https://github.com/dchest/pyblake2
 * https://github.com/darjeeling/python-blake2
msg263680 - (view) Author: Zooko Wilcox-O'Hearn (Zooko.Wilcox-O'Hearn) Date: 2016-04-18 18:41
Oh, just to be clear, I didn't mean to imply that BLAKE2 is _less_ safe than SHA-3. My best estimate is that BLAKE2 and SHA-3 are equivalently safe, and that either of them is safer than SHA-2, SHA-1, or MD5.
msg263682 - (view) Author: Alex Gaynor (alex) * (Python committer) Date: 2016-04-18 18:44
Right now all the hashlib algorithms are backed by OpenSSL. OpenSSL 1.1.0 will have blake2, so perhaps the right move is just to wait for that to drop in a few weeks?

Sadly many users with old OpenSSL versions still won't have blake2, but pretty quickly Windows and OS X users will get blake2!
msg263683 - (view) Author: Donald Stufft (dstufft) * (Python committer) Date: 2016-04-18 18:49
> Right now all the hashlib algorithms are backed by OpenSSL.

As far as I know, hashlib ships its own implementations of every guaranteed algorithm (currently md5, sha1, and sha2, presumably sha3 too once that gets added).

So I guess one question is whether we want to guarantee the existence of blake2 or not.
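
The distinction between guaranteed and merely available algorithms can be inspected at runtime. This is a minimal sketch using hashlib.algorithms_guaranteed and hashlib.algorithms_available; the describe() helper is hypothetical and its wording is illustrative.

```python
import hashlib


def describe(name):
    """Classify a hash name as guaranteed, available, or missing.

    Hypothetical helper for illustration only.
    """
    if name in hashlib.algorithms_guaranteed:
        return "guaranteed"
    if name in hashlib.algorithms_available:
        return "available (depends on the linked OpenSSL)"
    return "missing"


for name in ("md5", "sha256", "blake2b", "blake2s"):
    print(name, "->", describe(name))
```

On an interpreter that predates this issue's patch, the blake2 names would be reported as available only if the linked OpenSSL provides them.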
msg263687 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2016-04-18 19:26
I have SHA-3, SHAKE and BLAKE2s / BLAKE2b on my radar.

PEP 247 and the current API definition of the hashlib module are too limited for the new hashing algorithms. They lack variable output for SHAKE as well as salt and personalization for BLAKE. I started to work on PEP 452 while I was working on SHA-3 support for Python 3. The PEP is still work-in-progress. I can address the needs of BLAKE and submit the PEP to a formal review.
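
To make the missing parameters concrete, here is a minimal sketch of salt, personalization, and variable-length output. It assumes the keyword arguments of hashlib.blake2b and the length argument of shake_128.hexdigest() as they later shipped in Python 3.6; the byte strings are made-up example values.

```python
import hashlib

# Salt and personalization are keyword-only; for blake2b each is limited to
# blake2b.SALT_SIZE and blake2b.PERSON_SIZE bytes respectively.
tag = hashlib.blake2b(
    b"payload",
    digest_size=32,              # truncated output, chosen at construction time
    salt=b"per-record-salt",     # example value
    person=b"example-app-v1",    # example value
).hexdigest()

# A different personalization string gives an unrelated digest for the same input.
other = hashlib.blake2b(
    b"payload",
    digest_size=32,
    salt=b"per-record-salt",
    person=b"another-app-v1",
).hexdigest()

# SHAKE is an extendable-output function: the caller picks the length
# (in bytes) when reading the digest.
xof = hashlib.shake_128(b"payload").hexdigest(16)

print(tag != other, len(tag), len(xof))
```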

I'm already working on OpenSSL 1.1.0 support for Python. blake will be automatically supported with OpenSSL builds. See for yourself:

$ ./python 
Python 3.6.0a0 (default, Apr 18 2016, 21:16:54) 
[GCC 5.3.1 20160406 (Red Hat 5.3.1-6)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import hashlib
>>> import ssl
>>> ssl.OPENSSL_VERSION
'OpenSSL 1.1.0-pre5-dev  xx XXX xxxx'
>>> hashlib.new('BLAKE2s256').hexdigest()
'69217a3079908094e11121d042354a7c1f55b6482ca1a51e1b250dfd1ed0eef9'
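
The digest of the empty input shown above can be cross-checked without OpenSSL. This is a minimal sketch, assuming the built-in hashlib.blake2s constructor that later shipped in Python 3.6 rather than the OpenSSL name 'BLAKE2s256'.

```python
import hashlib

# Value copied from the interpreter session above.
EXPECTED = "69217a3079908094e11121d042354a7c1f55b6482ca1a51e1b250dfd1ed0eef9"

# blake2s defaults to a 32-byte (256-bit) digest, matching BLAKE2s256.
digest = hashlib.blake2s().hexdigest()
print(digest == EXPECTED)
```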


Donald:
I would love to add SHA-3 support to Python. After all I had a working patch for SHA-3 before NIST changed the padding. Now my old patch is worthless. The new reference implementations are either too complicated (spread across like twenty files for various optimizations) or are one-shot implementations. I'm still looking for a good implementation that supports incremental updates, copy and SHAKE at the same time.
msg263693 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2016-04-18 21:27
CPython should not attempt to make a judgement about the safety of a particular function.  We can only document if something has known issues.

We should only include things in the stdlib which are either (a) standard or (b) widely used regardless of being standard.  Given that librsync and RAR both support BLAKE2 and that it is defined by an RFC, I'd say it should go in.

The reference implementation appears to have Apache 2.0 licensed C code that we can fall back to when the OpenSSL in use does not include it.

PEP-452 seems to address the necessary API changes.
msg263696 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2016-04-18 21:51
BLAKE2 is under CC0 1.0 Universal, https://github.com/BLAKE2/libb2

OpenSSL 1.1.0 has no API to set salt, personal, digest length and key length (for keyed BLAKE2). I have asked Richard Salz and MJC if they'd accept a patch.

Or I could ask Dmitry Chestnykh if he is interested in submitting his pyblake2 module to the stdlib. It's under CC0. It looks pretty good except for his macro tricks. I can cope. :)
msg263703 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2016-04-19 02:15
Confirm that the CC license is valid/acceptable with Van L. first. That's
why I pointed at the Apache 2 code, already a known good license. :)

msg263853 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2016-04-20 17:24
First experimental version of blake2b and blake2s for Python 3.6:
https://github.com/tiran/cpython/commits/feature/blake2

The code is based on Dmitry Chestnykh's pyblake2 but modified to support argument clinic. I had to replace the macro magic with plain C code. The clinic is not able to deal with the macros. blake2s.c is auto-generated from blake2b.c
msg265137 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2016-05-08 13:10
Here is my first patch. I have tested it on X86_64 (m64 and m32) and ARMv7. It should compile on Windows, but I don't have a working Windows box to test on.

Dmitry, I have copied your documentation. Are you fine with that?

TODO:

- set BLAKE2_USE_SSE on win32 X86_64
- more test vectors for advanced use cases (salt, personal, tree hashing)
msg265141 - (view) Author: Dmitry Chestnykh (dchest) Date: 2016-05-08 14:02
Christian: yes, and I'm also happy that you kept the drawing of the hash tree, as it helps a lot with understanding the terminology.

I had a quick look at the patch and it looks good to me.

Some comments, which you can ignore:

In keyed hashing example there's:

+    >>> def verify(cookie, sig):
+    ...     good_sig = sign(cookie)
+    ...     if len(sig) != len(good_sig):
+    ...          return False
+    ...     # Use constant-time comparison to avoid timing attacks.
+    ...     result = 0
+    ...     for x, y in zip(sig, good_sig):
+    ...         result |= ord(x) ^ ord(y)
+    ...     return result == 0

Perhaps you can replace the comparison with hmac.compare_digest(sig, good_sig) now that we have it in Python.

+    *Salted hashing* (or just hashing) with BLAKE2 or any other general-purpose
+    cryptographic hash function, such as SHA-256, is not suitable for hashing
+    passwords.  See `BLAKE2 FAQ <https://blake2.net/#qa>`_ for more
+    information.

Maybe also point readers to hashlib.html#key-derivation
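
For readers following that pointer, here is a minimal sketch of password hashing with a key-derivation function instead of a plain salted hash. It uses hashlib.pbkdf2_hmac; the helper names and the iteration count are illustrative assumptions, not recommendations taken from the patch.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # illustrative value; tune for your hardware


def hash_password(password, salt=None):
    """Derive a key from a password; returns (salt, derived_key).

    Hypothetical helper: a fresh random salt is generated when none is given.
    """
    if salt is None:
        salt = os.urandom(16)
    dk = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, dk


def check_password(password, salt, expected):
    """Recompute the derived key and compare it in constant time."""
    _, dk = hash_password(password, salt)
    return hmac.compare_digest(dk, expected)
```

The deliberately slow derivation is the point: a fast general-purpose hash such as BLAKE2 or SHA-256 makes brute-force guessing cheap.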


+On platforms with support for 64bit integer types (some 32bit platforms,
+all 64bit platforms), blake2b and blake2s are supported.

Theoretically, blake2s shouldn't require 64bit int, reference code only uses it for sizes -- is this something worth fixing? Are there platforms that have uint32_t, but not uint64_t?


@@ -162,7 +192,7 @@ class HashLibTestCase(unittest.TestCase):
         for cons in self.hash_constructors:
             h = cons()
             self.assertIsInstance(h.name, str)
-            self.assertIn(h.name, self.supported_hash_names)
+            #self.assertIn(h.name, self.supported_hash_names)
             self.assertEqual(h.name, hashlib.new(h.name).name)


Was this commented-out on purpose? Also, in setup.py:

+            #if os.uname().machine == "x86_64":
+                # Every x86_64 machine has at least SSE2.
+                #blake2_macros.append(('BLAKE2_USE_SSE', '1'))
msg265151 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2016-05-08 18:04
Thanks for your review, Dmitry.

I have replaced verify() with compare_digest().

Python requires a C89 compatible compiler and a 32bit architecture. C89 doesn't mandate 64bit integers. As far as I remember there is (or was) one buildbot with a compiler that doesn't have 64bit ints, on an old SPARC system. All major platforms have 64bit ints. I might modify the implementation when the patch has landed.

#self.assertIn(h.name, self.supported_hash_names)
I now check for guaranteed and eventually supported hashes.

SSE is enabled on x86_64. I forgot to remove the comments.


The test suite is missing tests for salt, personal and tree hashing. I have asked Zooko and JPA for vectors.
msg265159 - (view) Author: Dmitry Chestnykh (dchest) Date: 2016-05-08 19:08
> I have replaced verify() with compare_digest().

+    >>> compare_digesty(cookie, '0102030405060708090a0b0c0d0e0f00')

Typo here. Also, this doesn't look like it compares the digest. Maybe you can keep the verify() function, but make it use compare_digest() -- this looks more clear to me:

>>> def verify(cookie, sig):
...     good_sig = sign(cookie)
...     return compare_digest(good_sig, sig)
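
Spelled out as a self-contained sketch, the keyed-hashing example could look like this. It assumes the key and digest_size arguments of hashlib.blake2b as they later shipped in Python 3.6; SECRET_KEY and AUTH_SIZE are made-up example values.

```python
import hashlib
from hmac import compare_digest

SECRET_KEY = b"pseudorandomly generated server secret key"  # example value
AUTH_SIZE = 16  # digest length in bytes, example value


def sign(cookie):
    """Return a hex authentication tag for the cookie using keyed BLAKE2b."""
    h = hashlib.blake2b(cookie, digest_size=AUTH_SIZE, key=SECRET_KEY)
    return h.hexdigest()


def verify(cookie, sig):
    """Check a tag with a constant-time comparison to avoid timing attacks."""
    good_sig = sign(cookie)
    return compare_digest(good_sig, sig)


cookie = b"user-alice"
print(verify(cookie, sign(cookie)))
```

Keeping verify() as a named function makes it obvious that the comparison is between a freshly computed tag and the one supplied by the client.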


> Python requires a C89 compatible compiler and a 32bit architecture. C89 doesn't mandate 64bit integers. As far as I remember there is (or was) one buildbot with a compiler that doesn't have 64bit ints, on an old SPARC system. All major platforms have 64bit ints. I might modify the implementation when the patch has landed.

Oh, I see. Thanks!
msg266910 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2016-06-02 18:46
New patch:

- I moved the test vectors out of the repos. They are currently hosted on github. I'll move them to pythontest infra later.

- I dropped special case for systems w/o uint64. Python 3.6+ requires 64 bit int types to compile.
msg273249 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2016-08-20 22:37
Patch 4 uses the latest revision of the reference implementation.
msg274363 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-09-04 13:59
Maybe call the RST file hashlib-blake2.rst (hyphen, not dot), otherwise it looks like it is a separate submodule under the hashlib package.

Also there seems to be a versionadded tag missing in the documentation.

In setup.py, the os.uname().machine check seems unsafe for cross-compilation, though I am not certain, and I don’t have a better suggestion.
msg274364 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2016-09-04 15:11
Thanks for your review. I have addressed your points and updated/fixed/renamed documentation and comments.
msg274615 - (view) Author: Roundup Robot (python-dev) Date: 2016-09-06 20:03
New changeset 4969f6d343b1 by Christian Heimes in branch 'default':
Issue #26798: Add BLAKE2 (blake2b and blake2s) to hashlib.
https://hg.python.org/cpython/rev/4969f6d343b1
msg274632 - (view) Author: Roundup Robot (python-dev) Date: 2016-09-06 21:18
New changeset be6f3449ac13 by Christian Heimes in branch 'default':
Issue #26798: for loop initial declarations are only allowed in C99 or C11 mode
https://hg.python.org/cpython/rev/be6f3449ac13
msg274640 - (view) Author: Roundup Robot (python-dev) Date: 2016-09-06 22:10
New changeset 5c31599de76a by Christian Heimes in branch 'default':
Issue #26798: for loop initial declarations, take 2
https://hg.python.org/cpython/rev/5c31599de76a
msg274643 - (view) Author: Roundup Robot (python-dev) Date: 2016-09-06 22:32
New changeset a1e032dbcf86 by Christian Heimes in branch 'default':
Issue #26798: for loop initial declarations, take 3
https://hg.python.org/cpython/rev/a1e032dbcf86
msg274671 - (view) Author: Roundup Robot (python-dev) Date: 2016-09-06 23:45
New changeset afa5a16456ed by Christian Heimes in branch 'default':
Issue #26798: Hello Winndows, my old friend. I've come to fix blake2 for you again.
https://hg.python.org/cpython/rev/afa5a16456ed
msg274960 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-09-08 03:29
Seems the test fails if the installed Python tree is not writable:
http://buildbot.python.org/all/builders/x86%20Gentoo%20Installed%20with%20X%203.x/builds/1005/steps/test/logs/stdio
======================================================================
ERROR: test_blake2b_vectors (test.test_hashlib.HashLibTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/buildbot/buildarea/3.x.ware-gentoo-x86.installed/build/target/lib/python3.6/test/test_hashlib.py", line 537, in test_blake2b_vectors
    for msg, key, md in read_vectors('blake2b'):
  File "/buildbot/buildarea/3.x.ware-gentoo-x86.installed/build/target/lib/python3.6/test/test_hashlib.py", line 50, in read_vectors
    with support.open_urlresource(URL.format(hash_name)) as f:
  File "/buildbot/buildarea/3.x.ware-gentoo-x86.installed/build/target/lib/python3.6/test/support/__init__.py", line 1061, in open_urlresource
    with open(fn, "wb") as out:
PermissionError: [Errno 13] Permission denied: '/buildbot/buildarea/3.x.ware-gentoo-x86.installed/build/target/lib/python3.6/test/data/blake2b.txt'

======================================================================
ERROR: test_blake2s_vectors (test.test_hashlib.HashLibTestCase)
msg274974 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-09-08 06:09
Looks like Issue 28001 is open about the same problem. But I presume other tests use open_urlresource(), so why don’t they also fail?
msg274986 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2016-09-08 09:00
Other tests catch OSError and HTTPError. I have changed read_vectors() to do the same for now.
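
The general pattern is to turn a failed resource fetch into a skipped test instead of an error. This is a minimal sketch with a hypothetical fetch_or_skip() helper built on urllib; it is not the actual read_vectors() code, which goes through support.open_urlresource().

```python
import unittest
import urllib.error
import urllib.request


def fetch_or_skip(url, timeout=10):
    """Return the bytes at url, or skip the calling test if it is unreachable.

    Hypothetical helper for illustration. HTTPError is already a subclass of
    OSError (via URLError); it is listed separately only for readability.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as f:
            return f.read()
    except (OSError, urllib.error.HTTPError) as exc:
        raise unittest.SkipTest("resource {!r} unavailable: {}".format(url, exc))
```

Inside a TestCase method, a raised unittest.SkipTest is reported as a skip, so an unreachable server or a read-only install no longer shows up as a failure.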
msg274987 - (view) Author: Roundup Robot (python-dev) Date: 2016-09-08 09:07
New changeset 46b34706eb41 by Christian Heimes in branch 'default':
Issue 26798: fetch OSError and HTTPException like other tests that use open_urlresource.
https://hg.python.org/cpython/rev/46b34706eb41
msg275001 - (view) Author: Roundup Robot (python-dev) Date: 2016-09-08 11:41
New changeset fa89fff0b52c by Christian Heimes in branch 'default':
Issue #26798: Coverity complains about potential memcpy() of overlapped regions. It doesn't hurt to use memmove() here. CID 1372514 / CID 1372515. Upstream https://github.com/BLAKE2/BLAKE2/issues/32
https://hg.python.org/cpython/rev/fa89fff0b52c
History
Date | User | Action | Args
2016-09-08 11:41:00 | python-dev | set | messages: + msg275001
2016-09-08 09:50:43 | christian.heimes | set | status: open -> closed; resolution: fixed; stage: patch review -> resolved
2016-09-08 09:07:00 | python-dev | set | messages: + msg274987
2016-09-08 09:00:54 | christian.heimes | set | messages: + msg274986
2016-09-08 06:09:54 | martin.panter | set | messages: + msg274974
2016-09-08 03:29:56 | martin.panter | set | messages: + msg274960
2016-09-06 23:45:44 | python-dev | set | messages: + msg274671
2016-09-06 22:32:15 | python-dev | set | messages: + msg274643
2016-09-06 22:10:22 | python-dev | set | messages: + msg274640
2016-09-06 21:18:14 | python-dev | set | messages: + msg274632
2016-09-06 20:03:36 | python-dev | set | nosy: + python-dev; messages: + msg274615
2016-09-04 15:12:20 | christian.heimes | set | files: + BLAKE2-hash-algorithm-for-CPython-5.patch; messages: + msg274364
2016-09-04 13:59:59 | martin.panter | set | nosy: + martin.panter; messages: + msg274363
2016-08-20 22:38:00 | christian.heimes | set | files: + BLAKE2-hash-algorithm-for-CPython-4.patch; messages: + msg273249
2016-06-12 11:21:26 | christian.heimes | set | assignee: christian.heimes ->
2016-06-02 18:47:21 | christian.heimes | set | files: + BLAKE2-hash-algorithm-for-CPython-3.patch; messages: + msg266910
2016-05-08 19:08:32 | dchest | set | messages: + msg265159
2016-05-08 18:04:23 | christian.heimes | set | files: + BLAKE2-hash-algorithm-for-CPython-2.patch; messages: + msg265151
2016-05-08 14:02:24 | dchest | set | nosy: + dchest; messages: + msg265141
2016-05-08 13:10:49 | christian.heimes | set | files: + BLAKE2-hash-algorithm-for-CPython.patch; keywords: + patch; messages: + msg265137; stage: needs patch -> patch review
2016-04-20 17:24:32 | christian.heimes | set | messages: + msg263853
2016-04-19 02:15:06 | gregory.p.smith | set | messages: + msg263703
2016-04-18 21:51:09 | christian.heimes | set | messages: + msg263696
2016-04-18 21:27:30 | gregory.p.smith | set | versions: + Python 3.6; messages: + msg263693; assignee: christian.heimes; type: enhancement; stage: needs patch
2016-04-18 19:26:55 | christian.heimes | set | messages: + msg263687
2016-04-18 18:53:34 | ned.deily | set | nosy: + gregory.p.smith, christian.heimes
2016-04-18 18:49:21 | dstufft | set | messages: + msg263683
2016-04-18 18:44:21 | alex | set | nosy: + alex; messages: + msg263682
2016-04-18 18:41:15 | Zooko.Wilcox-O'Hearn | set | messages: + msg263680
2016-04-18 18:40:12 | Zooko.Wilcox-O'Hearn | create |