Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add BLAKE2 to hashlib #70985

Closed
ZookoWilcox-OHearn mannequin opened this issue Apr 18, 2016 · 27 comments
Closed

add BLAKE2 to hashlib #70985

ZookoWilcox-OHearn mannequin opened this issue Apr 18, 2016 · 27 comments
Labels
stdlib Python modules in the Lib dir type-feature A feature request or enhancement

Comments

@ZookoWilcox-OHearn
Copy link
Mannequin

ZookoWilcox-OHearn mannequin commented Apr 18, 2016

BPO 26798
Nosy @gpshead, @tiran, @alex, @vadmium, @dstufft, @dchest
Files
  • BLAKE2-hash-algorithm-for-CPython.patch
  • BLAKE2-hash-algorithm-for-CPython-2.patch
  • BLAKE2-hash-algorithm-for-CPython-3.patch
  • BLAKE2-hash-algorithm-for-CPython-4.patch
  • BLAKE2-hash-algorithm-for-CPython-5.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2016-09-08.09:50:43.095>
    created_at = <Date 2016-04-18.18:40:12.380>
    labels = ['type-feature', 'library']
    title = 'add BLAKE2 to hashlib'
    updated_at = <Date 2016-09-08.11:41:00.151>
    user = 'https://bugs.python.org/ZookoWilcox-OHearn'

    bugs.python.org fields:

    activity = <Date 2016-09-08.11:41:00.151>
    actor = 'python-dev'
    assignee = 'none'
    closed = True
    closed_date = <Date 2016-09-08.09:50:43.095>
    closer = 'christian.heimes'
    components = ['Library (Lib)']
    creation = <Date 2016-04-18.18:40:12.380>
    creator = "Zooko.Wilcox-O'Hearn"
    dependencies = []
    files = ['42778', '42783', '43106', '44175', '44358']
    hgrepos = []
    issue_num = 26798
    keywords = ['patch']
    message_count = 27.0
    messages = ['263679', '263680', '263682', '263683', '263687', '263693', '263696', '263703', '263853', '265137', '265141', '265151', '265159', '266910', '273249', '274363', '274364', '274615', '274632', '274640', '274643', '274671', '274960', '274974', '274986', '274987', '275001']
    nosy_count = 8.0
    nosy_names = ['gregory.p.smith', 'christian.heimes', 'alex', 'python-dev', 'martin.panter', 'dstufft', "Zooko.Wilcox-O'Hearn", 'dchest']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue26798'
    versions = ['Python 3.6']

    @ZookoWilcox-OHearn
    Copy link
    Mannequin Author

    ZookoWilcox-OHearn mannequin commented Apr 18, 2016

    (Disclosure: I'm one of the authors of BLAKE2.)

    Please include BLAKE2 in hashlib. It well-suited for hashing long inputs (e.g. files), because it is substantially faster than SHA-3, SHA-2, SHA-1, or MD5 while also being safer than SHA-2, SHA-1, or MD5.

    BLAKE2 and/or its relatives, BLAKE, ChaCha20, and Salsa20, have gotten extensive cryptographic peer review.

    It is widely used in modern applications and widely supported in modern crypto libraries (https://en.wikipedia.org/wiki/BLAKE_%28hash_function%29#BLAKE2_uses).

    Here is the official reference implementation: https://github.com/BLAKE2

    Here are some Python modules (wrappers or implementations):

    @ZookoWilcox-OHearn ZookoWilcox-OHearn mannequin added the stdlib Python modules in the Lib dir label Apr 18, 2016
    @ZookoWilcox-OHearn
    Copy link
    Mannequin Author

    ZookoWilcox-OHearn mannequin commented Apr 18, 2016

    Oh, just to be clear, I didn't mean to imply that BLAKE2 is _less_ safe than SHA-3. My best estimate is that BLAKE2 and SHA-3 are equivalently safe, and that either of them is safer than SHA-2, SHA-1, or MD5.

    @alex
    Copy link
    Member

    alex commented Apr 18, 2016

    Right now all the hashlib algorithms are backed by OpenSSL. OpenSSL 1.1.0 will have blake2, so perhaps the right move is just to wait for that to drop in a few weeks?

    Sadly many users with old OpenSSL's still won't have blake2, but pretty quickly Windows and OS X users will get blake2!

    @dstufft
    Copy link
    Member

    dstufft commented Apr 18, 2016

    Right now all the hashlib algorithms are backed by OpenSSL.

    As far as I know, hashlib ships it's own implementations of anything that is a guaranteed algorithms (currently md5, sha1, and sha2, presumably sha3 too once that gets added).

    So I guess one question is whether we want to guarantee the existence of blake2 or not.

    @tiran
    Copy link
    Member

    tiran commented Apr 18, 2016

    I have SHA-3, SHAKE and BLAKE2s / BLAKE2b on my radar.

    PEP-247 and the current API definition of the hashlib module is too limited for the new hashing algorithms. It lacks variable output for SHAKE as well as salt and personalization for BLAKE. I started to work on PEP-452 while I was working on SHA-3 support for Python 3. The PEP is still work-in-progress. I can address the needs of BLAKE and submit the PEP to a formal review.

    I'm already working on OpenSSL 1.1.0 support for Python. blake will be automatically supported with OpenSSL builds. See for yourself:

    $ ./python 
    Python 3.6.0a0 (default, Apr 18 2016, 21:16:54) 
    [GCC 5.3.1 20160406 (Red Hat 5.3.1-6)] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import hashlib
    >>> import ssl
    >>> ssl.OPENSSL_VERSION
    'OpenSSL 1.1.0-pre5-dev  xx XXX xxxx'
    >>> hashlib.new('BLAKE2s256').hexdigest()
    '69217a3079908094e11121d042354a7c1f55b6482ca1a51e1b250dfd1ed0eef9'

    Donald:
    I love to add SHA-3 support to Python. After all I had a working patch for SHA-3 before NIST changed the padding. Now my old patch is worthless. The new reference implementations are either too complicated (spread across like twenty files for various optimizations) or are one-shot implementations. I'm still looking for a good implementation that supports incremental updates, copy and SHAKE at the same time.

    @gpshead
    Copy link
    Member

    gpshead commented Apr 18, 2016

    CPython should not attempt make a judgement about the safety of a particular function. We can only document if something has known issues.

    We should only include things in the stdlib which are either (a) standard or (b) widely used regardless of being standard. Given that librsync and RAR both support BLAKE2 and that it is defined by an RFC, I'd say it should go in.

    The reference implementation appears to have Apache 2.0 licensed C code would can fall back to when the OpenSSL in use does not include it.

    PEP-452 seems to address the necessary API changes.

    @gpshead gpshead added the type-feature A feature request or enhancement label Apr 18, 2016
    @tiran
    Copy link
    Member

    tiran commented Apr 18, 2016

    BLAKE2 is under CC0 1.0 Universal, https://github.com/BLAKE2/libb2

    OpenSSL 1.1.0 has no API to set salt, personal, digest length and key length (for keyed BLAKE2). I have asked Richard Salz and MJC if they'd accept a patch.

    Or I could ask Dmitry Chestnykh if he is interested to submit his pyblake2 module to the stdlib. It's under CC0. It looks pretty good except for his macro tricks. I can cope. :)

    @gpshead
    Copy link
    Member

    gpshead commented Apr 19, 2016

    Confirm that the CC license is valid/acceptable with Van L. first. That's
    why I pointed at the Apache 2 code, already a known good license. :)

    On Mon, Apr 18, 2016, 2:51 PM Christian Heimes <report@bugs.python.org>
    wrote:

    Christian Heimes added the comment:

    BLAKE2 is under CC0 1.0 Universal, https://github.com/BLAKE2/libb2

    OpenSSL 1.1.0 has no API to set salt, personal, digest length and key
    length (for keyed BLAKE2). I have asked Richard Salz and MJC if they'd
    accept a patch.

    Or I could ask Dmitry Chestnykh if he is interested to submit his pyblake2
    module to the stdlib. It's under CC0. It looks pretty good except for his
    macro tricks. I can cope. :)

    ----------


    Python tracker <report@bugs.python.org>
    <http://bugs.python.org/issue26798\>


    @tiran
    Copy link
    Member

    tiran commented Apr 20, 2016

    First experimental version of blake2b and blake2s for Python 3.6:
    https://github.com/tiran/cpython/commits/feature/blake2

    The code is based on Dmitry Chestnykh's pyblake2 but modified to support argument clinic. I had to replace the macro magic with plain C code. The clinic is not able to deal with the macros. blake2s.c is auto-generated from blake2b.c

    @tiran
    Copy link
    Member

    tiran commented May 8, 2016

    Here is my first patch. I have tested it on X86_64 (m64 and m32) and ARMv7. It should compile on Windows but I don't have a working Windows box on my box.

    Dmitry, I have copied your documentation. Are you fine with that?

    TODO:

    • set BLAKE2_USE_SSE on win32 X86_64
    • more test vectorss for advanced use cases (salt, personal, tree hashing)

    @dchest
    Copy link
    Mannequin

    dchest mannequin commented May 8, 2016

    Christian: yes, and I'm also happy that you kept the drawing of hash tree, as it helps a lot with understanding of terminology.

    I had a quick look at the patch and it looks good to me.

    Some comments, which you can ignore:

    In keyed hashing example there's:

    +    >>> def verify(cookie, sig):
    +    ...     good_sig = sign(cookie)
    +    ...     if len(sig) != len(good_sig):
    +    ...          return False
    +    ...     # Use constant-time comparison to avoid timing attacks.
    +    ...     result = 0
    +    ...     for x, y in zip(sig, good_sig):
    +    ...         result |= ord(x) ^ ord(y)
    +    ...     return result == 0

    Perhaps, you can replace comparison with hmac.compare_digest(sig, goodsig) now that we have it in Python.

    • Salted hashing (or just hashing) with BLAKE2 or any other general-purpose
    • cryptographic hash function, such as SHA-256, is not suitable for hashing
    • passwords. See BLAKE2 FAQ <https://blake2.net/#qa>_ for more
    • information.

    Maybe also point readers to hashlib.html#key-derivation

    +On platforms with support for 64bit integer types (some 32bit platforms,
    +all 64bit platforms), blake2b and blake2s are supported.

    Theoretically, blake2s shouldn't require 64bit int, reference code only uses it for sizes -- is this something worth fixing? Are there platforms that have uint32_t, but not uint64_t?

    @@ -162,7 +192,7 @@ class HashLibTestCase(unittest.TestCase):
             for cons in self.hash_constructors:
                 h = cons()
                 self.assertIsInstance(h.name, str)
    -            self.assertIn(h.name, self.supported_hash_names)
    +            #self.assertIn(h.name, self.supported_hash_names)
                 self.assertEqual(h.name, hashlib.new(h.name).name)

    Was this commented-out on purpose? Also, in setup.py:

    + #if os.uname().machine == "x86_64":
    + # Every x86_64 machine has at least SSE2.
    + #blake2_macros.append(('BLAKE2_USE_SSE', '1'))

    @tiran
    Copy link
    Member

    tiran commented May 8, 2016

    Thanks for your review, Dmitry.

    I have replaced verify() with compare_digest().

    Python requires a C89 compatible compiler and 32bit architecture. C89 doesn't mandate 64bit integers. As far as I remember there is (or was) one buildbot with a compiler, that doesn't have 64 ints on an old SPARC system. All major platforms have 64bit ints. I might modify the implementation when the patch has landed.

    #self.assertIn(h.name, self.supported_hash_names)
    I now check for guaranteed and eventually supported hashes.

    SSE is enabled on X64_86. I forgot to remove the comments.

    The test suite is missing tests for salt, personal and tree hashing. I have asked Zooko and JPA for vectors.

    @dchest
    Copy link
    Mannequin

    dchest mannequin commented May 8, 2016

    I have replaced verify() with compare_digest().

    +    >>> compare_digesty(cookie, '0102030405060708090a0b0c0d0e0f00')

    Typo here. Also, this doesn't look like it compares the digest. Maybe you can keep the verify() function, but make it use compare_digest() -- this looks more clear to me:

    >>> def verify(cookie, sig):
    ...     good_sig = sign(cookie)
    ...     return compare_digest(goodsig, sig)

    Python requires a C89 compatible compiler and 32bit architecture. C89 doesn't mandate 64bit integers. As far as I remember there is (or was) one buildbot with a compiler, that doesn't have 64 ints on an old SPARC system. All major platforms have 64bit ints. I might modify the implementation when the patch has landed.

    Oh, I see. Thanks!

    @tiran
    Copy link
    Member

    tiran commented Jun 2, 2016

    New patch:

    • I moved the test vectors out of the repos. They are currently hosted on github. I'll move them to pythontest infra later.

    • I dropped special case for systems w/o uint64. Python 3.6+ requires 64 bit int types to compile.

    @tiran tiran removed their assignment Jun 12, 2016
    @tiran
    Copy link
    Member

    tiran commented Aug 20, 2016

    Patch 4 uses the latest revision of the reference implementation.

    @vadmium
    Copy link
    Member

    vadmium commented Sep 4, 2016

    Maybe call the RST file hashlib-blake2.rst (hyphen, not dot), otherwise it looks like it is a separate submodule under the hashlib package.

    Also there seems to be a versionadded tag missing in the documentation.

    In setup.py, the os.uname().machine check seems unsafe for cross-compilation, though I am not certain, and I don’t have a better suggestion.

    @tiran
    Copy link
    Member

    tiran commented Sep 4, 2016

    Thanks for your review. I have addressed your points and updated/fixed/renamed documentation and comments.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Sep 6, 2016

    New changeset 4969f6d343b1 by Christian Heimes in branch 'default':
    Issue bpo-26798: Add BLAKE2 (blake2b and blake2s) to hashlib.
    https://hg.python.org/cpython/rev/4969f6d343b1

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Sep 6, 2016

    New changeset be6f3449ac13 by Christian Heimes in branch 'default':
    Issue bpo-26798: for loop initial declarations are only allowed in C99 or C11 mode
    https://hg.python.org/cpython/rev/be6f3449ac13

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Sep 6, 2016

    New changeset 5c31599de76a by Christian Heimes in branch 'default':
    Issue bpo-26798: for loop initial declarations, take 2
    https://hg.python.org/cpython/rev/5c31599de76a

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Sep 6, 2016

    New changeset a1e032dbcf86 by Christian Heimes in branch 'default':
    Issue bpo-26798: for loop initial declarations, take 3
    https://hg.python.org/cpython/rev/a1e032dbcf86

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Sep 6, 2016

    New changeset afa5a16456ed by Christian Heimes in branch 'default':
    Issue bpo-26798: Hello Winndows, my old friend. I've come to fix blake2 for you again.
    https://hg.python.org/cpython/rev/afa5a16456ed

    @vadmium
    Copy link
    Member

    vadmium commented Sep 8, 2016

    Seems the test fails if the installed Python tree is not writable:
    http://buildbot.python.org/all/builders/x86%20Gentoo%20Installed%20with%20X%203.x/builds/1005/steps/test/logs/stdio
    ======================================================================
    ERROR: test_blake2b_vectors (test.test_hashlib.HashLibTestCase)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "/buildbot/buildarea/3.x.ware-gentoo-x86.installed/build/target/lib/python3.6/test/test_hashlib.py", line 537, in test_blake2b_vectors
        for msg, key, md in read_vectors('blake2b'):
      File "/buildbot/buildarea/3.x.ware-gentoo-x86.installed/build/target/lib/python3.6/test/test_hashlib.py", line 50, in read_vectors
        with support.open_urlresource(URL.format(hash_name)) as f:
      File "/buildbot/buildarea/3.x.ware-gentoo-x86.installed/build/target/lib/python3.6/test/support/__init__.py", line 1061, in open_urlresource
        with open(fn, "wb") as out:
    PermissionError: [Errno 13] Permission denied: '/buildbot/buildarea/3.x.ware-gentoo-x86.installed/build/target/lib/python3.6/test/data/blake2b.txt'

    ======================================================================
    ERROR: test_blake2s_vectors (test.test_hashlib.HashLibTestCase)

    @vadmium
    Copy link
    Member

    vadmium commented Sep 8, 2016

    .
    Looks like bpo-28001 is open about the same problem. But I presume other tests use open_urlresource(), so why don’t they also fail?

    @tiran
    Copy link
    Member

    tiran commented Sep 8, 2016

    Other tests catch OSError and HTTPError. I have changed read_vectors() to do the same for now.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Sep 8, 2016

    New changeset 46b34706eb41 by Christian Heimes in branch 'default':
    bpo-26798: fetch OSError and HTTPException like other tests that use open_urlresource.
    https://hg.python.org/cpython/rev/46b34706eb41

    @tiran tiran closed this as completed Sep 8, 2016
    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Sep 8, 2016

    New changeset fa89fff0b52c by Christian Heimes in branch 'default':
    Issue bpo-26798: Coverity complains about potential memcpy() of overlapped regions. It doesn't hurt to use memmove() here. CID 1372514 / CID 1372515. Upstream BLAKE2/BLAKE2#32
    https://hg.python.org/cpython/rev/fa89fff0b52c

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    stdlib Python modules in the Lib dir type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    5 participants