This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author oconnor663
Recipients Zooko.Wilcox-O'Hearn, christian.heimes, corona10, jstasiak, kmaork, larry, mgorny, oconnor663, xtreak
Date 2022-01-13.02:08:49
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <>
I was about to say the only missing feature was docstrings, but then I realized I hadn't included releasing the GIL. I've added that and pushed an update just now. Fingers crossed there's nothing else I've missed. I think it's in reasonably good shape, and I'd like to propose it for inclusion in 3.11.

However, I'm not very experienced with or the Python C API, so I might not be the best judge of shape. Here are some highlights for reviewers, where I think the implementation (mostly the build) could be shaky:

- Platform detection. In I assume that the target platform of the build is the same as the current interpreter's platform. So for example, if the current interpreter has sys.maxsize==2**31-1, I assume we're compiling for 32-bit. This clearly doesn't allow for any sort of cross-compilation, and in general I need feedback about whether there's a more correct way to obtain the target platform.

- Compiling assembly files. On Unix this is easy, because we can supply *.S files as `extra_objects` and GCC/Clang will do the right thing. But on Windows this isn't easy. Currently I shell out to vswhere.exe, ask it for the path to the latest version of the ml64.exe assembler, and then shell out to that to build .obj files. Then I pass those assembled .obj files as `extra_objects`. This feels awfully manual, and there's a good chance I've missed some better-supported way to do it. I assume we don't want to check in the .obj files?

- Does Python support the GNU ABI on Windows? We have assembly files for this in vendor/, but I'm not currently building them.

- Compiling intrinsics files for 32-bit x86. In this case, I create a `ccompiler.new_compiler()` for each intrinsics file, so that I can set the appropriate flags for each. This is relatively clean, but it leads to things getting rebuilt every single time, rather than participating in ` build` caching. Maybe nobody cares about this, but again it makes me think there might be a better-supported way to do it.

- blake3module.c contains an awful lot of gotos to handle allocation failure cases. Is this still considered a best practice? These are bug-prone, and I'm not sure how to test them.

- Existing hashlib implementations include an optimization where they avoid allocating an internal mutex until they see a long input and want to release the GIL. As a quirky side effect of this, they handle allocation failures for that mutex by falling back to the do-not-release-the-GIL codepath. That feels kind of complicated to me, and in my code I'm always allocating the mutex during initialization. This doesn't seem to make much of a performance difference when I measure it, but there might be use cases I haven't considered.

Here are some other API details that might be worth bikeshedding:

- The `max_threads` parameter is currently defined to take a special value, `blake3.AUTO`, to indicate that the implementation may use as many threads as it likes. (The C implementation doesn't include multithreading support, but it's API-compatible with the Rust implementation.) `blake3.AUTO` is currently a class attribute equal to -1. We may want to bikeshed this name or propose some other representation.

- BLAKE3 has three modes: regular hashing, keyed hashing, and key derivation. The keyed hashing mode takes a 32-byte key, and the key derivation mode takes a context string. Calling the 32-byte key `key` seems good. Calling the context string `context` seems less good. Larry has pointed out before that lots of random things are called `context`, and readers might not understand what they're seeing. I currently call it `derive_key_context` instead, but we could bikeshed this.

- I check `itemsize` on input Py_buffers and throw an exception if it's anything other than 1. My test suite exercises this, see `test_int_array_fails`. However, some (all?) standard hashes don't do this check. For example:

    >>> from hashlib import sha256
    >>> import array
    >>> a = array.array("i", [255])
    >>> sha256(a).hexdigest()
    >>> sha256(bytes([0xff, 0, 0, 0])).hexdigest()

Here we can see sha256() hashing an array of int. On my machine, an int is 4 little-endian bytes, but of course this sort of thing isn't portable. The same array will result in a different SHA-256 output on a big-endian machine, or on a machine with ints of a different size. This seems undesirable, and I'm surprised that hashlib allows it. However, if there's some known compatibility reason why we have to allow it, I could remove this check.
Date User Action Args
2022-01-13 02:08:50oconnor663setrecipients: + oconnor663, larry, christian.heimes, mgorny, Zooko.Wilcox-O'Hearn, jstasiak, corona10, xtreak, kmaork
2022-01-13 02:08:50oconnor663setmessageid: <>
2022-01-13 02:08:50oconnor663linkissue39298 messages
2022-01-13 02:08:49oconnor663create