This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author oconnor663
Recipients Zooko.Wilcox-O'Hearn, christian.heimes, corona10, jstasiak, kmaork, larry, mgorny, oconnor663, xtreak
Date 2022-03-04.18:16:27
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1646417787.58.0.561666407281.issue39298@roundup.psfhosted.org>
In-reply-to
Content
Thanks Larry! Was any of the experimental C extension code under https://github.com/oconnor663/blake3-py/tree/master/c_impl useful to you? I was wondering if it could be possible to copy blake3module.c from there verbatim. The setup.py build there also has working Windows and NEON support.

I've patched the blake3-py test suite (which both the production Rust-based extension and that experimental C-based extension currently pass) to invoke the new hashlib implementation from your branch. You can find the full test output, and the procedure I used to run the tests, in this Gist https://gist.github.com/oconnor663/533048580b1c0f4a01d1d55f57f92792. Here are some differences:

- My derive_key_context parameter requires a string and refuses to accept bytes. This is consistent with our Rust and C APIs (though the C API does include a _raw version specifically for bindings, which we're using here). For a long discussion of why we prefer to do things this way, see https://github.com/BLAKE3-team/BLAKE3/issues/13. The short version is that any use case that requires arbitrary bytes for the context string is almost certainly violating the documented security requirement that the context string must be hardcoded.

- I've included an `AUTO` constant that provides a special value (-1) for the `max_threads` argument, and I explicitly don't support `max_threads=0`.

- I enforce that the `data` arguments are positional-only and that the other keyword arguments are keyword-only. I think this is consistent with the rest of hashlib.

- I include a `.reset()` method. This isn't particularly useful in the default case, where you might as well create a new hasher. But when `max_threads` is greater than 1 in the Rust implementation, the hasher owns a thread pool, and `.reset()` is currently the only way to reuse that pool. (A BLAKE3 hasher is also ~2 KB, somewhat larger than other hashers, so callers who are pinching pennies with their allocator traffic might prefer to reuse the object.)

- Unrelated to tests: I haven't made any attempt to zero memory in my `dealloc` function. But if that's what other hashlib functions do, then I'm certainly in favor of doing it here too.
History
Date User Action Args
2022-03-04 18:16:28oconnor663setrecipients: + oconnor663, larry, christian.heimes, mgorny, Zooko.Wilcox-O'Hearn, jstasiak, corona10, xtreak, kmaork
2022-03-04 18:16:27oconnor663setmessageid: <1646417787.58.0.561666407281.issue39298@roundup.psfhosted.org>
2022-03-04 18:16:27oconnor663linkissue39298 messages
2022-03-04 18:16:27oconnor663create