This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author larry
Recipients Zooko.Wilcox-O'Hearn, christian.heimes, corona10, jstasiak, kmaork, larry, mgorny, oconnor663, xtreak
Date 2022-03-05.07:01:12
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1646463672.85.0.997884940707.issue39298@roundup.psfhosted.org>
In-reply-to
Content
Jack O'Connor:
> Was any of the experimental C extension code [...] useful to you?
> I was wondering if it could be possible to copy blake3module.c from
> there verbatim.

I concede I didn't even look at it.  The glue code to mate the library with the CPython runtime wasn't the hard part.  And, Python has its own way of working (Argument Clinic etc).  I did what seemed easiest, which
was to start with CPython's BLAKE2 support and hack it up.  Given that
I had it working in maybe half a day, this seems reasonable.

I should take a peek at your experimental build though, just to confirm
I got the interfaces right.


> The setup.py build there also has working Windows and NEON support.

The CPython build process doesn't use setup.py to build extensions on
Windows.  I haven't looked recently, but back in the day they were
built like any other DLL, using its own "project file".  This will
require someone to use the MSVS GUI to create a new project, add files,
etc etc.  I assume they can just use the .asm files for the SIMD
extensions so hopefully this will be straightforward.

As for NEON support, the technique I used theoretically should Just Work.
I look forward to hearing back from someone building on ARM.  (I have
a cross-compiler setup for building for MiSTer, which is an ARM platform
with NEON support.  But the cross-compiler is a royal PITA to use...
it's a Docker image, and requires a bunch of manual configuration,
and I haven't touched any of that in more than a year.)


You seem to have done at least a partial code review of my PR, given
your comments to follow.  I appreciate it!  I tried to add you as a
"reviewer" but GitHub wouldn't permit it--I assume this is some sort
of permissions problem.  Still, it'd be nice if you were able to do
future code reviews using the GitHub review tool; are you permitted to
use that for my PR? 

> - My derive_key_context parameter requires a string and refuses to
> accept bytes. This is consistent with our Rust and C APIs (though the C
> API does include a _raw version specifically for bindings, which we're
> using here).

I was considering going the other way with it actually, requiring bytes.
Note that Python has first-class support for hard-coded bytes strings:

b = blake3.blake3(derive_key_context=b"My Funny Valentine (1984)")

The C interface takes "char *", not a "wchar_t *", and this seemed like
the most direct and relatable way to reflect that.  But I'm not militant
about this, and I'm willing to change the interface to require an actual
string (as in Unicode).  I note that your C API already dictates that
Unicode be encoded as UTF-8, so we can do that, and if the encoding fails
the user can deal with it.


> - I've included an `AUTO` constant that provides a special value (-1)
> for the `max_threads` argument, and I explicitly don't support
> `max_threads=0`.

I can do that too; again, I prefer the 0 there, but I'm not militant about
it.  However, it would make sense to me if you had that constant somewhere
in the BLAKE3 C .h files, which AFAICT you currently don't.


> - I enforce that the `data` arguments are positional-only and that the
> other keyword arguments are keyword-only. I think this is consistent
> with the rest of hashlib.

I suspect hashlib is mostly like that, due to being chock full of
legacy code.  But I don't see why that's necessary.  I think permitting
"data" to be a named argument is fine.  So unless you have a strong
conviction about it--which I bet you don't--I'll leave it as
positional-or-keyword.

There are rare circumstances where positional-only arguments are useful;
this isn't one of them.


> - I include a `.reset()` method.

I don't mind adding that.


> - Unrelated to tests: I haven't made any attempt to zero memory in my
> `dealloc` function. But if that's what other hashlib functions do,
> then I'm certainly in favor of doing it here too.

I inherited that from the BLAKE2 code I carved up to make the BLAKE3
version.  And yeah, it made sense to me, so I kept it.


Christian Heimes:
> GH-31686 is a massive patch set. I'm feeling uncomfortable adding
> such much new code for a new hashing algorithm. Did you ask the
> Steering Council for approval?

I didn't.  Like most hashing algorithms, BLAKE3 doesn't allocate
memory and doesn't perform any I/O.  All it does is meditate on
the data you pass in, and write to various pre-allocated fixed-size
buffers.  As large codebases go this seems pretty harmless, almost inert.

The Modules/_blake3/impl directory is about 32kloc.  I note that the
Modules/_blake2/impl directory you checked in in 2016 is about 21kloc,
and you didn't require Steering Council (or BDFL) approval for that.

As (former) Steering Council member Barry Warsaw says: JFDI!


> The platform detection and compiler flag logic must be added to
> configure.ac instead of setup.py. Erlend and I are in the process
> of making setup.py optional. I plan to remove it entirely along
> with distutils in 3.12.

"must"?  What empowers you to make such an edict?

Currently extensions are built with setup.py, full stop.  There are
configuration settings in setup.py for extension modules--which my
patch extends to support "blake3"--but none of the actual building
work is done inside configure.

My patch works within the current build system, and I see nothing
inappropriate about how I solved the problem.  Furthermore,
I don't know anything about modifying configure.ac files, and I'm
not going to learn it just to get this patch merged.

If you feel strongly about it I encourage you to submit changes
to my PR.  Or, wait until the 3.12 development cycle, when you say
you plan to rewrite it all anyway.  But it's not necessary to move
the platform detection and compiler flag logic into configure
before merging my PR.
History
Date User Action Args
2022-03-05 07:01:12larrysetrecipients: + larry, christian.heimes, mgorny, Zooko.Wilcox-O'Hearn, jstasiak, oconnor663, corona10, xtreak, kmaork
2022-03-05 07:01:12larrysetmessageid: <1646463672.85.0.997884940707.issue39298@roundup.psfhosted.org>
2022-03-05 07:01:12larrylinkissue39298 messages
2022-03-05 07:01:12larrycreate