
Author cowlicks
Recipients Ramchandra Apte, abarnert, christian.heimes, cowlicks, georg.brandl, gvanrossum, josh.r, martin.panter, pitrou, rhettinger, serhiy.storchaka, socketpair, terry.reedy, vstinner
Date 2016-01-11.14:58:49
Message-id <1452524329.84.0.606485647705.issue19251@psf.upfronthosting.co.za>
Content
@Andrew Barnert
> Maybe if you're coming to Python from...
I'm not sure if you're trying to argue that my expectations are unusual. Python is my first programming language. To reiterate: I expected CPython to support bitwise operations on binary data. I don't think that is so strange.

No, I have not looked at PyPI. What I did was have an idea to do this, and there happened to be an open bug on it that needed a patch, so I wrote one.

And yes, I realize NumPy can do this, but it is still a very large dependency.
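To make the expectation concrete, here is what bytes objects do today versus what the patch would enable (the second behavior is the proposal, not current CPython):

    a = b'\x0f\xf0'
    b = b'\xff\x00'

    # Today, bitwise operators are not defined for bytes:
    try:
        a ^ b
    except TypeError as e:
        print(e)  # unsupported operand type(s) for ^: 'bytes' and 'bytes'

    # With the patch, the same expression would return the
    # elementwise XOR: b'\xf0\xf0'.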

Anyway, here are some random projects which would look a lot nicer with this:

An implementation of the BLAKE2 hash function in pure Python. Consider this line:
https://github.com/buggywhip/blake2_py/blob/master/blake2.py#L234

self.h = [self.h[i] ^ v[i] ^ v[i+8] for i in range(8)]

Which would become something like:

self.h ^= v[:8] ^ v[8:]

Which is much easier to read and much faster.
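For comparison, the closest pure-Python route today is to round-trip through int, which is also what my benchmark below measures. A minimal sketch (the helper name xor_bytes is my own):

    def xor_bytes(a, b):
        # XOR two equal-length bytes objects by packing each into an
        # int, XORing, and unpacking back to bytes.
        n = int.from_bytes(a, 'little') ^ int.from_bytes(b, 'little')
        return n.to_bytes(len(a), 'little')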

Or consider this function from an AES implementation:
https://github.com/bozhu/AES-Python/blob/master/aes.py#L194-L201

    def __mix_single_column(self, a):
        # please see Sec 4.1.2 in The Design of Rijndael
        t = a[0] ^ a[1] ^ a[2] ^ a[3]
        u = a[0]
        a[0] ^= t ^ xtime(a[0] ^ a[1])
        a[1] ^= t ^ xtime(a[1] ^ a[2])
        a[2] ^= t ^ xtime(a[2] ^ a[3])
        a[3] ^= t ^ xtime(a[3] ^ u)

This would become something like (since there is no broadcasting, the single byte t has to be repeated to the operand length):

    def __mix_single_column(self, a):
        t = bytes([a[0] ^ a[1] ^ a[2] ^ a[3]]) * 4
        a ^= t ^ xtime(a ^ (a[1:] + a[:1]))

Clearer and faster. 
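That version assumes xtime is extended to accept bytes and operate elementwise; a hypothetical bytes-in, bytes-out version of the usual AES xtime (doubling in GF(2^8)) could look like:

    def xtime(bs):
        # Double each byte in GF(2^8): shift left, and reduce with the
        # AES polynomial 0x1B when the high bit overflows.
        return bytes(((b << 1) ^ 0x1B) & 0xFF if b & 0x80 else b << 1
                     for b in bs)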

Another piece of code this would improve:
https://github.com/mgoffin/keccak-python/blob/master/Keccak.py#L196-L209

These were easy to find, so I'm sure there are more. I think these examples demonstrate that, despite what people *should* be doing, they are writing code that could be substantially improved by this patch.

This does resemble NumPy's vectorized functions, but it is much more limited in scope since there is no broadcasting involved.
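For example (my reading of the patch's scope, not something it necessarily spells out): NumPy will happily broadcast a scalar across an array, while here both operands would have to be bytes of equal length:

    import numpy as np

    arr = np.frombuffer(b'abcd', dtype=np.uint8)
    print(bytes(arr ^ 0x20))  # b'ABCD' -- NumPy broadcasts the scalar

    # Under the patch (as I understand it) there is no analogue:
    # b'abcd' ^ 0x20 would simply not be supported.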

Here is a quick benchmark:

$ ./python -m timeit -n 100000 -s "a=b'a'*64; b=b'b'*64" "(int.from_bytes(a, 'little') ^ int.from_bytes(b, 'little')).to_bytes(64, 'little')"
100000 loops, best of 3: 0.942 usec per loop

$ ./python -m timeit -n 100000 -s "a=b'a'*64; b=b'b'*64" "a ^ b"
100000 loops, best of 3: 0.041 usec per loop

NumPy is the slowest, but I'm probably doing something wrong; also, this one is in IPython, so I'm not timing the import:

In [13]: %timeit bytes(np.frombuffer(b'b'*64, dtype=np.int8) ^ np.frombuffer(b'a'*64, dtype=np.int8))
100000 loops, best of 3: 3.69 µs per loop

So the patched operator is about 20 times faster than the int round-trip, the fastest pure-Python alternative.