Author skrah
Recipients Arfrever, christian.heimes, georg.brandl, loewis, mark.dickinson, meador.inge, ncoghlan, pitrou, python-dev, skrah, vstinner
Date 2012-08-14.10:07:32
Message-id <1344938855.37.0.712662302607.issue15573@psf.upfronthosting.co.za>
Content
Here is a patch implementing by-value comparisons for all format strings
understood by the struct module. It is slightly longer than promised, since
larger arrays need a cached unpacking object for acceptable performance.
The fast path for identical single-element native format strings is
unchanged.

The new comparison rules are stated in the memoryview docs.
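
To illustrate what by-value comparison means in practice, here is a short
sketch. It assumes an interpreter with the patch applied (3.3 or later);
the variable names are only for illustration.

```python
from array import array

# Same values, different struct formats: the views compare by value.
unsigned_view = memoryview(array('B', [1, 2, 3]))
signed_view = memoryview(array('b', [1, 2, 3]))
same_values = (unsigned_view == signed_view)

# Integer and floating-point formats also compare equal when the values match.
int_view = memoryview(array('I', [1, 2, 3]))
double_view = memoryview(array('d', [1.0, 2.0, 3.0]))
int_vs_double = (int_view == double_view)

# A differing element makes the comparison fail, whatever the formats are.
other_view = memoryview(array('b', [1, 2, 4]))
different_values = (unsigned_view == other_view)

print(same_values, int_vs_double, different_values)
```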


For Georg's benefit, here are the memoryobject.c changes and the reasons why
I think the patch can go into 3.3:

  o cmp_structure() is split into cmp_format() and cmp_shape(), with
    unchanged semantics.

  o The new section "unpack using the struct module" is largely identical
    to existing parts of _testbuffer.c:

      - struct_get_unpacker()  ==> see _testbuffer.c:ndarray_as_list()

      - struct_unpack_single() ==> see base case in _testbuffer.c:unpack_rec()

  o The new code is called only in what was previously the default case of
    unpack_cmp().

  o The new code has 100% coverage.
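
As a rough Python-level picture of the slow path described above (not the
actual C code in memoryobject.c), the idea is to build one unpacker per
operand and reuse it for every element. The function name
unpack_compare() is hypothetical, and the sketch handles only
one-dimensional, contiguous views:

```python
import struct
from array import array


def unpack_compare(a, b):
    """Compare two 1-D memoryviews element by element via the struct module.

    Hypothetical helper for illustration only; the real implementation is C.
    """
    if a.shape != b.shape:
        return False
    # Create each unpacker once and reuse it for all elements, which is the
    # point of caching an unpacking object for larger arrays.
    unpacker_a = struct.Struct(a.format)
    unpacker_b = struct.Struct(b.format)
    items_a = unpacker_a.iter_unpack(a.tobytes())
    items_b = unpacker_b.iter_unpack(b.tobytes())
    return all(x == y for x, y in zip(items_a, items_b))


bytes_equal = unpack_compare(memoryview(array('B', [1, 2, 3])),
                             memoryview(array('b', [1, 2, 3])))
floats_equal = unpack_compare(memoryview(array('d', [1.0, 2.0])),
                              memoryview(array('f', [1.0, 2.0])))
shape_mismatch = unpack_compare(memoryview(array('B', [1, 2, 3])),
                                memoryview(array('B', [1, 2])))
```

Struct.iter_unpack() requires Python 3.4 or later; on 3.3 the same loop can
be written with Struct.unpack_from() and an explicit offset.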



Performance:
============

Identical format, bytes:
------------------------

$ ./python -m timeit -n 1000 -s "import array; x = array.array('B', [1]*10000); y = array.array('B', [1]*10000);" "x == y"
1000 loops, best of 3: 116 usec per loop

$ ./python -m timeit -n 1000 -s "import array; x = array.array('B', [1]*10000); y = array.array('B', [1]*10000); a = memoryview(x); b = memoryview(y)" "a == b"
1000 loops, best of 3: 49.1 usec per loop


Identical format, double:
-------------------------

$ ./python -m timeit -n 1000 -s "import array; x = array.array('d', [1.0]*10000); y = array.array('d', [1.0]*10000);" "x == y"
1000 loops, best of 3: 319 usec per loop

$ ./python -m timeit -n 1000 -s "import array; x = array.array('d', [1.0]*10000); y = array.array('d', [1.0]*10000); a = memoryview(x); b = memoryview(y)" "a == b"
1000 loops, best of 3: 65.7 usec per loop


Different format ('B', 'b'):
----------------------------

$ ./python -m timeit -n 100 -s "import array; x = array.array('B', [1]*10000); y = array.array('b', [1]*10000);" "x == y"
100 loops, best of 3: 131 usec per loop

$ ./python -m timeit -n 1000 -s "import array; x = array.array('B', [1]*10000); y = array.array('b', [1]*10000); a = memoryview(x); b = memoryview(y)" "a == b"
1000 loops, best of 3: 3.42 msec per loop


Different format ('d', 'f'):
----------------------------

$ ./python -m timeit -n 1000 -s "import array; x = array.array('d', [1.0]*10000); y = array.array('f', [1.0]*10000);" "x == y"
1000 loops, best of 3: 315 usec per loop

$ ./python -m timeit -n 1000 -s "import array; x = array.array('d', [1.0]*10000); y = array.array('f', [1.0]*10000); a = memoryview(x); b = memoryview(y)" "a == b"
1000 loops, best of 3: 3.59 msec per loop
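
For anyone who wants to rerun the four cases in one go, the command lines
above can be approximated with a small script. This is only a sketch: the
helper name time_pair() is made up, it fills every array with the integer 1
rather than 1.0, and the numbers will differ by machine and build.

```python
import timeit
from array import array


def time_pair(code_x, code_y, n=10000, number=1000, repeat=3):
    """Return best-of-`repeat` times in usec per loop for `x == y`,
    first on the arrays themselves, then on memoryviews of them.

    Hypothetical helper mirroring the timeit invocations above.
    """
    x = array(code_x, [1] * n)
    y = array(code_y, [1] * n)
    operands = {
        "array": {"x": x, "y": y},
        "memoryview": {"x": memoryview(x), "y": memoryview(y)},
    }
    results = {}
    for label, namespace in operands.items():
        timer = timeit.Timer("x == y", globals=namespace)
        best = min(timer.repeat(repeat=repeat, number=number))
        results[label] = best / number * 1e6  # seconds -> usec per loop
    return results


if __name__ == "__main__":
    for codes in (("B", "B"), ("d", "d"), ("B", "b"), ("d", "f")):
        timings = time_pair(*codes)
        print(codes, {k: round(v, 1) for k, v in timings.items()})
```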