Author skrah
Recipients Arfrever, christian.heimes, georg.brandl, loewis, mark.dickinson, meador.inge, ncoghlan, pitrou, python-dev, skrah, vstinner
Date 2012-08-10.16:46:42
Message-id <20120810164643.GA21311@sleipnir.bytereef.org>
In-reply-to <1344614654.36.0.275586824245.issue15573@psf.upfronthosting.co.za>
Content
> 1. what does it mean that the formats of v and w are equal?

I'm using array and Py_buffer interchangeably since a Py_buffer struct
actually describes a multi-dimensional array. v and w are Py_buffer
structs.

So v.format must equal w.format, where format is a format string in
struct module syntax. The topic of this issue is to determine under
what circumstances two strings in struct module syntax are considered
equal.

> 2. Victor's clarification about this issue isn't about comparing
>    two arrays, but an array with a string object. So: when is an
>    array equal to some other (non-array) object?

>>> import array
>>> a = array.array('u', 'abc')
>>> v = memoryview(a)
>>> a == v
False

memoryview can compare against any object with a getbufferproc, in this
case array.array. memoryview_richcompare() calls PyObject_GetBuffer(other)
and proceeds to compare its own internal Py_buffer v against the obtained
Py_buffer w.
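
As a rough illustration of the mechanism described above (not taken from the patch), the snippet below compares a memoryview against two other buffer exporters. It uses the 'B' format, which memoryview understands natively, so the comparison is expected to succeed; the variable names are chosen only for this example.

```python
import array

a = array.array('B', [1, 2, 3])
v = memoryview(a)

# The Py_buffer fields discussed here are visible from Python.
fmt, size = v.format, v.itemsize

# Both operands export a buffer, so memoryview can compare against them.
eq_array = (v == a)
eq_bytes = (v == b'\x01\x02\x03')

print(fmt, size, eq_array, eq_bytes)
```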

When v.format == w.format, the fix for unknown formats is trivial:
just allow the comparison using v.itemsize == w.itemsize.
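
A minimal Python sketch of that idea, assuming the comparison falls back to the raw bytes once the format strings are textually identical. raw_equal() is a hypothetical helper written for this illustration; it is not the C code in memoryview_richcompare().

```python
import array

def raw_equal(x, y):
    """Hypothetical helper: compare two buffer exporters as raw memory,
    but only if their format strings are textually identical."""
    v, w = memoryview(x), memoryview(y)
    if v.format != w.format or v.itemsize != w.itemsize:
        return False
    if v.shape != w.shape:
        return False
    # Identical format, itemsize and shape: the bytes decide.
    return v.tobytes() == w.tobytes()

same = raw_equal(array.array('B', [1, 2, 3]), array.array('B', [1, 2, 3]))
different_format = raw_equal(array.array('B', [1, 2, 3]),
                             array.array('h', [1, 2, 3]))
print(same, different_format)
```

Note that a byte-wise comparison like this one is not the same as a value comparison for every type: for floating-point formats, for instance, 0.0 and -0.0 have different byte patterns.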

However, the struct module format string syntax has multiple representations
for the exact same formats, which makes a general fmtcmp() function tricky
to write.
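
To make the problem concrete, the following sketch uses only the struct module to show that two textually different format strings describe the same layout. The variable names are illustrative.

```python
import struct

long_form = "1Q 1h 1h 0c"
short_form = "Q2h"

data = struct.pack(short_form, 1, 2, 3)

# Same size and same unpacked values ...
same_size = struct.calcsize(long_form) == struct.calcsize(short_form)
same_values = struct.unpack(long_form, data) == struct.unpack(short_form, data)

# ... but a plain string comparison says the formats differ.
same_text = (long_form == short_form)

print(same_size, same_values, same_text)
```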

Hence my proposal to demand a strict canonical form for PEP-3118 format
strings, which would be a proper subset of struct module format strings.

Example: "1Q 1h 1h 0c" must be written as "Q2h"
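
The sketch below shows one way such a canonical form could be computed in Python. It is only an illustration under stated assumptions and not the attached patch: canonical() is a hypothetical name, it handles only plain single-character type codes (no 's' or 'p', whose count is a length rather than a repeat), and it refuses zero repeat counts on multi-byte types because those can introduce alignment padding in native mode.

```python
import re

# Hypothetical helper for illustration; not the patch attached to the issue.
_TOKEN = re.compile(r"(\d*)([xcbB?hHiIlLqQnNefdP])")
_BYTE_SIZED = set("xcbB?")  # zero repeats of these never add padding

def canonical(fmt):
    """Return a normalized spelling of a restricted struct format string."""
    chunks = fmt.split()              # whitespace between items is ignored
    prefix = ""
    if chunks and chunks[0][0] in "@=<>!":
        prefix, chunks[0] = chunks[0][0], chunks[0][1:]
    items = []                        # list of [count, code]
    for chunk in chunks:
        pos = 0
        while pos < len(chunk):
            m = _TOKEN.match(chunk, pos)
            if m is None:
                raise ValueError("unsupported format: %r" % fmt)
            pos = m.end()
            count = int(m.group(1)) if m.group(1) else 1
            code = m.group(2)
            if count == 0:
                if code in _BYTE_SIZED:
                    continue          # contributes neither items nor padding
                raise ValueError("zero count may affect alignment: %r" % fmt)
            if items and items[-1][1] == code:
                items[-1][0] += count # merge adjacent identical codes
            else:
                items.append([count, code])
    return prefix + "".join((str(n) if n > 1 else "") + c for n, c in items)

print(canonical("1Q 1h 1h 0c"))
```

With a helper of this kind, an exporter could normalize its format string once, and the comparison itself could remain a plain string comparison.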

The attached patch should largely implement this proposal. A canonical form
is perhaps not such a severe restriction, since format strings should usually
come from the exporting object. E.g. NumPy must translate its own internal
format to struct module syntax anyway.

Another option is to commit the patch now, even though it does not recognize
"1Q 1h 1h 0c" == "Q2h", and aim for a completely general fmtcmp() function
later.

IMO any general fmtcmp() function should also be reasonably fast.
History
Date User Action Args
2012-08-10 16:46:44 skrah set recipients: + skrah, loewis, georg.brandl, mark.dickinson, ncoghlan, pitrou, vstinner, christian.heimes, Arfrever, meador.inge, python-dev
2012-08-10 16:46:43 skrah link issue15573 messages
2012-08-10 16:46:42 skrah create