classification
Title: ctypes: memoryview gives incorrect PEP3118 format strings for both packed and unpacked structs
Type: behavior Stage: patch review
Components: ctypes Versions: Python 3.8, Python 3.7, Python 2.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: belopolsky Nosy List: Eric.Wieser, aerojockey, amaury.forgeotdarc, belopolsky, meador.inge, skrah, teoliphant, terry.reedy
Priority: normal Keywords: patch

Created on 2018-02-06 05:27 by Eric.Wieser, last changed 2019-04-21 07:51 by skrah.

Pull Requests
URL Status Linked Edit
PR 5561 open Eric.Wieser, 2018-02-06 05:33
Messages (7)
msg311705 - (view) Author: Eric Wieser (Eric.Wieser) * Date: 2018-02-06 05:27
Discovered [here](https://github.com/numpy/numpy/issues/10528)

Consider the following structure, and a memoryview created from it:

    class foo(ctypes.Structure):
        _fields_ = [('one', ctypes.c_uint8),
                    ('two', ctypes.c_uint32)]
    f = foo()
    mf = memoryview(f)

We'd expect this to insert padding, and it does:

>>> mf.itemsize
8

But that padding doesn't show up in the format string:

>>> mf.format
'T{<B:one:<I:two:}'

That format string describes the _packed_ version of the struct, with the `two` field starting at offset 1, based on the `struct` documentation on how `<` should be interpreted:

> No padding is added when using non-native size and alignment, e.g. with ‘<’, ‘>’

But ctypes doesn't even get it right for packed structs:


    class foop(ctypes.Structure):
        _fields_ = [('one', ctypes.c_uint8),
                    ('two', ctypes.c_uint32)]
        _pack_ = 1
    f = foo()
    mf = memoryview(f)

The size is what we'd expect:

>>> mf.itemsize
5

But the format is garbage:

>>> mf.format
'B'  # sizeof(byte) == 5!?
msg321279 - (view) Author: Eric Wieser (Eric.Wieser) * Date: 2018-07-08 17:37
Pinging, as recommended by https://devguide.python.org/pullrequest/#reviewing. PEP3118 as a protocol is far less useful if the canonical implementation is non-compliant.
msg321280 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2018-07-08 18:04
Unfortunately, the PEP authors did very little in terms of implementing the PEP and neither CPython nor numpy has a fully compliant implementation.
msg321730 - (view) Author: Carl Banks (aerojockey) Date: 2018-07-16 10:30
I guess I'll weigh in since I was pinged.

I agree with the approach in the patch.  All the memoryview does is to use the format field verbatim from the underlying buffer, so if the format field is inaccurate then the only thing to do is to fix the object providing the buffer.

Since the format field is only there for interpretation of the data, and is not used to calculate of itemsizes or strides anywhere as far as I know, it's a fairly low-risk change.

However, the patch still leaves ctypes inaccurate for the case of unions.  It should be fairly simple to modify the code to use a format of "B<size>" for unions, so that it at least matches the itemsize, even if the type information is lost.

(As an aside, let me point out that I did not actually write or advocate for the PEP; for some reason my name was added to it even though I all I did was to provide feedback.)
msg322225 - (view) Author: Eric Wieser (Eric.Wieser) * Date: 2018-07-23 14:19
> It should be fairly simple to modify the code to use a format of "B<size>" for unions, so that it at least matches the itemsize

Seems reasonable, although:

* I think it should be "<size>B" or "(<size>)B"
* I'd rather leave that for a later patch. While it would be correct, it's still not correct enough to be that useful, since ultimately the union layout is still lost. I'd prefer to focus on fixing the part of the PEPE3118 implementation that is most useful, rather than committing to fixing the whole thing all at once.
msg340234 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2019-04-14 20:38
Fixing one case is better than fixing no cases.
msg340603 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2019-04-21 07:51
Since Terry added me: Yes, this is clearly a bug, but it is a ctypes issue and not a memoryview issue.

ctypes issues unfortunately tend to take some time until someone reviews.
History
Date User Action Args
2019-04-21 07:51:01skrahsetmessages: + msg340603
2019-04-14 20:38:47terry.reedysetnosy: + skrah, terry.reedy
messages: + msg340234
2019-04-14 18:31:05terry.reedysetversions: - Python 3.6
2018-07-23 14:19:25Eric.Wiesersetmessages: + msg322225
2018-07-16 10:30:42aerojockeysetmessages: + msg321730
2018-07-08 18:06:20belopolskysetassignee: belopolsky
2018-07-08 18:04:40belopolskysetnosy: + teoliphant, aerojockey
messages: + msg321280
2018-07-08 17:37:15Eric.Wiesersetmessages: + msg321279
2018-02-10 03:39:11terry.reedysetnosy: + amaury.forgeotdarc, belopolsky, meador.inge

versions: - Python 3.4, Python 3.5
2018-02-06 05:40:05Eric.Wiesersettype: behavior
2018-02-06 05:33:47Eric.Wiesersetkeywords: + patch
stage: patch review
pull_requests: + pull_request5383
2018-02-06 05:27:59Eric.Wiesercreate