This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: ctypes: memoryview gives incorrect PEP3118 format strings for both packed and unpacked structs
Type: behavior Stage: patch review
Components: ctypes Versions: Python 3.8, Python 3.7, Python 2.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: belopolsky Nosy List: Eric.Wieser, aerojockey, amaury.forgeotdarc, belopolsky, mattip, meador.inge, skrah, teoliphant, terry.reedy, vinay.sajip
Priority: normal Keywords: patch

Created on 2018-02-06 05:27 by Eric.Wieser, last changed 2022-04-11 14:58 by admin.

Pull Requests
URL Status Linked Edit
PR 5561 open Eric.Wieser, 2018-02-06 05:33
Messages (8)
msg311705 - (view) Author: Eric Wieser (Eric.Wieser) * Date: 2018-02-06 05:27
Discovered [here](https://github.com/numpy/numpy/issues/10528)

Consider the following structure, and a memoryview created from it:

    class foo(ctypes.Structure):
        _fields_ = [('one', ctypes.c_uint8),
                    ('two', ctypes.c_uint32)]
    f = foo()
    mf = memoryview(f)

We'd expect this to insert padding, and it does:

>>> mf.itemsize
8

But that padding doesn't show up in the format string:

>>> mf.format
'T{<B:one:<I:two:}'

That format string describes the _packed_ version of the struct, with the `two` field starting at offset 1, based on the `struct` documentation on how `<` should be interpreted:

> No padding is added when using non-native size and alignment, e.g. with ‘<’, ‘>’

But ctypes doesn't even get it right for packed structs:


    class foop(ctypes.Structure):
        _fields_ = [('one', ctypes.c_uint8),
                    ('two', ctypes.c_uint32)]
        _pack_ = 1
    f = foo()
    mf = memoryview(f)

The size is what we'd expect:

>>> mf.itemsize
5

But the format is garbage:

>>> mf.format
'B'  # sizeof(byte) == 5!?
msg321279 - (view) Author: Eric Wieser (Eric.Wieser) * Date: 2018-07-08 17:37
Pinging, as recommended by https://devguide.python.org/pullrequest/#reviewing. PEP3118 as a protocol is far less useful if the canonical implementation is non-compliant.
msg321280 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2018-07-08 18:04
Unfortunately, the PEP authors did very little in terms of implementing the PEP and neither CPython nor numpy has a fully compliant implementation.
msg321730 - (view) Author: Carl Banks (aerojockey) Date: 2018-07-16 10:30
I guess I'll weigh in since I was pinged.

I agree with the approach in the patch.  All the memoryview does is to use the format field verbatim from the underlying buffer, so if the format field is inaccurate then the only thing to do is to fix the object providing the buffer.

Since the format field is only there for interpretation of the data, and is not used to calculate of itemsizes or strides anywhere as far as I know, it's a fairly low-risk change.

However, the patch still leaves ctypes inaccurate for the case of unions.  It should be fairly simple to modify the code to use a format of "B<size>" for unions, so that it at least matches the itemsize, even if the type information is lost.

(As an aside, let me point out that I did not actually write or advocate for the PEP; for some reason my name was added to it even though I all I did was to provide feedback.)
msg322225 - (view) Author: Eric Wieser (Eric.Wieser) * Date: 2018-07-23 14:19
> It should be fairly simple to modify the code to use a format of "B<size>" for unions, so that it at least matches the itemsize

Seems reasonable, although:

* I think it should be "<size>B" or "(<size>)B"
* I'd rather leave that for a later patch. While it would be correct, it's still not correct enough to be that useful, since ultimately the union layout is still lost. I'd prefer to focus on fixing the part of the PEPE3118 implementation that is most useful, rather than committing to fixing the whole thing all at once.
msg340234 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2019-04-14 20:38
Fixing one case is better than fixing no cases.
msg340603 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2019-04-21 07:51
Since Terry added me: Yes, this is clearly a bug, but it is a ctypes issue and not a memoryview issue.

ctypes issues unfortunately tend to take some time until someone reviews.
msg358806 - (view) Author: mattip (mattip) * Date: 2019-12-23 08:08
Is there a ctypes or PEP-3118 core-dev who could review the issue and the PR solution?
History
Date User Action Args
2022-04-11 14:58:57adminsetgithub: 76961
2019-12-23 17:24:49ned.deilysetnosy: + vinay.sajip
2019-12-23 08:08:04mattipsetnosy: + mattip
messages: + msg358806
2019-04-21 07:51:01skrahsetmessages: + msg340603
2019-04-14 20:38:47terry.reedysetnosy: + skrah, terry.reedy
messages: + msg340234
2019-04-14 18:31:05terry.reedysetversions: - Python 3.6
2018-07-23 14:19:25Eric.Wiesersetmessages: + msg322225
2018-07-16 10:30:42aerojockeysetmessages: + msg321730
2018-07-08 18:06:20belopolskysetassignee: belopolsky
2018-07-08 18:04:40belopolskysetnosy: + teoliphant, aerojockey
messages: + msg321280
2018-07-08 17:37:15Eric.Wiesersetmessages: + msg321279
2018-02-10 03:39:11terry.reedysetnosy: + amaury.forgeotdarc, belopolsky, meador.inge

versions: - Python 3.4, Python 3.5
2018-02-06 05:40:05Eric.Wiesersettype: behavior
2018-02-06 05:33:47Eric.Wiesersetkeywords: + patch
stage: patch review
pull_requests: + pull_request5383
2018-02-06 05:27:59Eric.Wiesercreate