msg81823 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2009-02-12 21:39 |
Memoryview objects provide a structured view over a memory area, meaning
the length, indexing and slicing operations respect the itemsize:
>>> import array
>>> a = array.array('i', [1,2,3])
>>> m = memoryview(a)
>>> len(m)
3
>>> m.itemsize
4
>>> m.format
'i'
However, in some cases, you want the memoryview to behave as a chunk of
pure bytes regardless of the original object *and without making a
copy*. Therefore, it would be handy to be able to change the format of
the memoryview, or ask for a new memoryview with another format.
An example of use could be:
>>> a = array.array('i', [1,2,3])
>>> m = memoryview(a).with_format('B')
>>> len(m), m.itemsize, m.format
(12, 1, 'B')
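For reference, the requested behaviour can be sketched with memoryview.cast(), the method this thread eventually converges on. This assumes Python 3.3 or later; the with_format() name above was only a proposal and is not an existing method.

```python
import array

a = array.array('i', [1, 2, 3])
m = memoryview(a)        # structured view: 3 items of a.itemsize bytes each
flat = m.cast('B')       # unsigned-byte view over the same memory, no copy

# The byte view covers every byte of the array.
assert len(flat) == len(a) * a.itemsize
assert (flat.itemsize, flat.format) == (1, 'B')

# Writes through the byte view show up in the array, so no copy was made.
flat[:] = bytes(len(flat))
assert a.tolist() == [0, 0, 0]
```

The length is computed from a.itemsize rather than hard-coded, since the size of a C int depends on the platform.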
|
msg81824 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2009-02-12 21:47 |
(Another way to see it is as supplying a Python equivalent to the C
buffer API, with access to the raw Py_buffer)
|
msg81839 - (view) |
Author: Gregory P. Smith (gregory.p.smith) * |
Date: 2009-02-12 23:53 |
Agreed, this would be useful. See http://codereview.appspot.com/12470/show if anyone doesn't believe us.
;)
|
msg128486 - (view) |
Author: Xuanji Li (xuanji) * |
Date: 2011-02-13 12:09 |
Is this issue from 2 years ago still open? I checked the docs and it seems to be.
If it is, I would like to work on a patch and submit it soon.
|
msg128488 - (view) |
Author: Alyssa Coghlan (ncoghlan) * |
Date: 2011-02-13 13:07 |
It is, but keep issue 10181 in mind (since that may lead to some restructuring of the memoryview code, potentially leading to a need to update your patch).
|
msg135600 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2011-05-09 15:32 |
In the meantime I had to resort to dirty hacks in 1ac03e071d65 (such as using io.BytesIO.write(), which I know is implemented in C and doesn't care about item size).
At the minimum, a memoryview.getflatview() function would be nice (and probably easier to code than the generic version). Or a "flat" optional argument in the memoryview constructor.
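The io.BytesIO.write() workaround mentioned above can be illustrated roughly as follows. This is only a sketch of the idea, not the code from changeset 1ac03e071d65, and unlike a flat view it copies the data.

```python
import array
import io

a = array.array('i', [1, 2, 3])
buf = io.BytesIO()

# write() asks for a plain byte buffer, so the item size of the
# exporter is ignored and the return value counts bytes, not items.
written = buf.write(memoryview(a))
raw = buf.getvalue()

assert written == len(a) * a.itemsize
assert raw == a.tobytes()
```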
|
msg135601 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2011-05-09 15:35 |
Reading an int32 array as a raw byte string is useful, but the opposite is also useful.
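A small sketch of that opposite direction, assuming the memoryview.cast() method that later shipped in Python 3.3 (it did not exist when this message was written):

```python
import struct

# Three native C ints stored as plain bytes.
raw = bytearray(struct.pack('3i', 1, 2, 3))

# Reinterpret the byte buffer as ints without copying.
ints = memoryview(raw).cast('i')
assert ints.tolist() == [1, 2, 3]

# Assigning through the int view changes the underlying bytes.
ints[0] = 42
assert struct.unpack_from('i', raw)[0] == 42
```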
|
msg135976 - (view) |
Author: Mark Dickinson (mark.dickinson) * |
Date: 2011-05-14 15:47 |
Unassigning. Sorry; no time for this at the moment.
|
msg142820 - (view) |
Author: Stefan Krah (skrah) * |
Date: 2011-08-23 13:10 |
I think this would be useful and I'll try it out in features/pep-3118#memoryview.
Syntax options that I'd prefer:
a = array.array('i', [1,2,3])
m = memoryview(a, 'B')
Or go all the way and make memoryview take any flag:
a = array.array('i', [1,2,3])
m = memoryview(a, getbuf=PyBUF_SIMPLE)
This is what I currently do in _testbuffer.c:
>>> from _testbuffer import *
>>> import array
>>> a = array.array('i', [1,2,3])
>>> nd = ndarray(a, getbuf=PyBUF_SIMPLE)
>>> nd.format
''
>>> nd.len
12
>>> nd.shape
()
>>> nd.strides
()
>>> nd.itemsize # XXX array_getbuf should set this to 1.
4
We would need to fix various getbuffer() methods to adhere to
strict rules that I've posed here:
http://mail.scipy.org/pipermail/numpy-discussion/2011-August/058189.html
|
msg142821 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2011-08-23 13:24 |
> Or go all the way and make memoryview take any flag:
>
> a = array.array('i', [1,2,3])
> m = memoryview(a, getbuf=PyBUF_SIMPLE)
This is good for testing, but Python developers shouldn't have to know
about the low-level flags.
|
msg142826 - (view) |
Author: Stefan Krah (skrah) * |
Date: 2011-08-23 13:51 |
Antoine Pitrou <report@bugs.python.org> wrote:
> > Or go all the way and make memoryview take any flag:
> >
> > a = array.array('i', [1,2,3])
> > m = memoryview(a, getbuf=PyBUF_SIMPLE)
>
> This is good for testing, but Python developers shouldn't have to know
> about the low-level flags.
Hmm, indeed. How about:
1) memoryview(a, format='B')
Shadows a builtin function; annoying syntax highlighting in current Vim.
2) memoryview(a, fmt='B')
I'm fully expecting a comment about 'strpbrk' again, but I like it. :)
Also, we've to see about speed implications. My current version of memoryview
(not pushed yet to the public repo) also solves #10227, but is pretty sensitive
even to small changes.
|
msg142828 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2011-08-23 14:06 |
> Hmm, indeed. How about:
>
> 1) memoryview(a, format='B')
>
> Shadows a builtin function; annoying syntax highlighting in current Vim.
>
> 2) memoryview(a, fmt='B')
>
> I'm fully expecting a comment about 'strpbrk' again, but I like it. :)
I really prefer "format", it's the natural word to use there.
I don't think this is the only place where we shadow a builtin function.
There are probably variables named "dict" in many places.
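To make the shadowing point concrete: a parameter named "format" hides the builtin only inside the function that declares it. The flat_view() helper below is hypothetical and exists purely for illustration.

```python
import array

def flat_view(obj, format='B'):
    # Hypothetical helper: inside this body, 'format' is the parameter,
    # not the builtin function of the same name.
    return memoryview(obj).cast(format)

view = flat_view(array.array('i', [1, 2, 3]))
assert view.format == 'B'

# Outside the function, the builtin format() is untouched.
assert format(255, 'x') == 'ff'
```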
> Also, we've to see about speed implications. My current version of memoryview
> (not pushed yet to the public repo) also solves #10227, but is pretty sensitive
> even to small changes.
Well, solving #10227 would be nice, but I don't think it's critical
either.
|
msg142830 - (view) |
Author: Stefan Krah (skrah) * |
Date: 2011-08-23 14:28 |
Good, I'll use 'format'. I was mainly worried about the shadowing
issue.
|
msg142832 - (view) |
Author: Stefan Krah (skrah) * |
Date: 2011-08-23 15:15 |
Rethinking a bit: Casting to arbitrary formats might go a bit far.
Currently, the combination (format=NULL, shape=NULL) can serve as
a warning "This buffer has been cast to unsigned bytes".
If we allow casts from bytes to int32, we'll have (format="i", shape=x)
and consumers of that buffer have no indication that the original
exporter had a different format.
If you know what you are doing, fine. On the other hand following
the buffer paths in #12817 quickly turned into a very complex
maze of getbuffer requests.
So, an option would be to try out the cast to bytes first and
disallow other casts.
|
msg142833 - (view) |
Author: Alyssa Coghlan (ncoghlan) * |
Date: 2011-08-23 15:22 |
Casting to a flat 1-D array of bytes is reasonable (it's essentially saying 'look, just give me the raw data, it's on my own head if I stuff up the formatting').
However, requiring an explicit two-step process for any other casting (i.e. take a 1-D view, then a shaped view of that flat 1-D view) also sounds reasonable.
So I agree with Victor that 1-D bytes -> any shape/format and any shape/format -> 1-D bytes should be allowed, but I think we should hold off on allowing arbitrary transformations in a single step.
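The two-step rule described here matches how memoryview.cast() behaves in Python 3.3 and later, as far as I can tell. A minimal sketch:

```python
import array

a = array.array('i', [1, 2, 3, 4])
m = memoryview(a)

# A direct cast between two non-byte formats is rejected.
try:
    m.cast('H')
except TypeError:
    direct_cast_allowed = False
else:
    direct_cast_allowed = True
assert not direct_cast_allowed

# Step 1: any shape/format -> flat 1-D bytes.
flat = m.cast('B')

# Step 2: flat 1-D bytes -> any other format or shape.
shorts = flat.cast('H')
grid = flat.cast('i', shape=[2, 2])

assert len(shorts) * shorts.itemsize == len(flat)
assert grid.tolist() == [[1, 2], [3, 4]]
```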
|
msg142834 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2011-08-23 15:31 |
> However, requiring an explicit two-step process for any other casting
> (i.e. take a 1-D view, then a shaped view of that flat 1-D view) also
> sounds reasonable.
>
> So I agree with Victor that 1-D bytes -> any shape/format and any
> shape/format -> 1-D bytes should be allowed, but I think we should
> hold off on allowing arbitrary transformations in a single step.
Converting to 1-D bytes is my main motivation for this feature request,
so I'm fine with such a limitation.
The point is to be able to do in Python what we can do in C, take an
arbitrary buffer and handle it as pure bytes (for I/O or cryptography
purposes, for example).
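As an illustration of the cryptography use case, here is a sketch that hashes an array's memory through a flat byte view. It assumes memoryview.cast() from Python 3.3 or later; hashlib accepts any bytes-like object.

```python
import array
import hashlib

a = array.array('i', [1, 2, 3])

# Hash the raw memory of the array without first copying it to bytes.
digest = hashlib.sha256(memoryview(a).cast('B')).hexdigest()

# Same result as hashing an explicit copy of the data.
assert digest == hashlib.sha256(a.tobytes()).hexdigest()
```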
|
msg142842 - (view) |
Author: Stefan Krah (skrah) * |
Date: 2011-08-23 16:27 |
Nick Coghlan <report@bugs.python.org> wrote:
> So I agree with Victor that 1-D bytes -> any shape/format and any
> shape/format -> 1-D bytes should be allowed, but I think we should
> hold off on allowing arbitrary transformations in a single step.
1-D bytes -> any shape/format would work if everyone agrees on the
Numpy mailing list post that I linked to in an earlier message.
[Summary: PyBUF_SIMPLE may downcast any C-contiguous array to unsigned bytes.]
Otherwise a PyBUF_SIMPLE getbuffer call to the newly shaped memoryview
would be required to fail, and these calls are almost certain to occur
somewhere, e.g. in PyObject_AsWriteBuffer().
But then memoryview would also need a 'shape' parameter:
m = memoryview(x, format='L', shape=[3, 4])
In that case, making it a method might indeed be more clear to underline
that something extraordinary is going on:
m = memoryview(x).cast(format='L', shape=[3, 4])
It also takes away a potential speed loss for regular uses.
1-D bytes would then be defined as 'b', 'B' and 'c', I presume? Being able
to cast to 'c' would also solve certain memoryview index assignment problems
that arise if we opt for strict typing as the struct module does.
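A sketch of the method form with a shape argument, written against memoryview.cast() as it exists in Python 3.3 and later. The buffer contents are made up for the example, and the format is passed positionally.

```python
import struct

# Twelve native unsigned longs stored as plain bytes.
raw = bytearray(struct.pack('12L', *range(12)))

# 1-D bytes -> 3x4 array of unsigned longs, still without copying.
m = memoryview(raw).cast('L', shape=[3, 4])
assert m.ndim == 2
assert m.shape == (3, 4)
assert m.tolist() == [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]

# Casting to 'c' yields items that are bytes objects of length 1.
chars = memoryview(b'abc').cast('c')
assert chars[0] == b'a'
```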
|
msg143729 - (view) |
Author: Stefan Krah (skrah) * |
Date: 2011-09-08 14:56 |
The cast method is completely implemented over at #10181.
|
msg152256 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2012-01-29 20:13 |
Shouldn't this be closed in favour of #10181?
|
msg152259 - (view) |
Author: Stefan Krah (skrah) * |
Date: 2012-01-29 20:59 |
Yes, it's really superseded by #10181 now. I'm closing as 'duplicate',
since technically it'll be fixed once the patch for #10181 is committed.
|
msg154238 - (view) |
Author: Roundup Robot (python-dev) |
Date: 2012-02-25 11:25 |
New changeset 3f9b3b6f7ff0 by Stefan Krah in branch 'default':
- Issue #10181: New memoryview implementation fixes multiple ownership
http://hg.python.org/cpython/rev/3f9b3b6f7ff0
|
Date | User | Action | Args
2022-04-11 14:56:45 | admin | set | github: 49481
2012-02-25 11:25:29 | python-dev | set | nosy: + python-dev; messages: + msg154238
2012-01-29 20:59:16 | skrah | set | status: open -> closed; superseder: Problems with Py_buffer management in memoryobject.c (and elsewhere?); messages: + msg152259; dependencies: - Problems with Py_buffer management in memoryobject.c (and elsewhere?); resolution: duplicate; stage: needs patch -> resolved
2012-01-29 20:13:12 | pitrou | set | messages: + msg152256
2011-09-08 14:56:21 | skrah | set | dependencies: + Problems with Py_buffer management in memoryobject.c (and elsewhere?); messages: + msg143729
2011-08-23 16:27:03 | skrah | set | messages: + msg142842
2011-08-23 15:31:53 | pitrou | set | messages: + msg142834
2011-08-23 15:22:39 | ncoghlan | set | messages: + msg142833
2011-08-23 15:15:01 | skrah | set | messages: + msg142832
2011-08-23 14:28:06 | skrah | set | messages: + msg142830
2011-08-23 14:06:34 | pitrou | set | messages: + msg142828
2011-08-23 13:51:58 | skrah | set | messages: + msg142826
2011-08-23 13:24:17 | pitrou | set | messages: + msg142821
2011-08-23 13:10:40 | skrah | set | nosy: + skrah; messages: + msg142820
2011-06-20 18:35:46 | jcon | set | nosy: + jcon
2011-05-14 15:47:51 | mark.dickinson | set | messages: + msg135976
2011-05-14 15:47:14 | mark.dickinson | set | assignee: mark.dickinson ->
2011-05-09 15:35:32 | vstinner | set | nosy: + vstinner; messages: + msg135601
2011-05-09 15:32:23 | pitrou | set | stage: patch review -> needs patch
2011-05-09 15:32:17 | pitrou | set | stage: test needed -> patch review; messages: + msg135600; versions: + Python 3.3, - Python 3.2
2011-02-13 13:07:51 | ncoghlan | set | nosy: gregory.p.smith, teoliphant, mark.dickinson, ncoghlan, pitrou, xuanji; messages: + msg128488
2011-02-13 12:09:25 | xuanji | set | nosy: gregory.p.smith, teoliphant, mark.dickinson, ncoghlan, pitrou, xuanji; messages: + msg128486
2011-02-13 11:53:28 | xuanji | set | nosy: + xuanji
2011-01-04 01:44:06 | pitrou | set | assignee: mark.dickinson; nosy: + mark.dickinson
2010-08-09 03:19:09 | terry.reedy | set | stage: test needed; versions: + Python 3.2, - Python 3.1
2009-02-12 23:53:36 | gregory.p.smith | set | messages: + msg81839
2009-02-12 21:47:20 | pitrou | set | messages: + msg81824
2009-02-12 21:39:02 | pitrou | create |