This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python Developer Guide.

classification
Title: memoryview bind (the opposite of release)
Type: enhancement Stage:
Components: Versions: Python 3.5
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: mpb, pitrou, vstinner
Priority: normal Keywords:

Created on 2013-11-14 00:39 by mpb, last changed 2022-04-11 14:57 by admin.

Messages (7)
msg202806 - Author: mpb (mpb) Date: 2013-11-14 00:39
I'm writing Python code to parse binary (byte oriented) data.

I am (at least somewhat) aware of the performance implications of various approaches to doing the parsing.  With performance in mind, I would like to avoid unnecessary creation/destruction/copying of memory/objects.

An example:

Let's say I am parsing b'0123456789'.
I want to extract and return the substring b'234'.

Now let's say I do this with memoryviews, to avoid unnecessary creation and copying of memory.

m0 = memoryview (b'0123456789')
m1 = m0[2:5]    # m1 == b'234'

Let's say I do this 1000 times.  Each time I use readinto to load the next data into m0.  So I can create m0 only once and reuse it.

But if the relative position of m1 inside m0 changes with each parse, then I need to create a new m1 for each parse.
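A minimal sketch of the pattern described so far, under some assumptions that are not in the original post: records have a fixed length (`RECORD_SIZE`), the buffer is a `bytearray` (a view over a `bytes` object is read-only, so `readinto` could not fill it), and `locate` is a hypothetical callback that returns the field's position within the current record. It shows that `m0` is created once, while each slice creates a new memoryview object without copying the underlying bytes.

```python
import io

RECORD_SIZE = 10  # assumed fixed record length, for illustration only


def iter_fields(stream, locate):
    """Yield one field view per record.

    `locate` is a hypothetical callback: given the record view, it
    returns the (start, stop) offsets of the field inside that record.
    """
    buf = bytearray(RECORD_SIZE)   # writable, so readinto() can fill it
    m0 = memoryview(buf)           # created once and reused for every record
    while stream.readinto(m0) == RECORD_SIZE:
        start, stop = locate(m0)
        m1 = m0[start:stop]        # new memoryview object, no data copied
        # The view is only valid until the next readinto() overwrites buf.
        yield m1


# Usage: the first byte of each record is assumed to hold the field offset,
# and the field is assumed to be three bytes long.
data = bytes([2]) + b'123456789' + bytes([5]) + b'abcdefghi'
fields = [bytes(f) for f in iter_fields(io.BytesIO(data),
                                        lambda m: (m[0], m[0] + 3))]
print(fields)
```

The per-record `m0[start:stop]` is the allocation the proposal is trying to avoid.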

In the context of the above example, I think it might be nice if I could rebind an existing memoryview to a new object.  For example:

m0 = memoryview (b'0123456789')
m1.bind (m0, 2, 5)    # m1 == b'234'

Is this an idea worth considering?

(Possibly related: Issue 9789, 9757, 3506; PEP 3118)
msg202816 - Author: Antoine Pitrou (pitrou) (Python committer) Date: 2013-11-14 01:41
> In the context of the above example, I think it might be nice if I
> could rebind an existing memoryview to a new object.

It would be nice how so? Can you try to estimate the speed gain?
msg202819 - Author: mpb (mpb) Date: 2013-11-14 06:27
It would be nice in terms of avoiding malloc()s and free()s.

I could estimate it in terms of memoryview creations per message parse.  I'll be creating 10-20 memoryviews to parse each ~100 byte message.

So... I guess I'd have to build a test to see how long a memoryview creation/free takes.  And then perhaps compare it with variable to variable assignment instead.
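One way such a test might look, as a rough sketch rather than a rigorous benchmark: it uses the standard `timeit` module to compare creating a sliced memoryview against a plain name-to-name assignment. The buffer size, slice bounds, and iteration count are arbitrary choices for illustration, and results will vary by machine and Python version.

```python
import timeit

buf = bytearray(100)          # stand-in for a ~100 byte message
m0 = memoryview(buf)
n = 200_000                   # arbitrary iteration count

# Cost of creating (and discarding) a memoryview slice each iteration.
t_slice = timeit.timeit("m1 = m0[2:5]", globals={"m0": m0}, number=n)

# Baseline: plain variable-to-variable assignment, no new object created.
t_assign = timeit.timeit("m1 = m0", globals={"m0": m0}, number=n)

print(f"slice:  {t_slice / n * 1e9:.1f} ns per iteration")
print(f"assign: {t_assign / n * 1e9:.1f} ns per iteration")
```

The difference between the two figures gives an upper bound on what rebinding could save per view, since a rebind would still have to do some work.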

If Python pools and recycles unused objects by type (the way Lisp recycles cons cells) without free()ing them back to the heap, then there would be minimal speed improvement from my suggestion.  I don't know how CPython works internally, however.
msg202831 - Author: Stefan Krah (skrah) (Python committer) Date: 2013-11-14 12:02
-1 on complicating the code further. It would be possible to pass
an existing memoryview to mbuf_add_view(). That would save the line

   mv = memory_alloc().

But:

  a) You need to check that ndim is correct (shape, strides and
     suboffsets are allocated via the struct hack).

  b) You need to check for existing exports of the memoryview.

  c) ... probably other things that would surface on closer examination.
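The two concerns above refer to CPython's C implementation, but both have visible counterparts at the Python level. The following sketch is only an illustration of those counterparts, not of the C-level checks themselves: views over the same buffer can differ in `ndim`, `shape`, and `strides`, and an object with live buffer exports refuses operations that would invalidate them.

```python
buf = bytearray(range(12))
flat = memoryview(buf)                    # 1-dimensional view
grid = flat.cast('B', shape=[3, 4])       # 2-dimensional view, same memory

# Views over one buffer can carry different dimensional metadata.
flat_info = (flat.ndim, flat.shape, flat.strides)
grid_info = (grid.ndim, grid.shape, grid.strides)
print(flat_info)
print(grid_info)

# While views exist, the bytearray has outstanding exports and
# cannot be resized.
try:
    buf.append(0)
    resized_while_exported = True
except BufferError:
    resized_while_exported = False
print(resized_while_exported)

# Once every view is released, resizing works again.
grid.release()
flat.release()
buf.append(0)
print(len(buf))
```

A rebind operation would have to cope with both situations: a target whose dimensionality differs from the view being reused, and a view that other consumers still depend on.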
msg202834 - Author: Stefan Krah (skrah) (Python committer) Date: 2013-11-14 12:12
You could experiment with multiple freelists, one for each ndim.
I'm skeptical however that the gain will be substantial. I've tried
freelists for _decimal and the gain was on the order of 2% (not worth
it).
msg202836 - Author: Antoine Pitrou (pitrou) (Python committer) Date: 2013-11-14 12:13
Yes, I also doubt this would actually bring anything significant, which is why I asked for numbers :-) Python creates many objects in a very intensive fashion, and memoryviews are not, by far, the most common object type.
msg202951 - Author: Stefan Krah (skrah) (Python committer) Date: 2013-11-15 14:00
To my surprise, this line is 10% faster with a freelist:

./python -m timeit -s "import array; a = array.array('B', [0]*100); m = memoryview(a)" "m[30:40]"

I think the reason is that PyObject_GC_NewVar() is quite slow.
History
Date                 User      Action  Args
2022-04-11 14:57:53  admin     set     github: 63776
2014-10-14 17:20:49  skrah     set     nosy: - skrah
2013-11-15 14:00:31  skrah     set     messages: + msg202951
2013-11-14 13:36:17  vstinner  set     nosy: + vstinner
2013-11-14 12:13:38  pitrou    set     messages: + msg202836
2013-11-14 12:12:08  skrah     set     messages: + msg202834
2013-11-14 12:02:23  skrah     set     nosy: + skrah; messages: + msg202831
2013-11-14 06:27:08  mpb       set     messages: + msg202819
2013-11-14 01:41:39  pitrou    set     nosy: + pitrou; messages: + msg202816
2013-11-14 00:39:17  mpb       create