classification
Title: PyMemoryView_FromObject alters the Py_buffer after calling PyObject_GetBuffer when ndim 1
Type: crash Stage: resolved
Components: Interpreter Core Versions: Python 3.1, Python 3.2, Python 2.7
process
Status: closed Resolution: duplicate
Dependencies: Superseder: Problems with Py_buffer management in memoryobject.c (and elsewhere?)
View: 10181
Assigned To: Nosy List: kermode, mark.dickinson, ncoghlan, pitrou, python-dev, skrah, teoliphant
Priority: normal Keywords: patch

Created on 2010-09-29 19:03 by kermode, last changed 2012-03-02 07:48 by skrah. This issue is now closed.

Files
File name Uploaded Description Edit
bufrel.c.gz kermode, 2010-09-29 19:03 Embedded Python 3.2 program that calls PyMemoryVeiw_FromObject
memview.patch pitrou, 2010-10-01 18:57
Messages (16)
msg117644 - (view) Author: Lenard Lindstrom (kermode) Date: 2010-09-29 19:03
If an exporter returns a Py_buffer with ndim 1, PyMemoryView_FromObject changes the shape and strides pointer fields to point to a local Py_buffer array field. This array field is undocumented. Any heap memory these pointers reference is lost. Should the exporter's bf_releasebuffer later try and free the memory, the Python interpreter may segfault.

Attached is a demonstration program. Its output is:

Accessing buffer directly...
Accessing buffer through a memory view...
* View->shape has changed.
Done.

where the third line shows bf_releasebuffer has detected a changed pointer.
msg117826 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-10-01 18:53
Here is a patch that fixes the issue. Can you try it?

Unfortunately, more advanced uses such a slicing the memoryview are still crashing. That's because the new buffer protocol doesn't define ownership of Py_buffer structs. As a result, nothing can be assumed at to which piece of code is responsible for allocation and deallocation of related memory areas (such as shapes and strides arrays).
msg117827 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-10-01 18:57
Of course, a patch is always better without the debugging prints :)
msg117855 - (view) Author: Lenard Lindstrom (kermode) Date: 2010-10-02 00:12
Applied patch to:
Python 3.2a2+ (py3k:85150M, Oct  1 2010, 14:40:33) 
[GCC 4.4.5 20100728 (prerelease)] on linux2

Python unit test test_capi.py crashes:

internal test_broken_memoryview
* ob
object  : <refcnt 0 at 0xb7171178>
type    : str
refcount: 0
address : 0xb7171178
* op->_ob_prev->_ob_next
object  : <refcnt 0 at 0xb7171178>
type    : str
refcount: 0
address : 0xb7171178
* op->_ob_next->_ob_prev
object  : Segmentation fault

Pygame unit tests pass (segfaults without the patch).

bufrel.c test passes.

numpy 1.5.0 unit tests not run since they rely on a package that needs porting to Python 3.x. A memory view is used to manage an object whose buffer a numpy array exposes. This was where the Pygame unit test seqfault occurred.

The patch fixes the problem with Pygame.

Thanks.
msg124167 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-12-17 00:19
> That's because the new buffer protocol doesn't define ownership of
> Py_buffer structs. As a result, nothing can be assumed at to which
> piece of code is responsible for allocation and deallocation of
> related memory areas (such as shapes and strides arrays).

I was just chatting to Travis about this;  he suggested that the 'internal' field of the Py_buffer struct might be used to record who's responsible for deallocation of shape and stride arrays.
msg124182 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2010-12-17 03:56
The PEP is quite clear that the object providing the buffer owns those fields. Last time I checked, the memoryview implementation appeared broken because it didn't respect that ownership (and created the potential for confusion on the topic).
msg124286 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-12-18 15:17
I'll take a look at this.
msg124502 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-12-22 13:04
Antoine, a couple of questions:

(1) Is there documentation for the 'smalltable' field of the Py_buffer struct anywhere?  What are the requirements for the exporter here?  E.g., is it / should it be a requirement that shape, strides and suboffsets are all NULL whenever 'smalltable' is used?

(2) Same question for the 'obj' field:  what's the purpose of this field, from the POV of 3rd party buffer exporters?
msg124504 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-12-22 13:11
> (1) Is there documentation for the 'smalltable' field of the Py_buffer
> struct anywhere?  What are the requirements for the exporter here?

No, and no particular requirements AFAIR. It is a piece of internal
storage aimed at avoiding the nagging allocation and ownership problem
(but only so when this storage is large enough, of course).

> E.g., is it / should it be a requirement that shape, strides and
> suboffsets are all NULL whenever 'smalltable' is used?

Not at all. On the contrary, smalltable can be used as a piece of
storage for some of these fields.

> (2) Same question for the 'obj' field:  what's the purpose of this
> field, from the POV of 3rd party buffer exporters?

None, it's used by the consumer side of the API, to know which object
exported the buffer and to keep a reference to it so that it doesn't get
deallocated early.
msg125636 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2011-01-07 09:39
As per the discussion over in issue #10181, I've changed my position on this issue.

Since the PEP isn't explicit on the exact semantics here, I think we should be guided by the memoryview behaviour and make it official that bf_releasebuffer implementations shouldn't trust the contents of the passed in Py_buffer struct. Instead, if that information is important, the exporter should create an internal data structure that preserves the necessary detail and merely use the Py_buffer address as an identifier to locate the internal copy.
msg125638 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2011-01-07 10:01
The alternative (if we declare that clients should treat Py_buffer contents as completely read-only) is to update memoryview to include a separate "orig_view" field that would never be touched. The GetBuffer and ReleaseBuffer calls would then use orig_view, with dup_buffer used to copy the data into the main view field before modifying it.

However, this approach greatly complicates the bf_getbuffer and bf_releasebuffer implementations, since memoryview could no longer pass the supplied Py_buffer pointer straight through to the underlying implementation.

Instead, for each call to bf_getbuffer, it would need to create a new Py_buffer struct, use that for the GetBuffer call to the underlying object, copy the contents over to the passed in buffer (modifying the shape information as appropriate for any slicing that is in effect), then, in bf_releasebuffer, use the passed in pointer to find the right Py_buffer struct to use in the ReleaseBuffer call.

Putting the onus on exporters to be suspicious of the contents of the Py_buffer objects handed to bf_releasebuffer implementations actually seems like the more robust approach in the long run.
msg128163 - (view) Author: Lenard Lindstrom (kermode) Date: 2011-02-08 00:15
I think only a simple solution is needed. From my experience adding the 
new buffer protocol to pygame.mixer.Sound it would be easy enough for 
bf_releasebuffer to use the "internal" field to free memory allocated by 
bf_getbuffer. As long as this pointer is preserved it would not matter 
if Py_buffer is copied or the "shape" and "strides" pointers change. 
Just ensure Py_buffer is clearly documented.
msg141859 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2011-08-10 12:31
This should be fixed with the ManagedBuffer implementation from #10181.
msg152996 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2012-02-09 22:42
I ran the demo in the pep-3118 repo, and it is fixed (see #10181):

$ ./bufrel 
Accessing buffer directly...
Accessing buffer through a memory view...
Done.
msg154243 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2012-02-25 11:25
New changeset 3f9b3b6f7ff0 by Stefan Krah in branch 'default':
- Issue #10181: New memoryview implementation fixes multiple ownership
http://hg.python.org/cpython/rev/3f9b3b6f7ff0
msg154746 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2012-03-02 07:48
Since this issue targeted 2.7 and 3.2:

In a brief discussion on python-dev it was decided that the 3.3 fixes
from #10181 won't be backported for a number of reasons, see:

http://mail.python.org/pipermail/python-dev/2012-February/116872.html
History
Date User Action Args
2012-03-02 07:48:38skrahsetmessages: + msg154746
2012-02-25 11:25:31python-devsetnosy: + python-dev
messages: + msg154243
2012-02-09 22:42:30skrahsetstatus: open -> closed
superseder: Problems with Py_buffer management in memoryobject.c (and elsewhere?)
messages: + msg152996

dependencies: - Problems with Py_buffer management in memoryobject.c (and elsewhere?)
resolution: duplicate
stage: patch review -> resolved
2011-08-10 12:31:52skrahsetnosy: + skrah
dependencies: + Problems with Py_buffer management in memoryobject.c (and elsewhere?)
messages: + msg141859
2011-06-25 09:46:07mark.dickinsonsetassignee: mark.dickinson ->
2011-02-08 00:15:49kermodesetnosy: teoliphant, mark.dickinson, ncoghlan, kermode, pitrou
messages: + msg128163
2011-01-07 10:01:41ncoghlansetnosy: teoliphant, mark.dickinson, ncoghlan, kermode, pitrou
messages: + msg125638
2011-01-07 09:39:47ncoghlansetnosy: teoliphant, mark.dickinson, ncoghlan, kermode, pitrou
messages: + msg125636
2010-12-22 13:11:50pitrousetnosy: teoliphant, mark.dickinson, ncoghlan, kermode, pitrou
messages: + msg124504
2010-12-22 13:04:28mark.dickinsonsetnosy: teoliphant, mark.dickinson, ncoghlan, kermode, pitrou
messages: + msg124502
2010-12-18 15:17:22mark.dickinsonsetassignee: mark.dickinson
messages: + msg124286
nosy: teoliphant, mark.dickinson, ncoghlan, kermode, pitrou
2010-12-17 03:56:10ncoghlansetnosy: teoliphant, mark.dickinson, ncoghlan, kermode, pitrou
messages: + msg124182
2010-12-17 00:19:06mark.dickinsonsetnosy: teoliphant, mark.dickinson, ncoghlan, kermode, pitrou
messages: + msg124167
2010-12-15 20:36:43pitrousetassignee: pitrou -> (no value)

nosy: + mark.dickinson
2010-10-02 00:12:20kermodesetmessages: + msg117855
2010-10-01 18:57:26pitrousetfiles: - memview.patch
2010-10-01 18:57:19pitrousetfiles: + memview.patch

messages: + msg117827
2010-10-01 18:53:41pitrousetfiles: + memview.patch

nosy: + teoliphant, ncoghlan
messages: + msg117826

keywords: + patch
stage: needs patch -> patch review
2010-09-29 19:20:38pitrousetassignee: pitrou
stage: needs patch

nosy: + pitrou
versions: + Python 3.1, Python 2.7
2010-09-29 19:03:36kermodecreate