classification
Title: PyObjects initialized with PyObject_New have uninitialized pointers to set to 0x1
Type: crash Stage: resolved
Components: Extension Modules Versions: Python 3.8, Python 3.7
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: Nosy List: benjamin.peterson, fim
Priority: normal Keywords:

Created on 2018-11-04 11:52 by fim, last changed 2018-11-04 18:24 by fim. This issue is now closed.

Messages (5)
msg329224 - (view) Author: Serafeim Mellos (fim) Date: 2018-11-04 11:52
I looked but I couldn't find any mention of this either in the bug tracker or the docs so I assume this is not a desired behavior or a known proble but feel free to correct me if I'm wrong.

I have noticed a weird behavior in Python3.7 onwards with the way PyObject_New() works. I'm not sure what is the exact cause for the issue but even though the documentation mentions that it should initialize all fields described in the associated PyTypeObject, I think there's a problem that somehow causes uninitialized pointers to get initialized with strange values that break other functionality (eg Py_XDECREF).

In more detail, it seems like uninitialized pointers in PyObjects get initialized to 0x1 which can lead to SEGFAULTs when calling Py_XDECREF() on them since they are no longer valid pointers.

I have taken the example extension from the python docs (https://docs.python.org/3/extending/newtypes_tutorial.html) and modified lightly in order to surface the issue. You can find the sample extension in my github: https://github.com/fim/python_pyobjectnew_example

I have checked versions 3.5, 3,6 which are fine but 3.7.0, 3.7.1 and 3.8 seem to reproduce this issue.

You can see a simple transcript of how the issue manifests below:

$ python setup.py build
[...]
$ pip install . 
[...]
$ python -c 'import custom; custom.Custom()'
weird pointer has value of: 0x8ec480
deallocating weird pointer
$ python
Python 3.7.1 (default, Nov  3 2018, 09:33:27) 
[GCC 5.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import custom, gc
>>> custom.Custom()
weird pointer has value of: 0x1
<custom2.Custom object at 0x7f1a2b8e4ed0>
>>> gc.collect()
deallocating weird pointer
zsh: segmentation fault  python
msg329229 - (view) Author: Serafeim Mellos (fim) Date: 2018-11-04 14:23
Backtrace from segfault is below but I'm not very familiar with python's memory internals in order to figure out where is the value initialized at the first place 

(gdb) bt
#0  _Py_XDECREF (op=0x1) at /home/fim/.pyenv/versions/3.8-dev/include/python3.8m/object.h:896
#1  Custom_dealloc (self=0x7ffff7ec0660) at custom.c:20
#2  0x00000000004685f2 in _Py_Dealloc_inline (op=<optimized out>) at ./Include/object.h:806
#3  _Py_DECREF (filename=0x5f7460 "Objects/dictobject.c", lineno=1508, op=<optimized out>) at ./Include/object.h:833
#4  delitem_common (old_value=<optimized out>, ix=<optimized out>, hash=-2939463505650448616, mp=0x7ffff7f38d80) at Objects/dictobject.c:1508
#5  _PyDict_DelItem_KnownHash (hash=-2939463505650448616, key=0x7ffff7f39b90, op=0x7ffff7f38d80) at Objects/dictobject.c:1560
#6  PyDict_DelItem (op=0x7ffff7f38d80, key=0x7ffff7f39b90) at Objects/dictobject.c:1526
#7  0x0000000000468aa5 in dict_ass_sub (mp=0x7ffff7f38d80, v=0x7ffff7f39b90, w=<optimized out>) at Objects/dictobject.c:2058
#8  0x00000000004262e3 in _PyEval_EvalFrameDefault (f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:2044
#9  0x00000000004e3ac0 in PyEval_EvalFrameEx (throwflag=0, f=0x7ffff7f5a970) at Python/ceval.c:531
#10 _PyEval_EvalCodeWithName (_co=_co@entry=0x7ffff7e97e40, globals=globals@entry=0x7ffff7f38d80, locals=locals@entry=0x7ffff7f38d80, args=args@entry=0x0, argcount=argcount@entry=0, 
    kwnames=kwnames@entry=0x0, kwargs=0x0, kwcount=0, kwstep=2, defs=0x0, defcount=0, kwdefs=0x0, closure=0x0, name=0x0, qualname=0x0) at Python/ceval.c:3946
#11 0x00000000004e3bb3 in PyEval_EvalCodeEx (closure=0x0, kwdefs=0x0, defcount=0, defs=0x0, kwcount=0, kws=0x0, argcount=0, args=0x0, locals=locals@entry=0x7ffff7f38d80, 
    globals=globals@entry=0x7ffff7f38d80, _co=_co@entry=0x7ffff7e97e40) at Python/ceval.c:3975
#12 PyEval_EvalCode (co=co@entry=0x7ffff7e97e40, globals=globals@entry=0x7ffff7f38d80, locals=locals@entry=0x7ffff7f38d80) at Python/ceval.c:508
#13 0x000000000052009f in run_mod (arena=0x7ffff7fb81e0, flags=0x7fffffffd5e0, locals=0x7ffff7f38d80, globals=0x7ffff7f38d80, filename=0x7ffff7e510d8, mod=0x9a66c0)
    at Python/pythonrun.c:1031
#14 PyRun_InteractiveOneObjectEx (fp=fp@entry=0x7ffff749b8a0 <_IO_2_1_stdin_>, filename=filename@entry=0x7ffff7e510d8, flags=flags@entry=0x7fffffffd5e0) at Python/pythonrun.c:256
#15 0x0000000000520386 in PyRun_InteractiveLoopFlags (fp=fp@entry=0x7ffff749b8a0 <_IO_2_1_stdin_>, filename_str=filename_str@entry=0x5f13a5 "<stdin>", flags=flags@entry=0x7fffffffd5e0)
    at Python/pythonrun.c:120
#16 0x00000000005204fc in PyRun_AnyFileExFlags (fp=0x7ffff749b8a0 <_IO_2_1_stdin_>, filename=0x5f13a5 "<stdin>", closeit=0, flags=0x7fffffffd5e0) at Python/pythonrun.c:78
#17 0x000000000042e0fe in pymain_run_stdin (pymain=0x7fffffffd6f0, pymain=0x7fffffffd6f0, cf=0x7fffffffd5e0, config=0x922ea8) at Modules/main.c:1185
#18 pymain_run_python (interp=0x922e20, pymain=0x7fffffffd6f0) at Modules/main.c:1613
#19 pymain_main (pymain=0x7fffffffd6f0) at Modules/main.c:1758
#20 0x000000000042e349 in _Py_UnixMain (argc=<optimized out>, argv=<optimized out>) at Modules/main.c:1795
#21 0x00007ffff711b4db in __libc_start_main () from /lib64/libc.so.6
#22 0x000000000042aeaa in _start ()
msg329241 - (view) Author: Serafeim Mellos (fim) Date: 2018-11-04 17:27
Actually, I'm not so sure it's related to memory management after all judging by this (but I don't really know how the compilation part is supposed to work :)):


(gdb) watch *0x7ffff7ec0680
Hardware watchpoint 1: *0x7ffff7ec0680
(gdb) b Custom_dealloc
Breakpoint 2 at 0x7fffef9c4c50: file custom.c, line 16.
(gdb) run
[...]
Hardware watchpoint 1: *0x7ffff7ec0680

Old value = 0
New value = 1
dfs (c=c@entry=0x7fffffffd440, b=b@entry=0x7ffff7ec0660, a=a@entry=0x7fffffffd3b0, end=2) at Python/compile.c:4974
4974	        a->a_postorder[--j] = b;
(gdb) bt
#0  dfs (c=c@entry=0x7fffffffd440, b=b@entry=0x7ffff7ec0660, a=a@entry=0x7fffffffd3b0, end=2) at Python/compile.c:4974
#1  0x00000000004e9e86 in assemble (c=c@entry=0x7fffffffd440, addNone=<optimized out>) at Python/compile.c:5530
#2  0x00000000004f45c6 in compiler_mod (mod=0x9a86a8, c=0x7fffffffd440) at Python/compile.c:1641
#3  PyAST_CompileObject (mod=mod@entry=0x9a86a8, filename=filename@entry=0x7ffff7e510d8, flags=flags@entry=0x7fffffffd5e0, optimize=optimize@entry=-1, arena=arena@entry=0x7ffff7fb81e0)
    at Python/compile.c:345
#4  0x0000000000520025 in run_mod (arena=0x7ffff7fb81e0, flags=0x7fffffffd5e0, locals=0x7ffff7f38d80, globals=0x7ffff7f38d80, filename=0x7ffff7e510d8, mod=0x9a86a8)
    at Python/pythonrun.c:1028
#5  PyRun_InteractiveOneObjectEx (fp=fp@entry=0x7ffff749b8a0 <_IO_2_1_stdin_>, filename=filename@entry=0x7ffff7e510d8, flags=flags@entry=0x7fffffffd5e0) at Python/pythonrun.c:256
#6  0x0000000000520326 in PyRun_InteractiveLoopFlags (fp=fp@entry=0x7ffff749b8a0 <_IO_2_1_stdin_>, filename_str=filename_str@entry=0x5f1345 "<stdin>", flags=flags@entry=0x7fffffffd5e0)
    at Python/pythonrun.c:120
#7  0x000000000052049c in PyRun_AnyFileExFlags (fp=0x7ffff749b8a0 <_IO_2_1_stdin_>, filename=0x5f1345 "<stdin>", closeit=0, flags=0x7fffffffd5e0) at Python/pythonrun.c:78
#8  0x000000000042e0be in pymain_run_stdin (pymain=0x7fffffffd6f0, pymain=0x7fffffffd6f0, cf=0x7fffffffd5e0, config=0x922ea8) at Modules/main.c:1185
#9  pymain_run_python (interp=0x922e20, pymain=0x7fffffffd6f0) at Modules/main.c:1613
#10 pymain_main (pymain=0x7fffffffd6f0) at Modules/main.c:1758
#11 0x000000000042e309 in _Py_UnixMain (argc=<optimized out>, argv=<optimized out>) at Modules/main.c:1795
#12 0x00007ffff711b4db in __libc_start_main () from /lib64/libc.so.6
#13 0x000000000042ae6a in _start ()
(gdb) frame
#0  dfs (c=c@entry=0x7fffffffd440, b=b@entry=0x7ffff7ec0660, a=a@entry=0x7fffffffd3b0, end=2) at Python/compile.c:4974
4974	        a->a_postorder[--j] = b;
(gdb) c
Continuing.
weird pointer has value of: 0x1
>>> 

Breakpoint 2, Custom_dealloc (self=0x7ffff7ec0660) at custom.c:16
16	{
(gdb) print(self->test_field)
$1 = (PyObject *) 0x1
(gdb) print(&self->test_field)
$2 = (PyObject **) 0x7ffff7ec0680
msg329244 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2018-11-04 17:42
PyObject_New is a low-level allocation function. It doesn't initialize anything but the type pointer and ref count. If you want a field to be NULL, set it to NULL.
msg329246 - (view) Author: Serafeim Mellos (fim) Date: 2018-11-04 18:24
Ok, I understand your point about PyObject_New not initializing custom fields (although it looks like something is initializing the memory since it's not random, it's either 0x0 or 0x1).

However, if that's the expected behavior, it's not very clear why there's a difference between the default values when invoking the same commands in different ways or between different python versions, eg:

Python3.8
=========

Python 3.8.0a0 (heads/master-dirty:f98c1623ec, Nov  4 2018, 17:16:57) 
[GCC 5.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import custom; custom.Custom()
weird pointer has value of: 0x0
<custom2.Custom object at 0x7f19fd456630>
>>> import custom;                
>>> custom.Custom()
weird pointer has value of: 0x1
deallocating weird pointer
<custom2.Custom object at 0x7f19fd4566f0>
>>> 
deallocating weird pointer
zsh: segmentation fault  ../cpython/python

Python 3.6
==========

Python 3.6.5 (default, Nov  4 2018, 18:20:36) 
[GCC 5.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import custom; custom.Custom()
weird pointer has value of: 0x0
deallocating weird pointer
<custom2.Custom object at 0x7f3d627d6b40>
>>> import custom  
>>> custom.Custom()
weird pointer has value of: 0x0
deallocating weird pointer
<custom2.Custom object at 0x7f3d627d6b70>


Is there any discussions or documentation I can refer to, in order to better understand this behavior or why it was changed since python 3.7? 

Thanks!
History
Date User Action Args
2018-11-04 18:24:22fimsetmessages: + msg329246
2018-11-04 17:42:18benjamin.petersonsetstatus: open -> closed

nosy: + benjamin.peterson
messages: + msg329244

resolution: not a bug
stage: resolved
2018-11-04 17:27:16fimsetmessages: + msg329241
2018-11-04 14:23:27fimsetmessages: + msg329229
2018-11-04 11:52:05fimcreate