classification
Title: segfault due to null pointer in tuple
Type: crash Stage: resolved
Components: Interpreter Core Versions: Python 3.6
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: serhiy.storchaka Nosy List: brett.cannon, eric.snow, larry, ncoghlan, python-dev, random832, rhettinger, serhiy.storchaka, vstinner, xiang.zhang
Priority: high Keywords: patch

Created on 2016-04-21 00:20 by random832, last changed 2017-04-24 18:46 by serhiy.storchaka. This issue is now closed.

Files
File name Uploaded Description Edit
fnt.py random832, 2016-04-21 00:20 Script to find the bad tuple and try to access it
property_descr_get_gc_untrack.patch vstinner, 2016-04-21 07:51 review
property_cached_args_set_size_0.patch serhiy.storchaka, 2016-04-21 08:00 review
property_descr_get_gc_untrack_2.patch serhiy.storchaka, 2016-04-23 09:35 review
Pull Requests
URL Status Linked Edit
PR 1272 open serhiy.storchaka, 2017-04-24 18:46
Messages (16)
msg263866 - (view) Author: (random832) Date: 2016-04-21 00:20
I was writing something that iterates over all objects in gc.get_objects(). I don't know precisely where the offending tuple is coming from, but nearby objects in gc.get_objects() appear to be related to frozen_importlib.

A tuple exists somewhere with a null pointer in it. Attempting to iterate over it or access its item results in a segmentation fault.

I confirmed this on both Linux and OSX.

Python 3.6.0a0 (default:778ccbe3cf74, Apr 20 2016, 20:17:38)
msg263882 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-04-21 05:51
hg bisect tells me changeset 95830:661cdbd617b8 introduces this behaviour.
msg263885 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2016-04-21 06:27
Also, see https://hg.python.org/cpython/rev/5dbf3d932a59

Serhiy, I'll thinking the whole optimization ought to be reverted as being too tricky and error-prone for the small benefit.  A less aggressive approach would be to have the tuple save a valid object (such as None) instead of NULL.  What do you think?
msg263886 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-04-21 06:27
661cdbd617b8 can cause problems with recursion and with cached getters. I suggest to revert it.
msg263887 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2016-04-21 06:32
I concur.  Please do the honors and revert it from 3.5 and 3.6.
msg263888 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-04-21 06:40
Ah, yes, 5dbf3d932a59 partially fixed this optimization. Saving None in a tuple (as I believe done in itertools/zip/enumerate optimization) should fix this issue. I agree that the code is too tricky (as well as the code in itertools/zip/enumerate). It would be better to make tuple creation faster instead of adding special caches in separate functions (see also issue23507). But for now this optimization makes attribute access in named tuples 30% faster, and this is good argument for keeping it.
msg263889 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-04-21 06:44
Sorry, I'm late with my comments. All my posts are written without seeing your previous message.
msg263892 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-04-21 07:51
Serhiy: "Ah, yes, 5dbf3d932a59 partially fixed this optimization."

Since we already talking about hacks, another hack is to hide this private tuple from the GC using _PyObject_GC_UNTRACK(): see attached property_descr_get_gc_untrack.patch.

Note: I dislike my own patch :-D


Serhiy: "661cdbd617b8 can cause problems with recursion and with cached getters. I suggest to revert it."

Even if I don't understand the issue, I concur that the micro-optimization must be reverted. A micro-optimization must not introduce any behaviour change nor segfault :-)


Raymond Hettinger: "Serhiy, I'll thinking the whole optimization ought to be reverted as being too tricky and error-prone for the small benefit.  A less aggressive approach would be to have the tuple save a valid object (such as None) instead of NULL.  What do you think?"

PyEval_EvalCodeEx() accepts a C array of PyObject* for function arguments, it's used by fast_function() in Pythn/ceval.c. Maybe a new C function can be designed to call a function: it would uses PyEval_EvalCodeEx() to call functions implemented in Python, or create the tuple for functions implemented in C.

It looks like the issue #23910 which introduced the optimization was created for functions implemented in C. So I don't know it's acceptable. But it would be safer and more generic, no?
msg263893 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-04-21 07:59
> Maybe a new C function can be designed to call a function: it would uses PyEval_EvalCodeEx() to call functions implemented in Python, or create the tuple for functions implemented in C.

I would expect C code to be more optimized that Python functions, but for the specific case of function calls: Python are more optimized with the fast-path fast_function()!

I recall vaguely that Larry Hasting shared me his idea of passing the Python stack to C functions to avoid the creation of a tuple.

Maybe we can add a new API for C functions accepted a C array of PyObject*, a number of positional arguments and a number of (key, value) pairs for keyword arguments. Something similar to PyEval_EvalCodeEx() but for C functions. It means that we should add a new flavor of PyArg_ParseTuple/PyArg_ParseTupleAndKeywords to accepts this format.

Larry's idea was to use the argument clinic to handle that for us.
msg263894 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-04-21 08:00
References:

Issue23910 -- added original optimization (661cdbd617b8).
Issue24276 -- it was significant rewritten (5dbf3d932a59).

My suggestion in msg263886 to revert the optimization was related to the original code. Now, after reminding the background, I think it is worth to keep the optimization and try to fix the code.

Making the cached tuple to save None adds additional complexity and overhead (about 5%). Here is a patch that uses different trick. The size of saved tuple is set to 0.
msg263895 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-04-21 08:06
I suggest to remove the micro-optimization from Python 3.5 for safety.

I'm ok to experiment a new safer implementation on Python 3.6 ;-) We have more time to fix the code in Python 3.6 if new issues are found. Setting the tuple size to zero looks simple and safe, but the overall hack deserves a comment to explain:

* why you use a cached tuple
* why the reference count can be different than 2: recursive calls
* why do you change the tuple size
msg263900 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-04-21 08:57
I created the issue #26814: "Add a new _PyObject_CallStack() function which avoids the creation of a tuple or dict for arguments".
msg263903 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-04-21 09:28
> I suggest to remove the micro-optimization from Python 3.5 for safety.

Then we should remove the promise from What's New In Python 3.5. That will cause a regression in namedtuple attributes access about 30%.
msg263904 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-04-21 09:41
Oh, I see "The property() getter calls are up to 25% faster.
(Contributed by Joe Jevnik in issue 23910.)" in
https://docs.python.org/dev/whatsnew/3.5.html#optimizations

Hum... This is embarrassing :-/ Ok, let's keep it, but we must fix it ;-)
msg264060 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-04-23 09:35
May be using _PyObject_GC_UNTRACK() is more correct than setting the size to 0 or setting the item to None. But property_descr_get_gc_untrack.patch makes the code a little slower. Here is optimized version of Victor's patch.
msg264846 - (view) Author: Roundup Robot (python-dev) Date: 2016-05-04 18:43
New changeset a98ef122d73d by Serhiy Storchaka in branch '3.5':
Issue #26811: gc.get_objects() no longer contains a broken tuple with NULL
https://hg.python.org/cpython/rev/a98ef122d73d

New changeset 3fe1c7ad3b58 by Serhiy Storchaka in branch 'default':
Issue #26811: gc.get_objects() no longer contains a broken tuple with NULL
https://hg.python.org/cpython/rev/3fe1c7ad3b58
History
Date User Action Args
2017-04-24 18:46:48serhiy.storchakasetpull_requests: + pull_request1384
2016-05-05 13:22:36serhiy.storchakasetstatus: open -> closed
resolution: fixed
stage: patch review -> resolved
2016-05-04 18:43:53python-devsetnosy: + python-dev
messages: + msg264846
2016-04-23 09:35:28serhiy.storchakasetfiles: + property_descr_get_gc_untrack_2.patch

messages: + msg264060
2016-04-21 09:41:19vstinnersetmessages: + msg263904
2016-04-21 09:28:27serhiy.storchakasetmessages: + msg263903
2016-04-21 08:57:45vstinnersetmessages: + msg263900
2016-04-21 08:06:22vstinnersetmessages: + msg263895
2016-04-21 08:00:22serhiy.storchakasetfiles: + property_cached_args_set_size_0.patch
type: crash
messages: + msg263894

stage: patch review
2016-04-21 07:59:08vstinnersetnosy: + larry
messages: + msg263893
2016-04-21 07:51:36vstinnersetfiles: + property_descr_get_gc_untrack.patch

nosy: + vstinner
messages: + msg263892

keywords: + patch
2016-04-21 06:44:02serhiy.storchakasetmessages: + msg263889
2016-04-21 06:40:24serhiy.storchakasetmessages: + msg263888
2016-04-21 06:32:25rhettingersetpriority: normal -> high
2016-04-21 06:32:14rhettingersetassignee: rhettinger -> serhiy.storchaka
messages: + msg263887
2016-04-21 06:27:45serhiy.storchakasetmessages: + msg263886
2016-04-21 06:27:14rhettingersetmessages: + msg263885
2016-04-21 06:04:00rhettingersetassignee: rhettinger

nosy: + rhettinger
2016-04-21 05:51:55xiang.zhangsettype: crash -> (no value)

messages: + msg263882
nosy: + xiang.zhang
2016-04-21 05:24:46serhiy.storchakasetnosy: + brett.cannon, ncoghlan, eric.snow, serhiy.storchaka
type: crash
2016-04-21 00:20:15random832create