Issue31165
This issue tracker has been migrated to GitHub,
and is currently read-only.
For more information,
see the GitHub FAQs in the Python's Developer Guide.
Created on 2017-08-10 01:43 by geeknik, last changed 2022-04-11 14:58 by admin.
Pull Requests | |||
---|---|---|---|
URL | Status | Linked | Edit |
PR 3911 | closed | vstinner, 2017-10-06 20:09 | |
PR 3915 | open | serhiy.storchaka, 2017-10-07 14:50 |
Messages (13) | |||
---|---|---|---|
msg300033 - (view) | Author: geeknik (geeknik) | Date: 2017-08-10 01:42 | |
Python 3.7 git commit 3ca9f50 compiled with afl-clang-fast on Ubuntu 16 x64. The following script triggers undefined-behavior followed by a null pointer dereference and a segfault. import weakref class A(object):pass def callback(x):del lst[0] keepali0e=[] for i in range(1): lst=[str()] a=A() a.c=a keepali0e.append(weakref.ref(a,callback)) del a while lst:keepali0e.append(lst[:]) Objects/dictobject.c:547:12: runtime error: index 16 out of bounds for type 'int8_t [8]' SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior Objects/dictobject.c:547:12 in Objects/dictobject.c:1105:18: runtime error: index 16 out of bounds for type 'int8_t [8]' SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior Objects/dictobject.c:1105:18 in Objects/dictobject.c:2739:15: runtime error: index 16 out of bounds for type 'int8_t [8]' SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior Objects/dictobject.c:2739:15 in Objects/dictobject.c:789:27: runtime error: index 16 out of bounds for type 'int8_t [8]' SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior Objects/dictobject.c:789:27 in Objects/dictobject.c:1104:18: runtime error: index 16 out of bounds for type 'int8_t [8]' SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior Objects/dictobject.c:1104:18 in Objects/dictobject.c:994:15: runtime error: index 16 out of bounds for type 'int8_t [8]' SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior Objects/dictobject.c:994:15 in Objects/dictobject.c:683:11: runtime error: index 16 out of bounds for type 'int8_t [8]' SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior Objects/dictobject.c:683:11 in Objects/dictobject.c:1024:9: runtime error: index 64 out of bounds for type 'int8_t [8]' SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior Objects/dictobject.c:1024:9 in Objects/dictobject.c:2882:31: runtime error: index 64 out of bounds for type 'int8_t [8]' SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior Objects/dictobject.c:2882:31 in Objects/dictobject.c:2346:15: runtime error: index 128 out of bounds for type 'int8_t [8]' SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior Objects/dictobject.c:2346:15 in Objects/dictobject.c:1449:11: runtime error: index 32 out of bounds for type 'int8_t [8]' SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior Objects/dictobject.c:1449:11 in Objects/dictobject.c:744:27: runtime error: index 16 out of bounds for type 'int8_t [8]' SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior Objects/dictobject.c:744:27 in Objects/dictobject.c:1631:22: runtime error: index 16 out of bounds for type 'int8_t [8]' SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior Objects/dictobject.c:1631:22 in Objects/dictobject.c:554:31: runtime error: index 16 out of bounds for type 'int8_t [8]' SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior Objects/dictobject.c:554:31 in Objects/dictobject.c:1183:15: runtime error: index 16 out of bounds for type 'int8_t [8]' SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior Objects/dictobject.c:1183:15 in Objects/dictobject.c:835:27: runtime error: index 16 out of bounds for type 'int8_t [8]' SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior Objects/dictobject.c:835:27 in Objects/dictobject.c:2036:10: runtime error: index 128 out of bounds for type 'int8_t [8]' SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior Objects/dictobject.c:2036:10 in Objects/dictobject.c:3504:38: runtime error: index 16 out of bounds for type 'int8_t [8]' SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior Objects/dictobject.c:3504:38 in Objects/dictobject.c:3422:38: runtime error: index 64 out of bounds for type 'int8_t [8]' SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior Objects/dictobject.c:3422:38 in Objects/listobject.c:455:23: runtime error: load of null pointer of type 'PyObject *' (aka 'struct _object *') SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior Objects/listobject.c:455:23 in ASAN:DEADLYSIGNAL ================================================================= ==29900==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x0000007772df bp 0x7fffdd00ce30 sp 0x7fffdd00cde0 T0) ==29900==The signal is caused by a READ memory access. ==29900==Hint: address points to the zero page. #0 0x7772de in list_slice /root/cpython/Objects/listobject.c:455:23 #1 0x79257b in list_subscript /root/cpython/Objects/listobject.c:2499:20 #2 0xca195c in _PyEval_EvalFrameDefault /root/cpython/Python/ceval.c:1442:29 #3 0xcc723c in _PyEval_EvalCodeWithName /root/cpython/Python/ceval.c:4173:14 #4 0xc679f3 in PyEval_EvalCodeEx /root/cpython/Python/ceval.c:4200:12 #5 0xc679f3 in PyEval_EvalCode /root/cpython/Python/ceval.c:657 #6 0x53056e in run_mod /root/cpython/Python/pythonrun.c:982:9 #7 0x531d77 in PyRun_FileExFlags /root/cpython/Python/pythonrun.c:935:11 #8 0x52d219 in PyRun_SimpleFileExFlags /root/cpython/Python/pythonrun.c:398:13 #9 0x5a958e in run_file /root/cpython/Modules/main.c:341:11 #10 0x5a958e in Py_Main /root/cpython/Modules/main.c:901 #11 0x500382 in main /root/cpython/./Programs/python.c:102:11 #12 0x7f17562f83f0 in __libc_start_main /build/glibc-mXZSwJ/glibc-2.24/csu/../csu/libc-start.c:291 #13 0x433e49 in _start (/root/cpython/python+0x433e49) AddressSanitizer can not provide additional info. SUMMARY: AddressSanitizer: SEGV /root/cpython/Objects/listobject.c:455:23 in list_slice ==29900==ABORTING |
|||
msg303811 - (view) | Author: Oren Milman (Oren Milman) * | Date: 2017-10-06 09:49 | |
Here is some similar code that crashes for the same reasons: # create a circular reference with a malicious __del__(). class A: def __del__(*args): del list1[0] circ_ref_obj = A() circ_ref_obj._self = circ_ref_obj list1 = [None] list2 = [] del circ_ref_obj while len(list2) < 10000: list2.append(list1[:]) IIUC, list_slice() first finds the boundaries of the slice and its length, and then calls PyList_New(). But PyList_New() might call PyObject_GC_New(), which eventually causes a call to _PyObject_GC_Alloc(), which might call collect_generations(), which causes the malicious __del__() to run. After __del__() empties the list, list_slice() continues to run, but the list's boundaries it found earlier are now invalid, and so it tries to read the first element in the now empty list, and crashes. Maybe we should prevent collection of garbage with circular references (that has __del__() or weakref callbacks) from PyObject_GC_New()? ISTM there might be a lot of places with similar issues. (e.g. if we replace 'list2.append(list1[:])' with 'list2.append(list1[::-1])', we get a crash in list_subscript()). So i think that fixing each of them would be harder and might even introduce a regression in performance. |
|||
msg303813 - (view) | Author: Oren Milman (Oren Milman) * | Date: 2017-10-06 09:58 | |
Oh, and calls to PyObject_GC_NewVar() might also cause similar issues. |
|||
msg303830 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * | Date: 2017-10-06 15:53 | |
Oh, you are right Oren. Seems this is the only solution. |
|||
msg303849 - (view) | Author: STINNER Victor (vstinner) * | Date: 2017-10-06 20:10 | |
> Oh, you are right Oren. Seems this is the only solution. There are other solutions. I wrote PR 3911 which checks if the list size changed after PyList_New(). If it's the case, a RuntimeError exception is raised. We implemented similar checks in the dict type, if the dict is mutated during iterating on it. |
|||
msg303850 - (view) | Author: STINNER Victor (vstinner) * | Date: 2017-10-06 20:12 | |
"Maybe we should prevent collection of garbage with circular references (that has __del__() or weakref callbacks) from PyObject_GC_New()?" That would be a major change in the garbage collector. I would prefer to not touch the GC, any change can introduce a complex regression. Running the GC when an object is allocated makes sense to me. It's to reclaim memory, and prevent a MemoryError which would only be caused by "missed GC". |
|||
msg303856 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * | Date: 2017-10-06 20:43 | |
This is different case than mutating a dict while iterate it. In this case the failure is caused by GC, and it is always hard to handle such issues. In case of a dict you can just copy it before iterating. But what to do with RuntimeError from slicing a list if copying a list is vulnerable to the same issue? The example of yet one solution you can see in dict_keys() in dictobject.c. I always wondered why this code is needed. |
|||
msg303857 - (view) | Author: STINNER Victor (vstinner) * | Date: 2017-10-06 20:46 | |
> This is different case than mutating a dict while iterate it. In this case the failure is caused by GC, and it is always hard to handle such issues. In case of a dict you can just copy it before iterating. But what to do with RuntimeError from slicing a list if copying a list is vulnerable to the same issue? This issue was found by fuzzing with weird destructor. I don't think that anyone will hit the bug with "normal" code in the wild. Otherwise, we would have get a bug report before this one. |
|||
msg303861 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * | Date: 2017-10-06 21:03 | |
Or the bug is so hard for reproducing, that nobody had enough information for reporting. |
|||
msg303881 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * | Date: 2017-10-07 15:02 | |
See a4dd011259fa6f3079bd0efd95b3a136c0e3c190. The commit message: Tentative fix for a problem that Tim discovered at the last moment, and reported to python-dev: because we were calling dict_resize() in PyDict_Next(), and because GC's dict_traverse() uses PyDict_Next(), and because PyTuple_New() can cause GC, and because dict_items() calls PyTuple_New(), it was possible for dict_items() to have the dict resized right under its nose. The solution is convoluted, and touches several places: keys(), values(), items(), popitem(), PyDict_Next(), and PyDict_SetItem(). There are two parts to it. First, we no longer call dict_resize() in PyDict_Next(), which seems to solve the immediate problem. But then PyDict_SetItem() must have a different policy about when *it* calls dict_resize(), because we want to guarantee (e.g. for an algorithm that Jeremy uses in the compiler) that you can loop over a dict using PyDict_Next() and make changes to the dict as long as those changes are only value replacements for existing keys using PyDict_SetItem(). This is done by resizing *after* the insertion instead of before, and by remembering the size before we insert the item, and if the size is still the same, we don't bother to even check if we might need to resize. An additional detail is that if the dict starts out empty, we must still resize it before the insertion. That was the first part. :-) The second part is to make keys(), values(), items(), and popitem() safe against side effects on the dict caused by allocations, under the assumption that if the GC can cause arbitrary Python code to run, it can cause other threads to run, and it's not inconceivable that our dict could be resized -- it would be insane to write code that relies on this, but not all code is sane. Now, I have this nagging feeling that the loops in lookdict probably are blissfully assuming that doing a simple key comparison does not change the dict's size. This is not necessarily true (the keys could be class instances after all). But that's a battle for another day. We have the same issue with lists. PR 3915 tries to fix it by applying the same solution -- calling PyList_New() again if the source container was resized. list_slice() no longer can be considered safe, because it uses the size calculated before calling PyList_New(). Added _PyList_Copy() for copying the list for replacing unsafe PyList_GetSlice(). PyList_SetSlice() is not safe too (the PR still not fixes this). The code that uses the combination of PyList_GetSlice() and PyList_SetSlice() for safety (like in _asynciomodule.c) is not safe. Many code, including most implementations of slicing, should be rewritten if we go this way. PR 3915 shows only small example of such changes. I think than changing the Garbage Collector would be easier. |
|||
msg303882 - (view) | Author: Antoine Pitrou (pitrou) * | Date: 2017-10-07 15:26 | |
> I think than changing the Garbage Collector would be easier. What does "changing" mean exactly? What will be the effects on normal code? How do you know that it will not create new problems that didn't exist before? |
|||
msg303890 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * | Date: 2017-10-07 18:18 | |
> What does "changing" mean exactly? I'm not a GC expert. Maybe we should add a global flag that disable calling nontrivial destructors and set it in PyObject_GC_New(). The objects with nontrivial destructor should be added to the special queue and destroyed in "safe" place. There may be a problem with determining the "safe" place. I think that any Py_DECREF() is a "safe" place, and the eval loop is a "safe" place. > What will be the effects on normal code? I think this shouldn't affect Python code if perform deferred destroying in the eval loop. This can affect the execution of extensions if reference loops are created in C code. Some objects in loops can live longer that before. > How do you know that it will not create new problems that didn't exist before? I don't know. But we can try and see what is the result. |
|||
msg404244 - (view) | Author: Irit Katriel (iritkatriel) * | Date: 2021-10-18 23:16 | |
Reproduced on 3.11. |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:58:49 | admin | set | github: 75348 |
2021-10-18 23:16:49 | iritkatriel | set | nosy:
+ iritkatriel title: list_slice() does crash if the list is mutated indirectly by PyList_New() -> list_slice() crashes if the list is mutated indirectly by PyList_New() messages: + msg404244 versions: + Python 3.9, Python 3.10, Python 3.11, - Python 3.7 |
2017-10-07 18:18:17 | serhiy.storchaka | set | messages: + msg303890 |
2017-10-07 15:26:32 | pitrou | set | messages: + msg303882 |
2017-10-07 15:02:02 | serhiy.storchaka | set | messages: + msg303881 |
2017-10-07 14:50:31 | serhiy.storchaka | set | pull_requests: + pull_request3888 |
2017-10-06 21:03:11 | serhiy.storchaka | set | messages: + msg303861 |
2017-10-06 20:46:00 | vstinner | set | messages: + msg303857 |
2017-10-06 20:43:42 | serhiy.storchaka | set | messages: + msg303856 |
2017-10-06 20:13:27 | vstinner | set | title: null pointer deref and segfault in list_slice (listobject.c:455) -> list_slice() does crash if the list is mutated indirectly by PyList_New() |
2017-10-06 20:12:48 | vstinner | set | messages: + msg303850 |
2017-10-06 20:10:33 | vstinner | set | messages: + msg303849 |
2017-10-06 20:09:19 | vstinner | set | keywords:
+ patch stage: patch review pull_requests: + pull_request3884 |
2017-10-06 15:53:05 | serhiy.storchaka | set | nosy:
+ twouters, pitrou, tim.peters, vstinner, serhiy.storchaka, lemburg messages: + msg303830 |
2017-10-06 09:58:50 | Oren Milman | set | messages: + msg303813 |
2017-10-06 09:49:19 | Oren Milman | set | nosy:
+ Oren Milman messages: + msg303811 |
2017-08-10 02:02:53 | geeknik | set | type: crash |
2017-08-10 01:43:02 | geeknik | create |