This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Destructor of ElementTree.Element is recursive
Type: crash Stage: resolved
Components: Extension Modules Versions: Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: serhiy.storchaka Nosy List: eli.bendersky, python-dev, scoder, serhiy.storchaka, vstinner
Priority: normal Keywords: patch

Created on 2016-12-04 23:03 by serhiy.storchaka, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
etree-trashcan.patch serhiy.storchaka, 2016-12-14 18:47 review
bug.py vstinner, 2016-12-28 02:11
Pull Requests
URL Status Linked Edit
PR 552 closed dstufft, 2017-03-31 16:36
Messages (9)
msg282376 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-12-04 23:03
The Element class in the xml.etree.ElementTree module is a collection. It can contain other Element's. But unlike to common Python collections (list, dict, etc) and pure Python classes, C implementation of Element doesn't support unlimited recursion. As result, destroying very deep Element tree can cause stack overflow. Example:

import xml.etree.cElementTree as ElementTree
y = x = ElementTree.Element('x')
for i in range(200000): y = ElementTree.SubElement(y, 'x')

del x
msg283211 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-12-14 18:47
Proposed patch fixes the crash.
msg283739 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2016-12-21 10:56
New changeset 957091874ea0 by Serhiy Storchaka in branch '3.5':
Issue #28871: Fixed a crash when deallocate deep ElementTree.
https://hg.python.org/cpython/rev/957091874ea0

New changeset 78bf34b6a713 by Serhiy Storchaka in branch '2.7':
Issue #28871: Fixed a crash when deallocate deep ElementTree.
https://hg.python.org/cpython/rev/78bf34b6a713
msg284145 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-12-28 02:11
bm_xml_etree.py benchmark started to crash on Python 2.7 because of the change 78bf34b6a713.

Python 2.7 @ 78bf34b6a713: bug.py does crash
Python 2.7 @ 32cc37a89b58: no crash

Full script: https://github.com/python/performance/blob/master/performance/benchmarks/bm_xml_etree.py
msg284146 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-12-28 02:15
haypo@selma$ gdb -args ./python ~/bug.py 
(gdb) run
python: Objects/object.c:2453: _PyTrash_thread_deposit_object: Assertion `PyObject_IS_GC(op)' failed.

Program received signal SIGABRT, Aborted.

(gdb) py-bt
Traceback (most recent call first):
  File "/home/haypo/bug.py", line 130, in bench_parse
    root1 = etree.parse(xml_file).getroot()
  File "/home/haypo/bug.py", line 171, in bench_etree
    bench_func(etree, file_path, xml_data, xml_root)
  File "/home/haypo/bug.py", line 197, in <module>
    bench_etree(1, ET, bench_func)



(gdb) where
#0  0x00007ffff711892f in raise () from /lib64/libc.so.6
#1  0x00007ffff711a52a in abort () from /lib64/libc.so.6
#2  0x00007ffff7110e37 in __assert_fail_base () from /lib64/libc.so.6
#3  0x00007ffff7110ee2 in __assert_fail () from /lib64/libc.so.6
#4  0x0000000000463abe in _PyTrash_thread_deposit_object (op=<Element at remote 0x7fffec9152e0>) at Objects/object.c:2453
#5  0x00007fffecd29b17 in element_dealloc (self=0x7fffec9152e0) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:566
#6  0x000000000046346a in _Py_Dealloc (op=<Element at remote 0x7fffec9152e0>) at Objects/object.c:2262
#7  0x00007fffecd29058 in element_dealloc_extra (self=0x7fffec915280) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:301
#8  0x00007fffecd29abc in element_dealloc (self=0x7fffec915280) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:561
#9  0x000000000046346a in _Py_Dealloc (op=<Element at remote 0x7fffec915280>) at Objects/object.c:2262
#10 0x00007fffecd29058 in element_dealloc_extra (self=0x7fffec915220) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:301
#11 0x00007fffecd29abc in element_dealloc (self=0x7fffec915220) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:561
#12 0x000000000046346a in _Py_Dealloc (op=<Element at remote 0x7fffec915220>) at Objects/object.c:2262
#13 0x00007fffecd29058 in element_dealloc_extra (self=0x7fffec9151c0) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:301
#14 0x00007fffecd29abc in element_dealloc (self=0x7fffec9151c0) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:561
#15 0x000000000046346a in _Py_Dealloc (op=<Element at remote 0x7fffec9151c0>) at Objects/object.c:2262
#16 0x00007fffecd29058 in element_dealloc_extra (self=0x7fffec915160) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:301
#17 0x00007fffecd29abc in element_dealloc (self=0x7fffec915160) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:561
#18 0x000000000046346a in _Py_Dealloc (op=<Element at remote 0x7fffec915160>) at Objects/object.c:2262
#19 0x00007fffecd29058 in element_dealloc_extra (self=0x7fffec915100) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:301
#20 0x00007fffecd29abc in element_dealloc (self=0x7fffec915100) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:561
#21 0x000000000046346a in _Py_Dealloc (op=<Element at remote 0x7fffec915100>) at Objects/object.c:2262
#22 0x00007fffecd29058 in element_dealloc_extra (self=0x7fffec9150a0) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:301
#23 0x00007fffecd29abc in element_dealloc (self=0x7fffec9150a0) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:561
#24 0x000000000046346a in _Py_Dealloc (op=<Element at remote 0x7fffec9150a0>) at Objects/object.c:2262
#25 0x00007fffecd29058 in element_dealloc_extra (self=0x7fffec915040) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:301
#26 0x00007fffecd29abc in element_dealloc (self=0x7fffec915040) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:561
#27 0x000000000046346a in _Py_Dealloc (op=<Element at remote 0x7fffec915040>) at Objects/object.c:2262
#28 0x00007fffecd29058 in element_dealloc_extra (self=0x7fffec90ffa0) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:301
#29 0x00007fffecd29abc in element_dealloc (self=0x7fffec90ffa0) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:561
#30 0x000000000046346a in _Py_Dealloc (op=<Element at remote 0x7fffec90ffa0>) at Objects/object.c:2262
#31 0x00007fffecd29058 in element_dealloc_extra (self=0x7fffec90ff40) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:301
#32 0x00007fffecd29abc in element_dealloc (self=0x7fffec90ff40) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:561
#33 0x000000000046346a in _Py_Dealloc (op=<Element at remote 0x7fffec90ff40>) at Objects/object.c:2262
#34 0x00007fffecd29058 in element_dealloc_extra (self=0x7fffec90fee0) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:301
#35 0x00007fffecd29abc in element_dealloc (self=0x7fffec90fee0) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:561
#36 0x000000000046346a in _Py_Dealloc (op=<Element at remote 0x7fffec90fee0>) at Objects/object.c:2262
#37 0x00007fffecd29058 in element_dealloc_extra (self=0x7fffec90fe80) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:301
#38 0x00007fffecd29abc in element_dealloc (self=0x7fffec90fe80) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:561
#39 0x000000000046346a in _Py_Dealloc (op=<Element at remote 0x7fffec90fe80>) at Objects/object.c:2262
#40 0x00007fffecd29058 in element_dealloc_extra (self=0x7fffec90fe20) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:301
#41 0x00007fffecd29abc in element_dealloc (self=0x7fffec90fe20) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:561
#42 0x000000000046346a in _Py_Dealloc (op=<Element at remote 0x7fffec90fe20>) at Objects/object.c:2262
#43 0x00007fffecd29058 in element_dealloc_extra (self=0x7fffec90fdc0) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:301
#44 0x00007fffecd29abc in element_dealloc (self=0x7fffec90fdc0) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:561
#45 0x000000000046346a in _Py_Dealloc (op=<Element at remote 0x7fffec90fdc0>) at Objects/object.c:2262
#46 0x00007fffecd29058 in element_dealloc_extra (self=0x7fffec90fd60) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:301
#47 0x00007fffecd29abc in element_dealloc (self=0x7fffec90fd60) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:561
#48 0x000000000046346a in _Py_Dealloc (op=<Element at remote 0x7fffec90fd60>) at Objects/object.c:2262
#49 0x00007fffecd29058 in element_dealloc_extra (self=0x7fffec90fd00) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:301
(...)
msg284155 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-12-28 07:13
Ah, I tested only with non-debug build in which asserts were ignored! In 2.7 Element doesn't support garbage collection, and the trashcan mechanism Py_TRASHCAN_SAFE_BEGIN/Py_TRASHCAN_SAFE_END can't be applied.

I see three alternatives:

1. Just revert the changes. Let deep ElementTree crashing.

2. Add the support of garbage collection. This will increase the size of empty Element by 1.5 times. This looks less appropriate that the first option since this harms working code.

3. Try to implement different mechanism. By using external list object as a stack or using other field for creating a linked list.

I'll revert the patch (except tests fix) and will try to implement different mechanism.
msg284203 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-12-28 22:12
This issue seems theorical to me, whereas the breakage of benchmarks is
very concrete for me. So I suggest to revert the change in Python 2.7.

(2) looks like the right design and it was implemented in Python 3 (no?).

I don't think that it's worth it to backport the change to Python 2. You
are the first one to report the issue and the backport is risky.
msg285365 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-01-13 07:31
Changes were reverted by 78bf34b6a713.

It is very uneasy to implement an alternative mechanism (without increasing the size of Element objects). It adds duplication of low level garbage collecting code. I think it is better to left all as is in 2.7. Yet one argument for moving to Python 3.
msg285367 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-01-13 07:50
> Yet one argument for moving to Python 3.

Yep, thanks ;-)
History
Date User Action Args
2022-04-11 14:58:40adminsetgithub: 73057
2017-03-31 16:36:23dstufftsetpull_requests: + pull_request975
2017-01-13 07:50:25vstinnersetmessages: + msg285367
2017-01-13 07:31:38serhiy.storchakasetstatus: open -> closed
resolution: fixed
messages: + msg285365

stage: needs patch -> resolved
2016-12-28 22:12:12vstinnersetmessages: + msg284203
2016-12-28 07:13:14serhiy.storchakasetstage: resolved -> needs patch
messages: + msg284155
versions: - Python 3.5, Python 3.6, Python 3.7
2016-12-28 02:15:10vstinnersetmessages: + msg284146
2016-12-28 02:11:48vstinnersetstatus: closed -> open
files: + bug.py

nosy: + vstinner
messages: + msg284145

resolution: fixed -> (no value)
2016-12-21 10:57:11serhiy.storchakasetstatus: open -> closed
assignee: serhiy.storchaka
resolution: fixed
stage: patch review -> resolved
2016-12-21 10:56:44python-devsetnosy: + python-dev
messages: + msg283739
2016-12-14 18:47:47serhiy.storchakasetfiles: + etree-trashcan.patch
keywords: + patch
messages: + msg283211

stage: needs patch -> patch review
2016-12-04 23:03:24serhiy.storchakacreate