classification
Title: ElementTree crash while cleaning up ParseError
Type: crash Stage: resolved
Components: Extension Modules Versions: Python 3.7, Python 3.6
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: eli.bendersky, jkloth, scoder, serhiy.storchaka, vstinner
Priority: normal Keywords: patch

Created on 2017-09-17 17:27 by scoder, last changed 2017-09-18 13:25 by vstinner. This issue is now closed.

Files
File name Uploaded Description Edit
bug1.py vstinner, 2017-09-18 07:00
bug2.py vstinner, 2017-09-18 07:00
test_issue31499.diff serhiy.storchaka, 2017-09-18 09:37
Pull Requests
URL Status Linked Edit
PR 3641 merged vstinner, 2017-09-18 08:59
PR 3645 merged python-dev, 2017-09-18 12:29
Messages (17)
msg302375 - (view) Author: Stefan Behnel (scoder) * Date: 2017-09-17 17:27
I'm seeing crashes in the latest Py3.7 when I run this test (taken from lxml's compatibility test suite):

    etree = xml.etree.ElementTree

    def test_feed_parser_error_position(self):
        ParseError = etree.ParseError
        parser = XMLParser()
        try:
            parser.close()
        except ParseError:
            e = sys.exc_info()[1]
            self.assertNotEqual(None, e.code)
            self.assertNotEqual(0, e.code)
            self.assertTrue(isinstance(e.position, tuple))
            self.assertTrue(e.position >= (0, 0))

It crashes in expat/xmlparse.c, line 1464:

1459      for (;;) {
1460        BINDING *b = bindings;
1461        if (!b)
1462          break;
1463        bindings = b->nextTagBinding;
1464        FREE(b->uri);        // <<<<<<<<<<<<<<< crashes here
1465        FREE(b);
1466      }
1467    }

Probably related to the new expat version (issue 31170).
msg302383 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-09-17 19:18
I can't reproduce a crash. Could you please provide a Python script?
msg302384 - (view) Author: Stefan Behnel (scoder) * Date: 2017-09-17 20:06
Sorry, wrong line number. Was using an installed Py3.7, not a fresh build. However, my crashing installed version is from September 1st, *before* the expat update, which was apparently on September 5th.

With a clean debug build, I get a reproducible crash during shutdown, now with the correct line number:

Program received signal SIGSEGV, Segmentation fault.
PyExpat_XML_ParserFree (parser=0xd01a20) at .../Modules/expat/xmlparse.c:1487
1487        tagList = tagList->parent;
(gdb) p taglist
No symbol "taglist" in current context.
(gdb) list
1482            break;
1483          tagList = freeTagList;
1484          freeTagList = NULL;
1485        }
1486        p = tagList;
1487        tagList = tagList->parent;
1488        FREE(p->buf);
1489        destroyBindings(p->bindings, parser);
1490        FREE(p);
1491      }
(gdb) bt
#0  PyExpat_XML_ParserFree (parser=0xd01a20) at Modules/expat/xmlparse.c:1487
#1  0x00007fffeec83517 in xmlparser_gc_clear (self=0x7fffeeeb75d8) at Modules/_elementtree.c:3414
#2  xmlparser_dealloc (self=0x7fffeeeb75d8) at Modules/_elementtree.c:3435
#3  0x000000000041fd40 in frame_dealloc (f=0xd017c8) at Objects/frameobject.c:428
#4  0x000000000044f5e5 in tb_dealloc (tb=0x7ffff632d218) at Python/traceback.c:56
#5  0x000000000049c59d in BaseException_clear (self=0x7ffff6567dd8) at Objects/exceptions.c:78
#6  SyntaxError_clear (self=0x7ffff6567dd8) at Objects/exceptions.c:1381
#7  0x0000000000458104 in delete_garbage (old=<optimized out>, collectable=<optimized out>) at Modules/gcmodule.c:759
#8  collect (generation=generation@entry=2, n_collected=n_collected@entry=0x7fffffffd6c8, n_uncollectable=n_uncollectable@entry=0x7fffffffd6d0, nofail=nofail@entry=0) at Modules/gcmodule.c:911
#9  0x0000000000458d2b in collect_with_callback (generation=2) at Modules/gcmodule.c:1020
#10 PyGC_Collect () at Modules/gcmodule.c:1507
#11 0x0000000000439aaf in Py_FinalizeEx () at Python/pylifecycle.c:1002
#12 0x000000000043b3d8 in Py_Exit (sts=sts@entry=0) at Python/pylifecycle.c:1953
#13 0x000000000043f4e8 in handle_system_exit () at Python/pythonrun.c:608
#14 0x000000000043f94d in handle_system_exit () at Python/pythonrun.c:635
#15 PyErr_PrintEx (set_sys_last_vars=set_sys_last_vars@entry=1) at Python/pythonrun.c:618
#16 0x000000000043fc4a in PyErr_Print () at Python/pythonrun.c:514
#17 0x0000000000441475 in PyRun_SimpleFileExFlags (fp=fp@entry=0x9dc050, filename=<optimized out>, closeit=closeit@entry=1, flags=flags@entry=0x7fffffffd8e0) at Python/pythonrun.c:407
#18 0x0000000000441633 in PyRun_AnyFileExFlags (fp=fp@entry=0x9dc050, filename=<optimized out>, closeit=closeit@entry=1, flags=flags@entry=0x7fffffffd8e0) at Python/pythonrun.c:83
#19 0x000000000045666c in run_file (p_cf=0x7fffffffd8e0, filename=0x9973d0 L"test.py", fp=0x9dc050) at Modules/main.c:341
#20 Py_Main (argc=argc@entry=6, argv=argv@entry=0x996020) at Modules/main.c:895
#21 0x000000000041fa2b in main (argc=6, argv=<optimized out>) at ./Programs/python.c:102

I get it in lxml's test suite runs, will try to reduce it to a test script.
msg302385 - (view) Author: Stefan Behnel (scoder) * Date: 2017-09-17 20:28
Minimal reproducer seems to be this:
----------
import xml.etree.ElementTree as etree

def test():
    parser = etree.XMLParser()
    try:
        parser.close()
    except etree.ParseError as exc:
        e = exc  # must keep local reference!

test()
----------

The master gitrev I tested is 132a7d7cdbc7cb89fa1c1f4e8192241c3d68f549

Since this happens during GC and finalisation, it's quite possible that it's not a bug in ET per se. It might just be a prematurely cleared reference in some tp_clear(), or something like that. (Changing ticket title appropriately.)
msg302411 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-09-18 07:00
I confirm the crash using attached bug1.py (first example, completed) and bug2.py (second example).
msg302412 - (view) Author: Stefan Behnel (scoder) * Date: 2017-09-18 07:13
Thanks for confirming, Victor.
I hadn't realised that the first update of expat was already back in June. That means it's not ruled out yet as a source of this crash. Bisecting is probably a good idea.
msg302414 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-09-18 07:28
The bug is at this line:

Breakpoint 6, xmlparser_gc_clear (self=0x7ffff7e28c08) at /home/haypo/prog/python/master/Modules/_elementtree.c:3414


static int
xmlparser_gc_clear(XMLParserObject *self)
{
    EXPAT(ParserFree)(self->parser);   // <--- HERE
    ...
}

This function calls XML_ParserFree() twice on the same parser object. The first call is fine and frees the memory. Since we now use Python memory allocators, XML_ParserFree() fills the freed memory with 0xDB byte pattern (when Python is in debug mode).

The second XML_ParserFree() call uses freed memory (filled with 0xDB in debug mode).

Call 1: a GC collection

Breakpoint 6, xmlparser_gc_clear (self=0x7ffff7e28c08) at /home/haypo/prog/python/master/Modules/_elementtree.c:3414
(gdb) up
#1  0x0000000000446636 in delete_garbage (collectable=0x7fffffffd9a0, old=0x9b8f90 <_PyRuntime+432>) at Modules/gcmodule.c:759
(gdb) up
#2  0x0000000000446ade in collect (generation=2, n_collected=0x7fffffffda30, n_uncollectable=0x7fffffffda28, nofail=0) at Modules/gcmodule.c:911
(gdb) cont
Continuing.


Call 2: xmlparser_dealloc()

Breakpoint 6, xmlparser_gc_clear (self=0x7ffff7e28c08) at /home/haypo/prog/python/master/Modules/_elementtree.c:3414
(gdb) up
#1  0x00007ffff0038cb8 in xmlparser_dealloc (self=0x7ffff7e28c08) at /home/haypo/prog/python/master/Modules/_elementtree.c:3435


IMHO it's an obvious bug in Python. The question is more why/how the code didn't crash before? :-)
msg302415 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-09-18 07:32
Python 2.7 is not affected because it doesn't implement tp_clear (it doesn't have xmlparser_gc_clear()), only xmlparser_dealloc() calls EXPAT(ParserFree)(self->parser).

I'm unable to reproduce the bug in Python 3.5 nor 3.6.

bug2.py creates a reference cycle the "except etree.ParseError as exc: e = exc  # must keep local reference!" which requires to trigger a garbage collection to clear the "parser" variable.
msg302416 - (view) Author: Stefan Behnel (scoder) * Date: 2017-09-18 07:34
> The question is more why/how the code didn't crash before? :-)

Typical case of a Schroedinbug.
msg302426 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-09-18 08:45
>> The question is more why/how the code didn't crash before? :-)
>
> Typical case of a Schroedinbug.

I don't believe in the chaos :-)

I ran a "git bisect" since January 1st, 2017. Attached bug2.py started to crash since the following commit related to bpo-29049.
---
5a625d0aa6a6d9ec6574ee8344b41d63dcb9897e is the first bad commit
commit 5a625d0aa6a6d9ec6574ee8344b41d63dcb9897e
Author: INADA Naoki <songofacandy@gmail.com>
Date:   Sat Dec 24 20:19:08 2016 +0900

    Issue #29049: Call _PyObject_GC_TRACK() lazily when calling Python function.
    
    Calling function is up to 5% faster.
---

This change is also correct. It's more that the change showed a bug which was hidden before.

It's just that now the garbage collector breaks the reference cycle differently since frames tracked differently by the GC.
msg302430 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-09-18 09:00
Attached PR 3641 fixes bug1.py and bug2.py crashes.

Sadly, I failed to write a reliable unit test using bug2.py. The bug requires to trigger the garbage collector in a specific order which depends on how frames are tracked by the GC...
msg302432 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-09-18 09:20
def test_issue31499(self):
    def test():
        ...
    test()
    test.support.gc_collect()
msg302434 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-09-18 09:21
Serhiy Storchaka: I tried your pattern, but failed to write a reliable unit test. Can you please write a full patch / test?
msg302441 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-09-18 09:49
test_issue31499.diff: Oh great, it works (to reproduce the crash). I modified your test and included it in my PR, I added you as a co-author of my PR ;-)
msg302447 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-09-18 12:29
New changeset e727d41ffcd91b21ce82026ec8c8381d34a16209 by Victor Stinner in branch 'master':
bpo-31499, xml.etree: Fix xmlparser_gc_clear() crash (#3641)
https://github.com/python/cpython/commit/e727d41ffcd91b21ce82026ec8c8381d34a16209
msg302449 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-09-18 12:48
New changeset 8afd7ab12d7f8915b549cf04af384b495ec73d22 by Victor Stinner (Miss Islington (bot)) in branch '3.6':
[3.6] bpo-31499, xml.etree: Fix xmlparser_gc_clear() crash (GH-3641) (#3645)
https://github.com/python/cpython/commit/8afd7ab12d7f8915b549cf04af384b495ec73d22
msg302451 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-09-18 13:25
The bug is now fixed in Python 3.6 and master. Thanks for the bug report and analysis, Stefan Behnel!
History
Date User Action Args
2017-09-18 13:25:23vstinnersetstatus: open -> closed
resolution: fixed
messages: + msg302451

stage: patch review -> resolved
2017-09-18 12:48:26vstinnersetmessages: + msg302449
2017-09-18 12:29:51python-devsetpull_requests: + pull_request3639
2017-09-18 12:29:40vstinnersetmessages: + msg302447
2017-09-18 09:49:01vstinnersetmessages: + msg302441
2017-09-18 09:37:05serhiy.storchakasetfiles: + test_issue31499.diff
2017-09-18 09:21:28vstinnersetmessages: + msg302434
2017-09-18 09:20:38serhiy.storchakasetmessages: + msg302432
2017-09-18 09:00:54vstinnersetmessages: + msg302430
versions: + Python 3.6
2017-09-18 08:59:06vstinnersetkeywords: + patch
stage: patch review
pull_requests: + pull_request3636
2017-09-18 08:45:05vstinnersetmessages: + msg302426
2017-09-18 07:34:00scodersetmessages: + msg302416
2017-09-18 07:32:39vstinnersetmessages: + msg302415
2017-09-18 07:28:28vstinnersetmessages: + msg302414
2017-09-18 07:13:52scodersetmessages: + msg302412
2017-09-18 07:00:17vstinnersetfiles: + bug2.py
2017-09-18 07:00:12vstinnersetfiles: + bug1.py

messages: + msg302411
2017-09-17 20:28:16scodersetmessages: + msg302385
title: ElementTree crash with new expat -> ElementTree crash while cleaning up ParseError
2017-09-17 20:06:12scodersetmessages: + msg302384
2017-09-17 19:18:39serhiy.storchakasetmessages: + msg302383
2017-09-17 18:50:07jklothsetnosy: + jkloth
2017-09-17 17:27:08scodercreate