New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ElementTree crash while cleaning up ParseError #75680
Comments
I'm seeing crashes in the latest Py3.7 when I run this test (taken from lxml's compatibility test suite): etree = xml.etree.ElementTree
def test_feed_parser_error_position(self):
ParseError = etree.ParseError
parser = XMLParser()
try:
parser.close()
except ParseError:
e = sys.exc_info()[1]
self.assertNotEqual(None, e.code)
self.assertNotEqual(0, e.code)
self.assertTrue(isinstance(e.position, tuple))
self.assertTrue(e.position >= (0, 0)) It crashes in expat/xmlparse.c, line 1464: 1459 for (;;) { Probably related to the new expat version (bpo-31170). |
I can't reproduce a crash. Could you please provide a Python script? |
Sorry, wrong line number. Was using an installed Py3.7, not a fresh build. However, my crashing installed version is from September 1st, *before* the expat update, which was apparently on September 5th. With a clean debug build, I get a reproducible crash during shutdown, now with the correct line number: Program received signal SIGSEGV, Segmentation fault. I get it in lxml's test suite runs, will try to reduce it to a test script. |
Minimal reproducer seems to be this: import xml.etree.ElementTree as etree
def test():
parser = etree.XMLParser()
try:
parser.close()
except etree.ParseError as exc:
e = exc # must keep local reference!
test() The master gitrev I tested is 132a7d7 Since this happens during GC and finalisation, it's quite possible that it's not a bug in ET per se. It might just be a prematurely cleared reference in some tp_clear(), or something like that. (Changing ticket title appropriately.) |
I confirm the crash using attached bug1.py (first example, completed) and bug2.py (second example). |
Thanks for confirming, Victor. |
The bug is at this line: Breakpoint 6, xmlparser_gc_clear (self=0x7ffff7e28c08) at /home/haypo/prog/python/master/Modules/_elementtree.c:3414 static int
xmlparser_gc_clear(XMLParserObject *self)
{
EXPAT(ParserFree)(self->parser); // <--- HERE
...
} This function calls XML_ParserFree() twice on the same parser object. The first call is fine and frees the memory. Since we now use Python memory allocators, XML_ParserFree() fills the freed memory with 0xDB byte pattern (when Python is in debug mode). The second XML_ParserFree() call uses freed memory (filled with 0xDB in debug mode). Call 1: a GC collection Breakpoint 6, xmlparser_gc_clear (self=0x7ffff7e28c08) at /home/haypo/prog/python/master/Modules/_elementtree.c:3414 Call 2: xmlparser_dealloc() Breakpoint 6, xmlparser_gc_clear (self=0x7ffff7e28c08) at /home/haypo/prog/python/master/Modules/_elementtree.c:3414 IMHO it's an obvious bug in Python. The question is more why/how the code didn't crash before? :-) |
Python 2.7 is not affected because it doesn't implement tp_clear (it doesn't have xmlparser_gc_clear()), only xmlparser_dealloc() calls EXPAT(ParserFree)(self->parser). I'm unable to reproduce the bug in Python 3.5 nor 3.6. bug2.py creates a reference cycle the "except etree.ParseError as exc: e = exc # must keep local reference!" which requires to trigger a garbage collection to clear the "parser" variable. |
Typical case of a Schroedinbug. |
I don't believe in the chaos :-) I ran a "git bisect" since January 1st, 2017. Attached bug2.py started to crash since the following commit related to bpo-29049.
--- This change is also correct. It's more that the change showed a bug which was hidden before. It's just that now the garbage collector breaks the reference cycle differently since frames tracked differently by the GC. |
Attached PR 3641 fixes bug1.py and bug2.py crashes. Sadly, I failed to write a reliable unit test using bug2.py. The bug requires to trigger the garbage collector in a specific order which depends on how frames are tracked by the GC... |
def test_issue31499(self):
def test():
...
test()
test.support.gc_collect() |
Serhiy Storchaka: I tried your pattern, but failed to write a reliable unit test. Can you please write a full patch / test? |
test_issue31499.diff: Oh great, it works (to reproduce the crash). I modified your test and included it in my PR, I added you as a co-author of my PR ;-) |
The bug is now fixed in Python 3.6 and master. Thanks for the bug report and analysis, Stefan Behnel! |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: