
classification
Title: ElementTree segmentation fault in expat_start_ns_handler
Type: crash Stage: resolved
Components: Library (Lib) Versions: Python 3.3, Python 3.4, Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: Yann.Diorcet, christian.heimes, eli.bendersky, fdrake, python-dev, scoder, vajrasky, vstinner
Priority: normal Keywords: patch

Created on 2013-11-27 17:26 by Yann.Diorcet, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name Uploaded Description
trace Yann.Diorcet, 2013-11-27 17:26 Trace of the gdb backtrace
aa.tar.gz Yann.Diorcet, 2013-11-27 20:40 An example
empty_uri.patch christian.heimes, 2013-11-27 22:20
fix_xml_etree_with_empty_namespace.patch vajrasky, 2013-11-28 07:18 The fix is by Christian Heimes. The unit test is by Vajrasky Kok.
Messages (9)
msg204601 - (view) Author: Yann Diorcet (Yann.Diorcet) Date: 2013-11-27 17:26
I ran into a bug in ElementTree in Python 2.7.5 (default, Nov 12 2013, 16:18:04).

The bug seems to be here: http://hg.python.org/cpython/file/ab05e7dd2788/Modules/_elementtree.c#l2341

uri is NULL and is not checked before being passed to strlen.

It may be related to my expat version:
expat.i686                             2.1.0-5.fc19                    @fedora  
expat.x86_64                           2.1.0-5.fc19                    @anaconda
msg204602 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2013-11-27 17:28
Can you please provide us with the script wadl.py? If that's not possible, can you please try to write a short Python script reproducing the crash?
msg204603 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2013-11-27 17:34
Indeed, uri might be null: http://hg.python.org/cpython/file/ab05e7dd2788/Modules/expat/xmlparse.c#l3157
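
For illustration only (not part of the original message): the NULL uri can also be observed from Python through pyexpat, which hands NULL strings to handlers as None. This is a minimal sketch; the two-element document and the helper name collect_ns_declarations are made up for the example, and only the namespace URI is taken from the reporter's file.

```python
import xml.parsers.expat

# Hypothetical minimal document: an outer default namespace that an inner
# element resets with xmlns="" (the same pattern as in the reported WADL file).
DOC = b'<a xmlns="http://wadl.dev.java.net/2009/02"><b xmlns=""/></a>'


def collect_ns_declarations(data):
    """Return the (prefix, uri) pairs expat reports for namespace declarations."""
    seen = []
    # Namespace processing is only enabled when a separator is given.
    parser = xml.parsers.expat.ParserCreate(namespace_separator=" ")
    parser.StartNamespaceDeclHandler = lambda prefix, uri: seen.append((prefix, uri))
    parser.Parse(data, True)
    return seen


declarations = collect_ns_declarations(DOC)
print(declarations)
```

The second pair should have None for both prefix and uri, which corresponds to the prefix=0x0, uri=0x0 arguments a C-level handler receives for xmlns="".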
msg204617 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2013-11-27 21:47
Thanks, I'm able to reproduce the crash using aa.tar.gz.

Python traceback on the crash:

(gdb) py-bt
Traceback (most recent call first):
  File "/home/haypo/prog/python/default/Lib/xml/etree/ElementTree.py", line 1235, in feed
    self._parser.feed(data)
  File "/home/haypo/prog/python/default/Lib/xml/etree/ElementTree.py", line 1304, in __next__
    self._parser.feed(data)
  File "abcd.py", line 18, in _from_stream
    for event, elem in ET.iterparse(stream, events):
  File "abcd.py", line 30, in <module>
    _from_stream("aa.wadl")

C traceback in gdb:

(gdb) where
#0  0x00007ffff7258491 in __strlen_sse2_pminub () from /lib64/libc.so.6

#1  0x00007ffff06124d8 in expat_start_ns_handler (self=0x7ffff0ad6d68, prefix=0x0, uri=0x0) at /home/haypo/prog/python/default/Modules/_elementtree.c:3041

#2  0x00007ffff03d7fc7 in addBinding (parser=0xa8eea0, prefix=0x7ffff7f31c28, attId=0x7ffff0bbc190, uri=0xaa3720 "", bindingsPtr=0x7fffffff6b58) at /home/haypo/prog/python/default/Modules/expat/xmlparse.c:3158

#3  0x00007ffff03d6de0 in storeAtts (parser=0xa8eea0, enc=0x7ffff06011e0 <utf8_encoding_ns>, attStr=0xaa4170 "<ns2:representation xmlns:ns2=\"http://wadl.dev.java.net/2009/02\" xmlns=\"\" element=\"org\" mediaType=\"application/xml\"/>\n", ' ' <repeats 24 times>, "<ns2:representation xmlns:ns2=\"http://wadl.dev.java.net/20"..., tagNamePtr=0x7fffffff6b10, bindingsPtr=0x7fffffff6b58) at /home/haypo/prog/python/default/Modules/expat/xmlparse.c:2820

#4  0x00007ffff03d5b3f in doContent (parser=0xa8eea0, startTagLevel=0, enc=0x7ffff06011e0 <utf8_encoding_ns>, s=0xaa4170 "<ns2:representation xmlns:ns2=\"http://wadl.dev.java.net/2009/02\" xmlns=\"\" element=\"org\" mediaType=\"application/xml\"/>\n", ' ' <repeats 24 times>, "<ns2:representation xmlns:ns2=\"http://wadl.dev.java.net/20"..., end=0xaa42fa '\313' <repeats 199 times>, <incomplete sequence \313>..., nextPtr=0xa8eed0, haveMore=1 '\001') at /home/haypo/prog/python/default/Modules/expat/xmlparse.c:2464

#5  0x00007ffff03d4b7e in contentProcessor (parser=0xa8eea0, start=0xaa3fd8 "<application xmlns=\"http://wadl.dev.java.net/2009/02\">\n    <grammars>\n        <include href=\"application.wadl/xsd0.xsd\">\n", ' ' <repeats 12 times>, "<doc title=\"Generated\" xml:lang=\"en\"/>\n        </include>\n    </gra"..., end=0xaa42fa '\313' <repeats 199 times>, <incomplete sequence \313>..., endPtr=0xa8eed0) at /home/haypo/prog/python/default/Modules/expat/xmlparse.c:2105

#6  0x00007ffff03d9d54 in doProlog (parser=0xa8eea0, enc=0x7ffff06011e0 <utf8_encoding_ns>, s=0xaa3fd8 "<application xmlns=\"http://wadl.dev.java.net/2009/02\">\n    <grammars>\n        <include href=\"application.wadl/xsd0.xsd\">\n", ' ' <repeats 12 times>, "<doc title=\"Generated\" xml:lang=\"en\"/>\n        </include>\n    </gra"..., end=0xaa42fa '\313' <repeats 199 times>, <incomplete sequence \313>..., tok=29, next=0xaa3fd8 "<application xmlns=\"http://wadl.dev.java.net/2009/02\">\n    <grammars>\n        <include href=\"application.wadl/xsd0.xsd\">\n", ' ' <repeats 12 times>, "<doc title=\"Generated\" xml:lang=\"en\"/>\n        </include>\n    </gra"..., nextPtr=0xa8eed0, haveMore=1 '\001') at /home/haypo/prog/python/default/Modules/expat/xmlparse.c:4016

#7  0x00007ffff03d9213 in prologProcessor (parser=0xa8eea0, s=0xaa3fa0 "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\n<application xmlns=\"http://wadl.dev.java.net/2009/02\">\n    <grammars>\n        <include href=\"application.wadl/xsd0.xsd\">\n", ' ' <repeats 12 times>, "<doc title="..., end=0xaa42fa '\313' <repeats 199 times>, <incomplete sequence \313>..., nextPtr=0xa8eed0) at /home/haypo/prog/python/default/Modules/expat/xmlparse.c:3739

#8  0x00007ffff03d8cdf in prologInitProcessor (parser=0xa8eea0, s=0xaa3fa0 "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\n<application xmlns=\"http://wadl.dev.java.net/2009/02\">\n    <grammars>\n        <include href=\"application.wadl/xsd0.xsd\">\n", ' ' <repeats 12 times>, "<doc title="..., end=0xaa42fa '\313' <repeats 199 times>, <incomplete sequence \313>..., nextPtr=0xa8eed0) at /home/haypo/prog/python/default/Modules/expat/xmlparse.c:3556

#9  0x00007ffff03d3e6a in XML_ParseBuffer (parser=0xa8eea0, len=858, isFinal=0) at /home/haypo/prog/python/default/Modules/expat/xmlparse.c:1651

#10 0x00007ffff03d3d30 in XML_Parse (parser=0xa8eea0, s=0xaa3330 "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\n<application xmlns=\"http://wadl.dev.java.net/2009/02\">\n    <grammars>\n        <include href=\"application.wadl/xsd0.xsd\">\n", ' ' <repeats 12 times>, "<doc title="..., len=858, isFinal=0) at /home/haypo/prog/python/default/Modules/expat/xmlparse.c:1617

#11 0x00007ffff0614356 in expat_parse (self=0x7ffff0ad6d68, data=0xaa3330 "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\n<application xmlns=\"http://wadl.dev.java.net/2009/02\">\n    <grammars>\n        <include href=\"application.wadl/xsd0.xsd\">\n", ' ' <repeats 12 times>, "<doc title="..., data_len=858, final=0) at /home/haypo/prog/python/default/Modules/_elementtree.c:3351

#12 0x00007ffff061470c in xmlparser_feed (self=0x7ffff0ad6d68, arg=b'<?xml version="1.0" encoding="UTF-8" standalone="yes"?>\n<application xmlns="http://wadl.dev.java.net/2009/02">\n    <grammars>\n        <include href="application.wadl/xsd0.xsd">\n            <doc title="Generated" xml:lang="en"/>\n        </include>\n    </grammars>\n    <resources base="sdfdsfdsf">\n        <resource>\n            <resource path="dfdfsddsf">\n                <method id="usersdfsfdsdf" name="PUT">\n                    <request>\n                        <ns2:representation xmlns:ns2="http://wadl.dev.java.net/2009/02" xmlns="" element="org" mediaType="application/xml"/>\n                        <ns2:representation xmlns:ns2="http://wadl.dev.java.net/2009/02" xmlns="" element="org" mediaType="application/json"/>\n                    </request>\n                </method>\n            </resource>\n        </resource>\n    </resources>\n</application>\n') at /home/haypo/prog/python/default/Modules/_elementtree.c:3423

#13 0x00000000005ad4fe in call_function (pp_stack=0x7fffffff7128, oparg=1) at Python/ceval.c:4212

#14 0x00000000005a5d29 in PyEval_EvalFrameEx (f=Frame 0xa38c18, for file /home/haypo/prog/python/default/Lib/xml/etree/ElementTree.py, line 1235, in feed (self=<XMLPullParser(_events_queue=[('start-ns', ('', 'http://wadl.dev.java.net/2009/02')), ('start', <xml.etree.ElementTree.Element at remote 0x7ffff0bb3858>), ('start', <xml.etree.ElementTree.Element at remote 0x7ffff088b458>), ('start', <xml.etree.ElementTree.Element at remote 0x7ffff081bb58>), ('start', <xml.etree.ElementTree.Element at remote 0x7ffff081bc58>), ('start', <xml.etree.ElementTree.Element at remote 0x7ffff081bd58>), ('start', <xml.etree.ElementTree.Element at remote 0x7ffff081be58>), ('start', <xml.etree.ElementTree.Element at remote 0x7ffff081bed8>), ('start', <xml.etree.ElementTree.Element at remote 0x7ffff081bf58>), ('start', <xml.etree.ElementTree.Element at remote 0x7ffff08200d8>), ('start-ns', ('ns2', 'http://wadl.dev.java.net/2009/02'))], _parser=<xml.etree.ElementTree.XMLParser at remote 0x7ffff0ad6d68>, _index=0) at remote 0x7ffff0bce468>, data=b'<?xml version="1.0" encoding="UTF...(truncated), throwflag=0) at Python/ceval.c:2826
msg204623 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2013-11-27 22:20
The patch removes the cause of the segfault, but I'm not sure if that's the right way. I'm adding Eli and Stefan to the ticket.
msg204644 - (view) Author: Vajrasky Kok (vajrasky) * Date: 2013-11-28 07:18
Here is the patch (by Christian Heimes) with unit test (by me).

Apparently the namespace handlers (start-ns and end-ns) have a problem with an empty namespace declaration. However, the crash only occurs when both events (start-ns and end-ns) are requested together; requesting start-ns alone does not trigger it.
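
A short script along these lines should exercise the same code path. It is a sketch, not the unit test from the patch: the document is a reduced, made-up version of the reporter's aa.wadl, and namespace_events is a hypothetical helper. On an interpreter without the fix it is expected to segfault instead of printing anything.

```python
import io
import xml.etree.ElementTree as ET

WADL_NS = "http://wadl.dev.java.net/2009/02"

# Reduced, hypothetical stand-in for aa.wadl: the inner element declares a
# prefixed namespace and resets the default namespace with xmlns="".
DOC = (
    '<application xmlns="%s">'
    '<ns2:representation xmlns:ns2="%s" xmlns="" element="org"/>'
    "</application>" % (WADL_NS, WADL_NS)
).encode("utf-8")


def namespace_events(data):
    """Collect start-ns and end-ns events, requested together as in the report."""
    events = ("start-ns", "end-ns")
    return list(ET.iterparse(io.BytesIO(data), events))


collected = namespace_events(DOC)
start_ns = [payload for event, payload in collected if event == "start-ns"]
end_ns = [payload for event, payload in collected if event == "end-ns"]
print(start_ns)
```

With the fix applied, the empty declaration should show up as a start-ns event whose payload is ('', '') rather than crashing the interpreter.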
msg204656 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2013-11-28 14:25
New changeset 395a266bcb5a by Eli Bendersky in branch '2.7':
Issue #19815: Fix segfault when parsing empty namespace declaration.
http://hg.python.org/cpython/rev/395a266bcb5a
msg204659 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2013-11-28 14:35
New changeset 68f1e5262a7a by Eli Bendersky in branch '3.3':
Issue #19815: Fix segfault when parsing empty namespace declaration.
http://hg.python.org/cpython/rev/68f1e5262a7a

New changeset 2b2925c08a6c by Eli Bendersky in branch 'default':
Issue #19815: Fix segfault when parsing empty namespace declaration.
http://hg.python.org/cpython/rev/2b2925c08a6c
msg204661 - (view) Author: Eli Bendersky (eli.bendersky) * (Python committer) Date: 2013-11-28 14:36
Thanks for the report & patches. Fixed in all active branches.
History
Date User Action Args
2022-04-11 14:57:54 admin set github: 64014
2013-11-28 14:36:16 eli.bendersky set status: open -> closed; resolution: fixed; messages: + msg204661; stage: needs patch -> resolved
2013-11-28 14:35:44 python-dev set messages: + msg204659
2013-11-28 14:25:48 python-dev set nosy: + python-dev; messages: + msg204656
2013-11-28 07:18:28 vajrasky set files: + fix_xml_etree_with_empty_namespace.patch; nosy: + vajrasky; messages: + msg204644
2013-11-27 22:20:56 vstinner set nosy: + fdrake
2013-11-27 22:20:00 christian.heimes set files: + empty_uri.patch; nosy: + scoder, eli.bendersky; messages: + msg204623; keywords: + patch
2013-11-27 21:47:07 vstinner set messages: + msg204617
2013-11-27 20:40:39 Yann.Diorcet set files: + aa.tar.gz
2013-11-27 17:34:31 christian.heimes set versions: + Python 3.3, Python 3.4; nosy: + christian.heimes; messages: + msg204603; stage: needs patch
2013-11-27 17:28:16 vstinner set nosy: + vstinner; messages: + msg204602
2013-11-27 17:27:02 Yann.Diorcet set type: crash
2013-11-27 17:26:48 Yann.Diorcet create