classification
Title: expat infinite loop
Type: behavior Stage:
Components: XML Versions: Python 3.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: StyXman
Priority: normal Keywords:

Created on 2019-10-15 17:12 by StyXman, last changed 2019-10-15 17:12 by StyXman.

Files
File name Uploaded Description
expat.tar.gz StyXman, 2019-10-15 17:12 tarball with test script and files
Messages (1)
msg354747 - Author: Marcos Dione (StyXman) Date: 2019-10-15 17:12
I'm trying to add external entity support to xmltodict[1]. To do that, I extended the handler class with an ExternalEntityRefHandler. After reading a couple of files, the script locks up in a tight loop.
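
For readers unfamiliar with this part of pyexpat, here is a minimal sketch of what an ExternalEntityRefHandler can look like. It is not the code from the attached tarball or from xmltodict. The function name parse_with_entities and the in-memory files dict are hypothetical stand-ins for the real script, which reads the entities from disk. The sketch always creates the sub-parser from the parser that is currently delivering the callback.

```python
import xml.parsers.expat as expat


def parse_with_entities(main_xml, files):
    """Parse main_xml and resolve external entities from the `files` dict.

    `files` maps a system ID to the bytes of that entity (hypothetical
    stand-in for opening the file named by the system ID).
    Returns the element names in the order their start tags were seen.
    """
    events = []

    def start(name, attrs):
        events.append(name)

    def make_handler(parser):
        # Bind the handler to the parser that will invoke it, so the
        # sub-parser is created from that parser and no other.
        def external_entity_ref(context, base, system_id, public_id):
            child = parser.ExternalEntityParserCreate(context)
            # Set the handlers explicitly on the child as well.
            child.StartElementHandler = start
            child.ExternalEntityRefHandler = make_handler(child)
            child.Parse(files[system_id], True)
            return 1  # non-zero tells expat to continue parsing

        return external_entity_ref

    parser = expat.ParserCreate()
    parser.StartElementHandler = start
    parser.ExternalEntityRefHandler = make_handler(parser)
    parser.Parse(main_xml, True)
    return events


if __name__ == "__main__":
    doc = (b'<!DOCTYPE Map [<!ENTITY landuse SYSTEM "landuse.xml">]>'
           b'<Map>&landuse;</Map>')
    print(parse_with_entities(doc, {"landuse.xml": b'<Style name="landuse"/>'}))
```

The child parser stays referenced by the local variable until its Parse() call returns, so it cannot be collected while it is still in use.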

I ran the script under gdb (!!) and found that expat thinks that two of the parsers are each other's parent. I set up a breakpoint in XML_ExternalEntityParserCreate() (yes, this is expat, I know) right after the new parser takes the old parser as its parent (xmlparse.c:1279 on my system).

Here are the backtraces and values I found:

--- >8 ---
landuse-lowzoom None styles-otm/landuse-lowzoom.xml None

#0  XML_ExternalEntityParserCreate (oldParser=0xadc4d0, context=context@entry=0x7ffff6c871e0 "landuse-lowzoom", encodingName=encodingName@entry=0x0) at ../../src/lib/xmlparse.c:1281
#1  0x000000000044ec90 in pyexpat_xmlparser_ExternalEntityParserCreate_impl (encoding=0x0, context=0x7ffff6c871e0 "landuse-lowzoom", self=0x7ffff6d556e0) at ../Modules/pyexpat.c:943
#2  pyexpat_xmlparser_ExternalEntityParserCreate (self=0x7ffff6d556e0, args=<optimized out>, nargs=<optimized out>) at ../Modules/clinic/pyexpat.c.h:137
[...]
#15 0x000000000044d80d in my_ExternalEntityRefHandler (parser=<optimized out>, context=0xae1d2c "landuse-lowzoom", base=<optimized out>, systemId=<optimized out>, publicId=<optimized out>)
    at ../Modules/pyexpat.c:659
#16 0x00007ffff7d990c8 in doContent (parser=parser@entry=0xadc4d0, startTagLevel=startTagLevel@entry=0, enc=<optimized out>,
    s=s@entry=0xae08dd "<Map background-color=\"#e0e0e0\" srs=\"+proj=merc +a=6378137 +b=6378137 +lat_ts=0.0 +lon_0=0.0 +x_0=0.0 +y_0=0 +k=1.0 +units=m +nadgrids=@null +no_defs +over\" buffer-size=\"256\">\n\t<!-- style definitions "..., end=end@entry=0xae0ce6 '\001' <repeats 200 times>..., nextPtr=nextPtr@entry=0xadc500, haveMore=1 '\001') at ../../src/lib/xmlparse.c:2685
#17 0x00007ffff7d9957c in contentProcessor (parser=parser@entry=0xadc4d0,
    start=start@entry=0xae08dd "<Map background-color=\"#e0e0e0\" srs=\"+proj=merc +a=6378137 +b=6378137 +lat_ts=0.0 +lon_0=0.0 +x_0=0.0 +y_0=0 +k=1.0 +units=m +nadgrids=@null +no_defs +over\" buffer-size=\"256\">\n\t<!-- style definitions "..., end=end@entry=0xae0ce6 '\001' <repeats 200 times>..., endPtr=endPtr@entry=0xadc500) at ../../src/lib/xmlparse.c:2444
#18 0x00007ffff7d96a73 in doProlog (parser=parser@entry=0xadc4d0, enc=0x7ffff7db89e0 <utf8_encoding>,
    s=0xae08dd "<Map background-color=\"#e0e0e0\" srs=\"+proj=merc +a=6378137 +b=6378137 +lat_ts=0.0 +lon_0=0.0 +x_0=0.0 +y_0=0 +k=1.0 +units=m +nadgrids=@null +no_defs +over\" buffer-size=\"256\">\n\t<!-- style definitions "...,
    s@entry=0xae04e4 "text-water-lowzoom SYSTEM \"styles-otm/text-water-lowzoom.xml\">\n\t<!ENTITY text-glacier-lowzoom SYSTEM \"styles-otm/text-glacier-lowzoom.xml\">\n\t<!ENTITY text-natural-poly SYSTEM \"styles-otm/text-natural-"..., end=end@entry=0xae0ce6 '\001' <repeats 200 times>..., tok=29, next=<optimized out>, nextPtr=0xadc500, haveMore=1 '\001', allowClosingDoctype=1 '\001') at ../../src/lib/xmlparse.c:4371
#19 0x00007ffff7d97f3a in prologProcessor (parser=0xadc4d0,
    s=0xae04e4 "text-water-lowzoom SYSTEM \"styles-otm/text-water-lowzoom.xml\">\n\t<!ENTITY text-glacier-lowzoom SYSTEM \"styles-otm/text-glacier-lowzoom.xml\">\n\t<!ENTITY text-natural-poly SYSTEM \"styles-otm/text-natural-"..., end=0xae0ce6 '\001' <repeats 200 times>..., nextPtr=0xadc500) at ../../src/lib/xmlparse.c:4094
#20 0x00007ffff7d9bb1c in XML_ParseBuffer (isFinal=0, len=<optimized out>, parser=0xadc4d0) at ../../src/lib/xmlparse.c:1893
#21 XML_ParseBuffer (parser=0xadc4d0, len=len@entry=2048, isFinal=isFinal@entry=0) at ../../src/lib/xmlparse.c:1863
#22 0x000000000060886d in pyexpat_xmlparser_ParseFile (self=0x7ffff6d556e0, file=<optimized out>) at ../Modules/pyexpat.c:841

(gdb) print oldParser
$33 = (XML_Parser) 0xadc4d0
(gdb) print parser
$32 = (XML_Parser) 0xadecb0

7ffff6d556e0, 7ffff6d55750
<_io.BufferedReader name='styles-otm/landuse-lowzoom.xml'>
<_io.BufferedReader name='styles-otm/landuse-lowzoom.xml'>
landuse None styles-otm/landuse.xml None

#0  XML_ExternalEntityParserCreate (oldParser=0xadecb0, context=context@entry=0x7ffff6c88660 "landuse", encodingName=encodingName@entry=0x0) at ../../src/lib/xmlparse.c:1281
#1  0x000000000044ec90 in pyexpat_xmlparser_ExternalEntityParserCreate_impl (encoding=0x0, context=0x7ffff6c88660 "landuse", self=0x7ffff6d55750) at ../Modules/pyexpat.c:943
#2  pyexpat_xmlparser_ExternalEntityParserCreate (self=0x7ffff6d55750, args=<optimized out>, nargs=<optimized out>) at ../Modules/clinic/pyexpat.c.h:137
[...]
#15 0x000000000044d80d in my_ExternalEntityRefHandler (parser=<optimized out>, context=0xae1d2c "landuse", base=<optimized out>, systemId=<optimized out>, publicId=<optimized out>) at ../Modules/pyexpat.c:659
#16 0x00007ffff7d990c8 in doContent (parser=parser@entry=0xadc4d0, startTagLevel=startTagLevel@entry=0, enc=<optimized out>,
    s=s@entry=0xae08dd "<Map background-color=\"#e0e0e0\" srs=\"+proj=merc +a=6378137 +b=6378137 +lat_ts=0.0 +lon_0=0.0 +x_0=0.0 +y_0=0 +k=1.0 +units=m +nadgrids=@null +no_defs +over\" buffer-size=\"256\">\n\t<!-- style definitions "..., end=end@entry=0xae0ce6 '\001' <repeats 200 times>..., nextPtr=nextPtr@entry=0xadc500, haveMore=1 '\001') at ../../src/lib/xmlparse.c:2685
#17 0x00007ffff7d9957c in contentProcessor (parser=parser@entry=0xadc4d0,
    start=start@entry=0xae08dd "<Map background-color=\"#e0e0e0\" srs=\"+proj=merc +a=6378137 +b=6378137 +lat_ts=0.0 +lon_0=0.0 +x_0=0.0 +y_0=0 +k=1.0 +units=m +nadgrids=@null +no_defs +over\" buffer-size=\"256\">\n\t<!-- style definitions "..., end=end@entry=0xae0ce6 '\001' <repeats 200 times>..., endPtr=endPtr@entry=0xadc500) at ../../src/lib/xmlparse.c:2444
#18 0x00007ffff7d96a73 in doProlog (parser=parser@entry=0xadc4d0, enc=0x7ffff7db89e0 <utf8_encoding>,
    s=0xae08dd "<Map background-color=\"#e0e0e0\" srs=\"+proj=merc +a=6378137 +b=6378137 +lat_ts=0.0 +lon_0=0.0 +x_0=0.0 +y_0=0 +k=1.0 +units=m +nadgrids=@null +no_defs +over\" buffer-size=\"256\">\n\t<!-- style definitions "...,
    s@entry=0xae04e4 "text-water-lowzoom SYSTEM \"styles-otm/text-water-lowzoom.xml\">\n\t<!ENTITY text-glacier-lowzoom SYSTEM \"styles-otm/text-glacier-lowzoom.xml\">\n\t<!ENTITY text-natural-poly SYSTEM \"styles-otm/text-natural-"..., end=end@entry=0xae0ce6 '\001' <repeats 200 times>..., tok=29, next=<optimized out>, nextPtr=0xadc500, haveMore=1 '\001', allowClosingDoctype=1 '\001') at ../../src/lib/xmlparse.c:4371
#19 0x00007ffff7d97f3a in prologProcessor (parser=0xadc4d0,
    s=0xae04e4 "text-water-lowzoom SYSTEM \"styles-otm/text-water-lowzoom.xml\">\n\t<!ENTITY text-glacier-lowzoom SYSTEM \"styles-otm/text-glacier-lowzoom.xml\">\n\t<!ENTITY text-natural-poly SYSTEM \"styles-otm/text-natural-"..., end=0xae0ce6 '\001' <repeats 200 times>..., nextPtr=0xadc500) at ../../src/lib/xmlparse.c:4094
#20 0x00007ffff7d9bb1c in XML_ParseBuffer (isFinal=0, len=<optimized out>, parser=0xadc4d0) at ../../src/lib/xmlparse.c:1893
#21 XML_ParseBuffer (parser=0xadc4d0, len=len@entry=2048, isFinal=isFinal@entry=0) at ../../src/lib/xmlparse.c:1863
#22 0x000000000060886d in pyexpat_xmlparser_ParseFile (self=0x7ffff6d556e0, file=<optimized out>) at ../Modules/pyexpat.c:841

(gdb) print oldParser
$35 = (XML_Parser) 0xadecb0
(gdb) print parser
$34 = (XML_Parser) 0xae5e00

7ffff6d55750, 7ffff6d557c0
<_io.BufferedReader name='styles-otm/landuse.xml'>
<_io.BufferedReader name='styles-otm/landuse.xml'>
landuse-over-hillshade None styles-otm/landuse-over-hillshade.xml None

#0  XML_ExternalEntityParserCreate (oldParser=0xae5e00, context=context@entry=0x7ffff6c81a60 "landuse-over-hillshade", encodingName=encodingName@entry=0x0) at ../../src/lib/xmlparse.c:1281
#1  0x000000000044ec90 in pyexpat_xmlparser_ExternalEntityParserCreate_impl (encoding=0x0, context=0x7ffff6c81a60 "landuse-over-hillshade", self=0x7ffff6d557c0) at ../Modules/pyexpat.c:943
#2  pyexpat_xmlparser_ExternalEntityParserCreate (self=0x7ffff6d557c0, args=<optimized out>, nargs=<optimized out>) at ../Modules/clinic/pyexpat.c.h:137
[...]
#15 0x000000000044d80d in my_ExternalEntityRefHandler (parser=<optimized out>, context=0xae1d2c "landuse-over-hillshade", base=<optimized out>, systemId=<optimized out>, publicId=<optimized out>)
    at ../Modules/pyexpat.c:659
#16 0x00007ffff7d990c8 in doContent (parser=parser@entry=0xadc4d0, startTagLevel=startTagLevel@entry=0, enc=<optimized out>,
    s=s@entry=0xae08dd "<Map background-color=\"#e0e0e0\" srs=\"+proj=merc +a=6378137 +b=6378137 +lat_ts=0.0 +lon_0=0.0 +x_0=0.0 +y_0=0 +k=1.0 +units=m +nadgrids=@null +no_defs +over\" buffer-size=\"256\">\n\t<!-- style definitions "..., end=end@entry=0xae0ce6 '\001' <repeats 200 times>..., nextPtr=nextPtr@entry=0xadc500, haveMore=1 '\001') at ../../src/lib/xmlparse.c:2685
#17 0x00007ffff7d9957c in contentProcessor (parser=parser@entry=0xadc4d0,
    start=start@entry=0xae08dd "<Map background-color=\"#e0e0e0\" srs=\"+proj=merc +a=6378137 +b=6378137 +lat_ts=0.0 +lon_0=0.0 +x_0=0.0 +y_0=0 +k=1.0 +units=m +nadgrids=@null +no_defs +over\" buffer-size=\"256\">\n\t<!-- style definitions "..., end=end@entry=0xae0ce6 '\001' <repeats 200 times>..., endPtr=endPtr@entry=0xadc500) at ../../src/lib/xmlparse.c:2444
#18 0x00007ffff7d96a73 in doProlog (parser=parser@entry=0xadc4d0, enc=0x7ffff7db89e0 <utf8_encoding>,
    s=0xae08dd "<Map background-color=\"#e0e0e0\" srs=\"+proj=merc +a=6378137 +b=6378137 +lat_ts=0.0 +lon_0=0.0 +x_0=0.0 +y_0=0 +k=1.0 +units=m +nadgrids=@null +no_defs +over\" buffer-size=\"256\">\n\t<!-- style definitions "...,
    s@entry=0xae04e4 "text-water-lowzoom SYSTEM \"styles-otm/text-water-lowzoom.xml\">\n\t<!ENTITY text-glacier-lowzoom SYSTEM \"styles-otm/text-glacier-lowzoom.xml\">\n\t<!ENTITY text-natural-poly SYSTEM \"styles-otm/text-natural-"..., end=end@entry=0xae0ce6 '\001' <repeats 200 times>..., tok=29, next=<optimized out>, nextPtr=0xadc500, haveMore=1 '\001', allowClosingDoctype=1 '\001') at ../../src/lib/xmlparse.c:4371
#19 0x00007ffff7d97f3a in prologProcessor (parser=0xadc4d0,
    s=0xae04e4 "text-water-lowzoom SYSTEM \"styles-otm/text-water-lowzoom.xml\">\n\t<!ENTITY text-glacier-lowzoom SYSTEM \"styles-otm/text-glacier-lowzoom.xml\">\n\t<!ENTITY text-natural-poly SYSTEM \"styles-otm/text-natural-"..., end=0xae0ce6 '\001' <repeats 200 times>..., nextPtr=0xadc500) at ../../src/lib/xmlparse.c:4094
#20 0x00007ffff7d9bb1c in XML_ParseBuffer (isFinal=0, len=<optimized out>, parser=0xadc4d0) at ../../src/lib/xmlparse.c:1893
#21 XML_ParseBuffer (parser=0xadc4d0, len=len@entry=2048, isFinal=isFinal@entry=0) at ../../src/lib/xmlparse.c:1863
#22 0x000000000060886d in pyexpat_xmlparser_ParseFile (self=0x7ffff6d556e0, file=<optimized out>) at ../Modules/pyexpat.c:841

(gdb) print oldParser
$36 = (XML_Parser) 0xae5e00
(gdb) print parser
$37 = (XML_Parser) 0xadecb0
--- 8< ---

As I hope you can see, the last pair of values (parent 0xae5e00, new 0xadecb0) is the exact opposite of the previous pair (parent 0xadecb0, new 0xae5e00). Later, when get_hash_secret_salt() is called, it enters an infinite loop climbing up the parent ladder.
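
To make the cycle concrete, here is a small Python model of the parent links reported by gdb above. It is only a sketch of the pointer relationships, not expat's actual code: the parents dict and find_root() are hypothetical names, and the walk raises an error where an unguarded walk would spin forever.

```python
def find_root(parents, start):
    """Follow parent links from `start` until a parser with no parent.

    Raises RuntimeError if the chain revisits a parser, which is the
    situation where an unguarded walk would never terminate.
    """
    seen = set()
    node = start
    while parents.get(node) is not None:
        if node in seen:
            raise RuntimeError("parent cycle at %#x" % node)
        seen.add(node)
        node = parents[node]
    return node


# Parent links in the order gdb reported them (child -> parent).
parents = {0xadc4d0: None}       # root parser
parents[0xadecb0] = 0xadc4d0     # 1st call: new 0xadecb0, parent 0xadc4d0
parents[0xae5e00] = 0xadecb0     # 2nd call: new 0xae5e00, parent 0xadecb0
root_before = find_root(parents, 0xae5e00)   # still reaches the root

parents[0xadecb0] = 0xae5e00     # 3rd call: new 0xadecb0, parent 0xae5e00
try:
    find_root(parents, 0xadecb0)
    cycle_found = False
except RuntimeError:
    cycle_found = True
```

Note that the address 0xadecb0 shows up as the new parser in both the first and the third call, and the third call overwrites its link to the root.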

Now, this looks like an expat issue and not a pyexpat one, but given that pyexpat provides its own allocator, and that the parser addresses are returned by it, I will start by opening this issue here. If it can be proven that it's an expat issue, I'll take it to their issue tracker.

-----
[1] https://github.com/martinblech/xmltodict/issues/226
History
Date User Action Args
2019-10-15 17:12:40 StyXman create