Issue12829
Created on 2011-08-23 22:45 by dhgutteridge, last changed 2022-04-11 14:57 by admin. This issue is now closed.
Files | ||
---|---|---|
File name | Uploaded | Description |
pyexpat_crash_isolation_osx.py | dhgutteridge, 2011-08-23 22:44 | |
pyexpat_crash_isolation_nb.py | dhgutteridge, 2011-08-30 04:58 | |
Messages (12) | |||
---|---|---|---|
msg142868 - (view) | Author: David H. Gutteridge (dhgutteridge) | Date: 2011-08-23 22:44 | |
I stumbled across this bug because of a misunderstanding I had about how the pyexpat module works. I'd inferred that a given instance could be reused to parse multiple files, which is apparently not supported. (There's already a documentation bug open on this, see http://bugs.python.org/issue6676 -- a few other people made the same mistaken assumption as me.) I found that given the right input, a segmentation fault occurs when one attempts to reuse the parser instance on more than one file. The sample test case I've attached derives from what I'm using pyexpat for, which involves the parsing of Microsoft Office Open XML Excel files. I found that the specific content in the initial file can influence whether the submission of subsequent files triggers a segmentation fault. I'm reporting this against Python 2.7.2 on Mac OS X 10.6.8; it also occurs with Python 2.6.1 that's bundled with the OS. I can also duplicate it on the development branch of NetBSD (my other development platform), specifically 5.99.47/amd64 with Python 2.6.7. |
|||
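A minimal sketch of the supported pattern, assuming Python 3 and using only `xml.parsers.expat.ParserCreate` and `Parse`: each document gets its own parser instance, so no state carries over from one file to the next. The helper name `parse_document` and the two sample byte strings are hypothetical stand-ins, not taken from the attached test cases.

```python
import xml.parsers.expat


def parse_document(data):
    """Parse one XML document with a parser created just for it.

    Returns the element names in the order their start tags were seen.
    """
    names = []

    def start_element(name, attrs):
        names.append(name)

    # A new parser per document: the instance is never reused.
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start_element
    parser.Parse(data, True)  # True marks this as the final chunk
    return names


# Hypothetical sample inputs standing in for separate files.
documents = [b"<workbook><sheet/></workbook>", b"<sst><si/></sst>"]
results = [parse_document(doc) for doc in documents]
```

Creating a parser is cheap compared with parsing, so this costs little next to reusing one instance.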
msg142869 - (view) | Author: David H. Gutteridge (dhgutteridge) | Date: 2011-08-23 23:10 | |
I believe this may be an OS-specific bug somehow, albeit one that affects multiple OSes. I cannot duplicate the crash on NetBSD 5.1_STABLE/i386 with Python 2.6.7, or on OpenSuSE 11.3 with Python 2.6.5. (It's interesting that it doesn't crash on the older branch of NetBSD, but it does on the newer, both with the same version of Python and underlying Expat...) |
|||
msg142870 - (view) | Author: David H. Gutteridge (dhgutteridge) | Date: 2011-08-23 23:15 | |
Here's the (non-debug) trace under OS X:
Process: Python [4604]
Path: /Library/Frameworks/Python.framework/Versions/2.7/Resources/Python.app/Contents/MacOS/Python
Identifier: Python
Version: ??? (???)
Code Type: X86-64 (Native)
Parent Process: bash [1461]
Date/Time: 2011-08-23 19:14:48.148 -0400
OS Version: Mac OS X 10.6.8 (10K549)
Report Version: 6
Interval Since Last Report: 366485 sec
Crashes Since Last Report: 29
Per-App Crashes Since Last Report: 29
Anonymous UUID: 5504B203-8C24-427A-B74C-EDBD3EF8DB51
Exception Type: EXC_BAD_ACCESS (SIGSEGV)
Exception Codes: KERN_INVALID_ADDRESS at 0x0000000100569000
Crashed Thread: 0 Dispatch queue: com.apple.main-thread
Thread 0 Crashed: Dispatch queue: com.apple.main-thread
0 pyexpat.so 0x000000010050e439 normal_updatePosition + 57
1 pyexpat.so 0x00000001004f9314 PyExpat_XML_GetCurrentLineNumber + 84
2 pyexpat.so 0x00000001004f374e set_error + 62
3 pyexpat.so 0x00000001004f4588 xmlparse_Parse + 200
4 org.python.python 0x00000001000c102d PyEval_EvalFrameEx + 22397
5 org.python.python 0x00000001000c2d29 PyEval_EvalCodeEx + 2137
6 org.python.python 0x00000001000c0b6a PyEval_EvalFrameEx + 21178
7 org.python.python 0x00000001000c2d29 PyEval_EvalCodeEx + 2137
8 org.python.python 0x00000001000c2e46 PyEval_EvalCode + 54
9 org.python.python 0x00000001000e7b6e PyRun_FileExFlags + 174
10 org.python.python 0x00000001000e7e29 PyRun_SimpleFileExFlags + 489
11 org.python.python 0x00000001000fe77c Py_Main + 2940
12 org.python.python 0x0000000100000f14 0x100000000 + 3860
Thread 0 crashed with X86 Thread State (64-bit):
rax: 0x00000000fffffffb rbx: 0x000000010037bbc0 rcx: 0x000000010037bec8 rdx: 0x00000001008cd39f
rdi: 0x00000001005256c0 rsi: 0x0000000100569000 rbp: 0x00007fff5fbfedf0 rsp: 0x00007fff5fbfedf0
r8: 0x000000010050e458 r9: 0x00000001008caa00 r10: 0x0000000000000800 r11: 0x0000000100542dda
r12: 0x0000000000000000 r13: 0x00000001003037e0 r14: 0x0000000000000009 r15: 0x00000001004aca70
rip: 0x000000010050e439 rfl: 0x0000000000010293 cr2: 0x0000000100569000
Binary Images:
0x100000000 - 0x100000fff +org.python.python 2.7.2 (2.7.2) <639E72E4-F205-C034-8E34-E59DE9C46369> /Library/Frameworks/Python.framework/Versions/2.7/Resources/Python.app/Contents/MacOS/Python
0x100003000 - 0x10016cfef +org.python.python 2.7.2, (c) 2004-2011 Python Software Foundation. (2.7.2) <49D18B1A-C92D-E32E-A7C1-086D0B14BD76> /Library/Frameworks/Python.framework/Versions/2.7/Python
0x1002ec000 - 0x1002efff7 +strop.so ??? (???) <F7857283-F427-7CF7-9B0D-7619AA0A82F1> /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/lib-dynload/strop.so
0x1004f0000 - 0x100524fe7 +pyexpat.so ??? (???) <E5FD4237-8D59-8B8E-E229-19601A03F18E> /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/lib-dynload/pyexpat.so
0x7fff5fc00000 - 0x7fff5fc3bdef dyld 132.1 (???) <B536F2F1-9DF1-3B6C-1C2C-9075EA219A06> /usr/lib/dyld
0x7fff8005d000 - 0x7fff801d4fe7 com.apple.CoreFoundation 6.6.5 (550.43) <31A1C118-AD96-0A11-8BDF-BD55B9940EDC> /System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation
0x7fff822f0000 - 0x7fff824b1fef libSystem.B.dylib 125.2.11 (compatibility 1.0.0) <9AB4F1D1-89DC-0E8A-DC8E-A4FE4D69DB69> /usr/lib/libSystem.B.dylib
0x7fff82781000 - 0x7fff82792ff7 libz.1.dylib 1.2.3 (compatibility 1.0.0) <FB5EE53A-0534-0FFA-B2ED-486609433717> /usr/lib/libz.1.dylib
0x7fff8376d000 - 0x7fff837eafef libstdc++.6.dylib 7.9.0 (compatibility 7.0.0) <35ECA411-2C08-FD7D-11B1-1B7A04921A5C> /usr/lib/libstdc++.6.dylib
0x7fff85577000 - 0x7fff8557bff7 libmathCommon.A.dylib 315.0.0 (compatibility 1.0.0) <95718673-FEEE-B6ED-B127-BCDBDB60D4E5> /usr/lib/system/libmathCommon.A.dylib
0x7fff86259000 - 0x7fff86417fff libicucore.A.dylib 40.0.0 (compatibility 1.0.0) <4274FC73-A257-3A56-4293-5968F3428854> /usr/lib/libicucore.A.dylib
0x7fff86526000 - 0x7fff865dcff7 libobjc.A.dylib 227.0.0 (compatibility 1.0.0) <03140531-3B2D-1EBA-DA7F-E12CC8F63969> /usr/lib/libobjc.A.dylib
0x7fff8739a000 - 0x7fff873e6fff libauto.dylib ??? (???) <F7221B46-DC4F-3153-CE61-7F52C8C293CF> /usr/lib/libauto.dylib
0x7fffffe00000 - 0x7fffffe01fff libSystem.B.dylib ??? (???) <9AB4F1D1-89DC-0E8A-DC8E-A4FE4D69DB69> /usr/lib/libSystem.B.dylib |
|||
msg143115 - (view) | Author: Terry J. Reedy (terry.reedy) * | Date: 2011-08-28 18:27 | |
A note for anyone else: David is actually using the xml.parsers.expat module, which uses the now undocumented pyexpat module, whose direct use is deprecated. David: Have you tested with 3.1 or 3.2? (I am about to try on Windows ;-). |
|||
msg143116 - (view) | Author: Terry J. Reedy (terry.reedy) * | Date: 2011-08-28 18:32 | |
Running with IDLE on Windows, I get no crash or uncaught exception but got these printed lines:
An error occurred during XML parsing. Error ID: 9. Error message: junk after document element Line number: 1
An error occurred during XML parsing. Error ID: 9. Error message: junk after document element Line number: 1
An error occurred during XML parsing. Error ID: 9. Error message: junk after document element
An error occurred during XML parsing. Error ID: 9. Error message: junk after document element Line number: 1
An error occurred during XML parsing. Error ID: 9. Error message: junk after document element Line number: 1
An error occurred during XML parsing. Error ID: 9. Error message: junk after document element
Is this the correct, expected output? |
|||
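For reference, the error ID and message in the output above correspond to fields that `xml.parsers.expat.ExpatError` exposes. The following is a sketch only, assuming Python 3; the function name `describe_parse_error` and the malformed input are hypothetical and are not the attached test script. It feeds a fresh parser a document with a second root element and reads back `code`, `lineno`, and the text from `ErrorString`.

```python
import xml.parsers.expat


def describe_parse_error(data):
    """Return (code, message, line) for malformed input, or None if it parses."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(data, True)
    except xml.parsers.expat.ExpatError as err:
        # err.code is Expat's numeric error; ErrorString maps it to text.
        return err.code, xml.parsers.expat.ErrorString(err.code), err.lineno
    return None


# A second top-level element is "junk after document element" (error 9).
report = describe_parse_error(b"<a/><b/>")
```

Here the error comes from genuinely malformed input on a fresh parser; in the reported crash, the same error path (`set_error` calling `XML_GetCurrentLineNumber`) was reached on a reused parser.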
msg143197 - (view) | Author: David H. Gutteridge (dhgutteridge) | Date: 2011-08-30 04:22 | |
Terry: I wasn't aware xml.parsers.expat is deprecated, though it clearly says so in the documentation, I now see... (I'd been using it because it features prominently in various examples in Python books, and it's lightweight.) I haven't tested with the 3.x series, because I rely on the 2.6 branch as a dependency for a variety of software on NetBSD, but having said that, I can test it on Mac OS X. Your test output is the correct, expected results, yes. |
|||
msg143198 - (view) | Author: David H. Gutteridge (dhgutteridge) | Date: 2011-08-30 04:37 | |
Confirming that Python 3.2.1 crashes the same way on Mac OS X 10.6.8:
Process: Python [9594]
Path: /Library/Frameworks/Python.framework/Versions/3.2/Resources/Python.app/Contents/MacOS/Python
Identifier: Python
Version: ??? (???)
Code Type: X86-64 (Native)
Parent Process: bash [9570]
Date/Time: 2011-08-30 00:35:53.863 -0400
OS Version: Mac OS X 10.6.8 (10K549)
Report Version: 6
Interval Since Last Report: 292720 sec
Crashes Since Last Report: 2
Per-App Crashes Since Last Report: 2
Anonymous UUID: 5504B203-8C24-427A-B74C-EDBD3EF8DB51
Exception Type: EXC_BAD_ACCESS (SIGSEGV)
Exception Codes: KERN_INVALID_ADDRESS at 0x00000001006fb000
Crashed Thread: 0 Dispatch queue: com.apple.main-thread
Thread 0 Crashed: Dispatch queue: com.apple.main-thread
0 pyexpat.so 0x00000001006a03e9 normal_updatePosition + 57
1 pyexpat.so 0x000000010068b2c4 PyExpat_XML_GetCurrentLineNumber + 84
2 pyexpat.so 0x000000010068673e set_error + 62
3 pyexpat.so 0x00000001006874e8 xmlparse_Parse + 200
4 org.python.python 0x00000001000b39b2 PyEval_EvalFrameEx + 30530
5 org.python.python 0x00000001000b2a4d PyEval_EvalFrameEx + 26589
6 org.python.python 0x00000001000b431a PyEval_EvalCodeEx + 1770
7 org.python.python 0x00000001000b462f PyEval_EvalCode + 63
8 org.python.python 0x00000001000db82b PyRun_FileExFlags + 187
9 org.python.python 0x00000001000dbaf9 PyRun_SimpleFileExFlags + 521
10 org.python.python 0x00000001000f0a03 Py_Main + 3059
11 org.python.python 0x0000000100000e5f 0x100000000 + 3679
12 org.python.python 0x0000000100000d04 0x100000000 + 3332
Thread 0 crashed with X86 Thread State (64-bit):
rax: 0x00000000fffffffb rbx: 0x00000001003a9b40 rcx: 0x00000001003a9e48 rdx: 0x000000010093b59f
rdi: 0x00000001006b76e0 rsi: 0x00000001006fb000 rbp: 0x00007fff5fbfed60 rsp: 0x00007fff5fbfed60
r8: 0x00000001006a0408 r9: 0x00000001008cb400 r10: 0x0000000000000800 r11: 0x00000001006d4dda
r12: 0x0000000000000000 r13: 0x00000001005aa5f0 r14: 0x0000000000000009 r15: 0x00000001002b6810
rip: 0x00000001006a03e9 rfl: 0x0000000000010293 cr2: 0x00000001006fb000
Binary Images:
0x100000000 - 0x100000ff7 +org.python.python 3.2.1 (3.2.1) <B2AFB510-C20A-61C8-C375-448C252C66A8> /Library/Frameworks/Python.framework/Versions/3.2/Resources/Python.app/Contents/MacOS/Python
0x100003000 - 0x100182ff7 +org.python.python 3.2.1, (c) 2004-2011 Python Software Foundation. (3.2.1) <9A9D8FC9-0EA2-8B57-D918-373F60ECF77A> /Library/Frameworks/Python.framework/Versions/3.2/Python
0x1002fc000 - 0x1002fcfff +_bisect.so ??? (???) <25A7A434-1970-9B41-5BFD-31B6F7AD6ECF> /Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/lib-dynload/_bisect.so
0x1005b0000 - 0x1005b1ff7 +_heapq.so ??? (???) <3E54D664-5279-8504-CA26-E23A15CF152D> /Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/lib-dynload/_heapq.so
0x100682000 - 0x1006b6fef +pyexpat.so ??? (???) <F5A9710C-3B05-3BA8-66E1-5D34290441CA> /Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/lib-dynload/pyexpat.so
0x7fff5fc00000 - 0x7fff5fc3bdef dyld 132.1 (???) <B536F2F1-9DF1-3B6C-1C2C-9075EA219A06> /usr/lib/dyld
0x7fff8005d000 - 0x7fff801d4fe7 com.apple.CoreFoundation 6.6.5 (550.43) <31A1C118-AD96-0A11-8BDF-BD55B9940EDC> /System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation
0x7fff822f0000 - 0x7fff824b1fef libSystem.B.dylib 125.2.11 (compatibility 1.0.0) <9AB4F1D1-89DC-0E8A-DC8E-A4FE4D69DB69> /usr/lib/libSystem.B.dylib
0x7fff82781000 - 0x7fff82792ff7 libz.1.dylib 1.2.3 (compatibility 1.0.0) <FB5EE53A-0534-0FFA-B2ED-486609433717> /usr/lib/libz.1.dylib
0x7fff8376d000 - 0x7fff837eafef libstdc++.6.dylib 7.9.0 (compatibility 7.0.0) <35ECA411-2C08-FD7D-11B1-1B7A04921A5C> /usr/lib/libstdc++.6.dylib
0x7fff85577000 - 0x7fff8557bff7 libmathCommon.A.dylib 315.0.0 (compatibility 1.0.0) <95718673-FEEE-B6ED-B127-BCDBDB60D4E5> /usr/lib/system/libmathCommon.A.dylib
0x7fff86259000 - 0x7fff86417fff libicucore.A.dylib 40.0.0 (compatibility 1.0.0) <4274FC73-A257-3A56-4293-5968F3428854> /usr/lib/libicucore.A.dylib
0x7fff86526000 - 0x7fff865dcff7 libobjc.A.dylib 227.0.0 (compatibility 1.0.0) <03140531-3B2D-1EBA-DA7F-E12CC8F63969> /usr/lib/libobjc.A.dylib
0x7fff8739a000 - 0x7fff873e6fff libauto.dylib ??? (???) <F7221B46-DC4F-3153-CE61-7F52C8C293CF> /usr/lib/libauto.dylib
0x7fffffe00000 - 0x7fffffe01fff libSystem.B.dylib ??? (???) <9AB4F1D1-89DC-0E8A-DC8E-A4FE4D69DB69> /usr/lib/libSystem.B.dylib |
|||
msg143200 - (view) | Author: David H. Gutteridge (dhgutteridge) | Date: 2011-08-30 04:58 | |
Further details:
- The original test case I'd submitted crashed on the development branch of NetBSD as well as Mac OS X Snow Leopard, but not the most recent stable branch of NetBSD. I've found a separate test case that crashes on both branches of NetBSD, but not OS X... This is quite possibly a separate bug, but the means of triggering it is directly related, so I'm including it here.
- I also built Python 2.7.2 under Solaris to see if either test case resulted in a crash there, and they do not, so it seems this is BSDish somehow (or else, the Mac OS X and NetBSD crashes are two separate bugs).
- With NetBSD, I also created tests in C that use the Expat library directly, submitting the very same test data, and they do not crash, they return the expected results, so it appears there's definitely something happening in Python somewhere that's causing this.
This is the (non-debug) crash trace from the separate NetBSD test. (I will look at building a debug version of Python when I get a chance...) I'm running Python 2.6.7 on the NetBSD machines.
#0 0xbb93ff64 in XML_ParserCreate () from /usr/X11R7/lib/libexpat.so.1
#1 0xbb9348a3 in XML_GetCurrentLineNumber () from /usr/X11R7/lib/libexpat.so.1
#2 0xbb956743 in set_error () from /usr/pkg/lib/python2.6/site-packages/pyexpat.so
#3 0xbb956d21 in xmlparse_Parse () from /usr/pkg/lib/python2.6/site-packages/pyexpat.so
#4 0xbbb048b0 in PyCFunction_Call () from /usr/pkg/lib/libpython2.6.so.1.0
#5 0xbbb5a3d7 in PyEval_EvalFrameEx () from /usr/pkg/lib/libpython2.6.so.1.0
#6 0xbbb5add8 in PyEval_EvalCodeEx () from /usr/pkg/lib/libpython2.6.so.1.0
#7 0xbbb5914e in PyEval_EvalFrameEx () from /usr/pkg/lib/libpython2.6.so.1.0
#8 0xbbb5add8 in PyEval_EvalCodeEx () from /usr/pkg/lib/libpython2.6.so.1.0
#9 0xbbb5ae22 in PyEval_EvalCode () from /usr/pkg/lib/libpython2.6.so.1.0
#10 0xbbb72f12 in run_mod () from /usr/pkg/lib/libpython2.6.so.1.0
#11 0xbbb72fb5 in PyRun_FileExFlags () from /usr/pkg/lib/libpython2.6.so.1.0
#12 0xbbb745e4 in PyRun_SimpleFileExFlags () from /usr/pkg/lib/libpython2.6.so.1.0
#13 0xbbb74ce5 in PyRun_AnyFileExFlags () from /usr/pkg/lib/libpython2.6.so.1.0
#14 0xbbb80322 in Py_Main () from /usr/pkg/lib/libpython2.6.so.1.0
#15 0x080487e9 in main () |
|||
msg143223 - (view) | Author: Terry J. Reedy (terry.reedy) * | Date: 2011-08-30 16:35 | |
My understanding is that what you did: import xml.parsers.expat is now the proper way to use expat. After some searching, it seems the sentence about direct use of pyexpat being deprecated refers to http://sourceforge.net/tracker/?func=detail&aid=2745230&group_id=26590&atid=387667 "The location and name of the PyExpat module have moved in Python v2.6.1 from xml.dom.ext.reader.PyExpat to xml.parsers.expat" This is puzzling because xml.parsers.expat dates back to 2.0 while I see no doc for xml.dom.ext... . The deprecation notice should be deleted from the 3.x docs. |
|||
msg143224 - (view) | Author: Terry J. Reedy (terry.reedy) * | Date: 2011-08-30 16:39 | |
This seems to be a Mac-only issue. Barry, does this seem to be a security issue to you, or should we delete 2.6 from the versions? |
|||
msg143241 - (view) | Author: Ned Deily (ned.deily) * | Date: 2011-08-30 23:20 | |
This is the same issue as highlighted by Issue6676. The root cause is attempting to reuse a parser instance and that is known to not work with the version of expat included with Python. Whether the test program crashes with a memory access violation or just uses uninitialized memory depends on the version of malloc in use and what protections the linker and os use. Even on Mac OS X, the test program does not segfault on earlier versions of OS X (like 10.5). And on 10.6 and 10.7 if you build python with pymalloc it usually does not segfault. But that doesn't mean it is working properly. At a minimum, the single use restriction should be documented; if anyone is interested, they could look into adding any more recent fixes to expat and plugging remaining reuse holes. |
|||
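One way to make the single-use restriction explicit in calling code, offered as a sketch under assumptions and not as anything pyexpat itself provides: wrap the parser in a small class that refuses a second document before the call reaches Expat. The class name `SingleUseParser` and its `parse` method are hypothetical.

```python
import xml.parsers.expat


class SingleUseParser:
    """Hypothetical wrapper that rejects any attempt to parse a second document."""

    def __init__(self):
        self._parser = xml.parsers.expat.ParserCreate()
        self._used = False

    def parse(self, data):
        # Fail in Python before the underlying parser can be reused.
        if self._used:
            raise RuntimeError("parser already used; create a new instance")
        self._used = True
        self._parser.Parse(data, True)


guard = SingleUseParser()
guard.parse(b"<a/>")
try:
    guard.parse(b"<b/>")
    reused = True
except RuntimeError:
    reused = False
```

The guard only turns accidental reuse into an ordinary Python exception; it does not change how the underlying parser behaves.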
msg143296 - (view) | Author: David H. Gutteridge (dhgutteridge) | Date: 2011-09-01 05:01 | |
Okay. I'd seen the earlier issue, but had submitted this separately because I wasn't sure if it was a security-related bug, whereas the older issue didn't mention anything of the sort. (In retrospect, I could've just added to it...) |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:57:21 | admin | set | github: 57038 |
2011-09-01 05:01:49 | dhgutteridge | set | messages: + msg143296 |
2011-08-30 23:20:26 | ned.deily | set | status: open -> closed resolution: duplicate superseder: expat parser throws Memory Error when parsing multiple files messages: + msg143241 |
2011-08-30 16:39:52 | terry.reedy | set | nosy: + barry, ronaldoussoren, ned.deily messages: + msg143224 assignee: ronaldoussoren components: + macOS |
2011-08-30 16:35:15 | terry.reedy | set | messages: + msg143223 |
2011-08-30 04:58:45 | dhgutteridge | set | files: + pyexpat_crash_isolation_nb.py messages: + msg143200 versions: + Python 3.1, Python 3.2 |
2011-08-30 04:37:29 | dhgutteridge | set | messages: + msg143198 |
2011-08-30 04:22:38 | dhgutteridge | set | messages: + msg143197 |
2011-08-28 18:32:32 | terry.reedy | set | messages: + msg143116 |
2011-08-28 18:27:13 | terry.reedy | set | nosy: + terry.reedy messages: + msg143115 |
2011-08-23 23:15:54 | dhgutteridge | set | messages: + msg142870 |
2011-08-23 23:10:55 | dhgutteridge | set | messages: + msg142869 |
2011-08-23 22:45:03 | dhgutteridge | create |