classification
Title: Multithreaded XML parsing causes segfault
Type: crash Stage: needs patch
Components: XML Versions: Python 3.11, Python 3.10, Python 3.9
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: alex, amaury.forgeotdarc, christian.heimes, iritkatriel, mrDoctorWho0.., pitrou
Priority: normal Keywords:

Created on 2013-05-13 13:52 by mrDoctorWho0.., last changed 2022-04-11 14:57 by admin.

Files
File name Uploaded Description
pyexpat_crash_multithread.py mrDoctorWho0.., 2013-05-13 13:52 pyexpat test
Messages (8)
msg189131 - Author: mrDoctorWho0 . (mrDoctorWho0..) Date: 2013-05-13 13:52
Linux i386, Python 2.7.4. Multithreaded XML parsing via pyexpat causes a segmentation fault.
msg189159 - Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2013-05-13 18:28
Expat is not thread-safe at the object level: a single Parser cannot be used from multiple threads.
Pyexpat could add locks to Parser objects.
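
Until pyexpat does this itself, the same idea can be approximated in user code. The following is a minimal sketch, not the proposed pyexpat patch: `LockedParser` and `parse_with_shared_parser` are hypothetical names, and the example assumes each chunk fed by a thread is a complete element so that serialized calls still form a well-formed document.

```python
import threading
import xml.parsers.expat


class LockedParser:
    """Hypothetical wrapper that serializes all access to one expat parser."""

    def __init__(self):
        self._lock = threading.Lock()
        self._parser = xml.parsers.expat.ParserCreate()
        self._parser.StartElementHandler = self._on_start
        self.start_count = 0

    def _on_start(self, name, attrs):
        # Runs inside Parse(), so the lock is already held by this thread.
        self.start_count += 1

    def feed(self, data, final=False):
        # Only one thread at a time may be inside Parse() for this parser.
        with self._lock:
            self._parser.Parse(data, final)


def parse_with_shared_parser(n_threads=4, items_per_thread=50):
    shared = LockedParser()
    shared.feed("<root>")

    def worker():
        for _ in range(items_per_thread):
            shared.feed("<item/>")  # a complete element per call

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    shared.feed("</root>", final=True)
    return shared.start_count
```

The lock is held for the whole `Parse()` call, including any handler callbacks, so a second thread cannot enter the same parser while the first is still inside it.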
msg189161 - Author: Alex Gaynor (alex) * (Python committer) Date: 2013-05-13 18:32
It could also track tids and raise an error if you attempt to use it from multiple threads.
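
A rough user-level illustration of the thread-id check, assuming the parser is bound to whichever thread created it. `ThreadBoundParser` and `try_from_other_thread` are hypothetical names; pyexpat itself performs no such check.

```python
import threading
import xml.parsers.expat


class ThreadBoundParser:
    """Hypothetical wrapper that rejects use from any thread but its creator."""

    def __init__(self):
        self._owner = threading.get_ident()  # id of the creating thread
        self._parser = xml.parsers.expat.ParserCreate()

    def feed(self, data, final=False):
        if threading.get_ident() != self._owner:
            raise RuntimeError("parser used from a thread other than its creator")
        return self._parser.Parse(data, final)


def try_from_other_thread(parser):
    """Call feed() from a new thread and collect any RuntimeError raised."""
    errors = []

    def worker():
        try:
            parser.feed("<a/>", final=True)
        except RuntimeError as exc:
            errors.append(exc)

    t = threading.Thread(target=worker)
    t.start()
    t.join()
    return errors
```

A check this strict turns the crash into an exception, but it also rejects callers that share a parser across threads under their own lock.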
msg189162 - Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2013-05-13 18:35
But this would break working code that already uses locks correctly (or some kind of pool of cached parsers).
msg189225 - Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2013-05-14 14:58
In my opinion it's fine to document Python's XML parser as not thread-safe and leave locking to the user. Any fancy locking or tracking is going to make it slower for users. And it takes a lot of effort to implement the feature, too. lxml offers a faster XML parser with multi-threading support.
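
If synchronization is left to the user, the simplest pattern is often to avoid sharing altogether and create one parser per document inside the thread that parses it. A minimal sketch, where `count_elements` and `count_all` are hypothetical helper names:

```python
from concurrent.futures import ThreadPoolExecutor
import xml.parsers.expat


def count_elements(document):
    """Parse one document with a parser private to the calling thread."""
    parser = xml.parsers.expat.ParserCreate()
    count = 0

    def on_start(name, attrs):
        nonlocal count
        count += 1

    parser.StartElementHandler = on_start
    parser.Parse(document, True)  # True marks this as the final chunk
    return count


def count_all(documents, max_workers=4):
    # Each task builds its own parser, so no parser object crosses threads.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(count_elements, documents))
```

Because no parser is ever visible to more than one thread, this needs no lock at all, at the cost of creating a new parser for every document.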
msg189232 - Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2013-05-14 16:22
In my opinion it's not fine to let Python crash.
The implementation could be similar to the one in bufferedio.c; it's quite lightweight.
msg189533 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-05-18 16:20
I agree with Amaury: multi-threaded parsing should definitely not crash. Adding a lock should be quite easy. I wonder what the effect on performance would be if there is a lot of back-and-forth between expat and Python.
msg401257 - Author: Irit Katriel (iritkatriel) * (Python committer) Date: 2021-09-07 12:55
I've reproduced the segfault on 3.11.
History
Date User Action Args
2022-04-11 14:57:45 admin set github: 62170
2021-09-07 12:55:23 iritkatriel set nosy: + iritkatriel; messages: + msg401257; versions: + Python 3.9, Python 3.10, Python 3.11, - Python 2.7, Python 3.3, Python 3.4
2013-08-15 20:26:28 pitrou set type: crash
2013-05-18 16:20:47 pitrou set nosy: + pitrou; messages: + msg189533
2013-05-14 16:22:17 amaury.forgeotdarc set messages: + msg189232
2013-05-14 14:58:13 christian.heimes set nosy: + christian.heimes; messages: + msg189225
2013-05-13 18:45:50 pitrou set versions: + Python 3.3, Python 3.4
2013-05-13 18:35:58 amaury.forgeotdarc set messages: + msg189162
2013-05-13 18:32:54 alex set nosy: + alex; messages: + msg189161
2013-05-13 18:28:02 amaury.forgeotdarc set nosy: + amaury.forgeotdarc; messages: + msg189159; stage: needs patch
2013-05-13 13:52:38 mrDoctorWho0.. create