Issue43802
Created on 2021-04-10 16:17 by jacobtylerwalls, last changed 2022-04-11 14:59 by admin. This issue is now closed.
Messages (4) | |||
---|---|---|---|
msg390721 - (view) | Author: Jacob Walls (jacobtylerwalls) * | Date: 2021-04-10 16:17 | |
macOS 10.13.6
Python 3.9.2

I can consistently reproduce a seg fault while using multiprocessing.JoinableQueue in Python 3.9.2.

My use case is the sheet music processing library music21. My fork includes a folder of 209 files I use to reproduce, running 3 cores, shown in the script below.

(This is a subset of the over 1,000 files found here: https://github.com/MarkGotham/When-in-Rome/tree/master/Corpus/OpenScore-LiederCorpus Using this set of 1,000 files reproduces nearly every time; using the 209 files I committed to my fork was enough to reproduce about 75% of the time.)

I'm a contributor to music21, so if this is an overwhelming amount of information to debug, I will gladly pare this down as much as I can or create some methods to access the multiprocessing functionality more directly. Many thanks for any assistance.

pip3 install git+https://github.com/jacobtylerwalls/music21.git@bpo-investigation

from music21 import corpus
# suggest using a unique name each attempt
lc = corpus.corpora.LocalCorpus(name='bpo-investigation')
# point to the directory of files I committed to my fork for this investigation
lc.addPath('/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/music21/bpo-files')
# parse the files using multiprocessing
# calls music21.metadata.bundles.MetadataBundle.addFromPaths()
# which calls music21.metadata.caching.process_parallel()
lc.save()

# CTRL-C to recover from seg fault
# then, wipe out the entries in .music21rc so that you can cleanly reproduce again
from music21 import environment
us = environment.UserSettings()
us['localCorporaSettings'] = {}
quit()

Process: Python [31677]
Path: /Library/Frameworks/Python.framework/Versions/3.9/Resources/Python.app/Contents/MacOS/Python
Identifier: Python
Version: 3.9.2 (3.9.2)
Code Type: X86-64 (Native)
Parent Process: Python [31674]
Responsible: Python [31677]
User ID: 501

Date/Time: 2021-04-10 11:21:19.294 -0400
OS Version: Mac OS X 10.13.6 (17G14042)
Report Version: 12
Anonymous UUID: E7B0208A-19D6-ABDF-B3EA-3910A56B3E72

Sleep/Wake UUID: C4B83F57-6AD1-469E-82AE-88214FAA6283

Time Awake Since Boot: 140000 seconds
Time Since Wake: 5900 seconds

System Integrity Protection: enabled

Crashed Thread: 0 Dispatch queue: com.apple.main-thread

Exception Type: EXC_BAD_ACCESS (SIGSEGV)
Exception Codes: KERN_INVALID_ADDRESS at 0x0000000100b3acd8
Exception Note: EXC_CORPSE_NOTIFY

Termination Signal: Segmentation fault: 11
Termination Reason: Namespace SIGNAL, Code 0xb
Terminating Process: exc handler [0]

VM Regions Near 0x100b3acd8:
--> __TEXT 00000001068bb000-00000001068bc000 [ 4K] r-x/rwx SM=COW [/Library/Frameworks/Python.framework/Versions/3.9/Resources/Python.app/Contents/MacOS/Python]

Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0 org.python.python 0x0000000106944072 PyObject_RichCompare + 258
1 org.python.python 0x0000000106943e9b PyObject_RichCompareBool + 43
2 org.python.python 0x00000001069ce3c0 min_max + 624
3 org.python.python 0x0000000106940bab cfunction_call + 59
4 org.python.python 0x0000000106901cad _PyObject_MakeTpCall + 365
5 org.python.python 0x00000001069d865c call_function + 876
6 org.python.python 0x00000001069d5b8b _PyEval_EvalFrameDefault + 25371
7 org.python.python 0x0000000106902478 function_code_fastcall + 104
8 org.python.python 0x00000001069d85cc call_function + 732
9 org.python.python 0x00000001069d5ad2 _PyEval_EvalFrameDefault + 25186
10 org.python.python 0x00000001069d92c3 _PyEval_EvalCode + 2611
11 org.python.python 0x0000000106902401 _PyFunction_Vectorcall + 289
12 org.python.python 0x00000001069d85cc call_function + 732
13 org.python.python 0x00000001069d5ad2 _PyEval_EvalFrameDefault + 25186
14 org.python.python 0x00000001069d92c3 _PyEval_EvalCode + 2611
15 org.python.python 0x0000000106902401 _PyFunction_Vectorcall + 289
16 org.python.python 0x0000000106901b05 _PyObject_FastCallDictTstate + 293
17 org.python.python 0x00000001069026e8 _PyObject_Call_Prepend + 152
18 org.python.python 0x000000010695be85 slot_tp_init + 165
19 org.python.python 0x00000001069573d9 type_call + 345
20 org.python.python 0x0000000106901cad _PyObject_MakeTpCall + 365
21 org.python.python 0x00000001069d865c call_function + 876
22 org.python.python 0x00000001069d5af3 _PyEval_EvalFrameDefault + 25219
23 org.python.python 0x0000000106902478 function_code_fastcall + 104
24 org.python.python 0x00000001069044ba method_vectorcall + 202
25 org.python.python 0x00000001069d85cc call_function + 732
26 org.python.python 0x00000001069d5af3 _PyEval_EvalFrameDefault + 25219
27 org.python.python 0x0000000106902478 function_code_fastcall + 104
28 org.python.python 0x00000001069d85cc call_function + 732
29 org.python.python 0x00000001069d5ad2 _PyEval_EvalFrameDefault + 25186
30 org.python.python 0x0000000106902478 function_code_fastcall + 104
31 org.python.python 0x00000001069d85cc call_function + 732
32 org.python.python 0x00000001069d5ad2 _PyEval_EvalFrameDefault + 25186
33 org.python.python 0x0000000106902478 function_code_fastcall + 104
34 org.python.python 0x00000001069d85cc call_function + 732
35 org.python.python 0x00000001069d5ad2 _PyEval_EvalFrameDefault + 25186
36 org.python.python 0x00000001069d92c3 _PyEval_EvalCode + 2611
37 org.python.python 0x0000000106902401 _PyFunction_Vectorcall + 289
38 org.python.python 0x00000001069d85cc call_function + 732
39 org.python.python 0x00000001069d5ad2 _PyEval_EvalFrameDefault + 25186
40 org.python.python 0x0000000106902478 function_code_fastcall + 104
41 org.python.python 0x00000001069d85cc call_function + 732
42 org.python.python 0x00000001069d5b8b _PyEval_EvalFrameDefault + 25371
43 org.python.python 0x00000001069d92c3 _PyEval_EvalCode + 2611
44 org.python.python 0x0000000106902401 _PyFunction_Vectorcall + 289
45 org.python.python 0x00000001069d85cc call_function + 732
46 org.python.python 0x00000001069d5c21 _PyEval_EvalFrameDefault + 25521
47 org.python.python 0x00000001069d92c3 _PyEval_EvalCode + 2611
48 org.python.python 0x00000001069cf74b PyEval_EvalCode + 139
49 org.python.python 0x0000000106a21fc4 PyRun_StringFlags + 356
50 org.python.python 0x0000000106a21e15 PyRun_SimpleStringFlags + 69
51 org.python.python 0x0000000106a3e367 Py_RunMain + 1047
52 org.python.python 0x0000000106a3eaef pymain_main + 223
53 org.python.python 0x0000000106a3eceb Py_BytesMain + 43
54 libdyld.dylib 0x00007fff5a148015 start + 1 |
|||
msg390753 - (view) | Author: Ned Deily (ned.deily) * | Date: 2021-04-10 22:45 | |
Thanks for providing a detailed and relatively simple-to-run test case for such a complicated failure. Not totally surprisingly for what appears likely to be a race condition, I have been unable to reproduce it under several macOS environments, including in a 10.13.6 VM with multiple cores and under 11.2.3. I'm not sure if this would be expected to affect the results, but I did receive multiple "WARNING: Could not import wedge: Error in getting DynamicWedges" messages when running the test case. From a quick examination of the installed setup, it appears that there is no attempt to use multiprocessing's "fork" method, which is known to be problematic on macOS, so that's a plus. And there don't appear to be any extension modules, so the test case is pure Python, eliminating other likely suspects. One question that does come to mind is exactly which version of Python 3.9.2 you are testing with; can you provide the results of: /path/to/python3.9 -c 'import sys;print(sys.version)' ? Searching bugs.python.org, I see a few open issues with segfaults in PyObject_RichCompare but nothing that leaps out as being obviously similar. If it were possible to reproduce the segfault in other environments, like with 3.9.4 or on newer versions of macOS or on a current Linux platform, that would help to confirm the issue. Even better would be to be able to reproduce the issue while running a current Python 3.9 built with --with-pydebug on; unfortunately, we don't normally provide pre-built debug binaries on python.org. And, of course, running in debug mode could affect the race condition, if that is indeed the issue. |
|||
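The details requested above (the exact interpreter build and the multiprocessing start method in use) can be gathered with a few lines of standard-library Python. This is only a minimal sketch and was not posted in the thread: the helper names `collect_diagnostics` and `enable_crash_tracebacks` are made up for illustration. `faulthandler` is included on the assumption that a Python-level traceback printed at the moment of the segfault would complement the native crash report.

```python
import faulthandler
import multiprocessing
import platform
import sys


def collect_diagnostics():
    """Return the interpreter details that are useful in a crash report."""
    return {
        "version": sys.version,
        "platform": platform.platform(),
        # Name of the start method multiprocessing will use by default
        # ("spawn", "fork" or "forkserver").
        "start_method": multiprocessing.get_start_method(),
    }


def enable_crash_tracebacks():
    """Ask faulthandler to dump Python tracebacks on SIGSEGV and similar.

    Returns False when stderr is missing or has no usable file descriptor.
    """
    try:
        faulthandler.enable()
    except (AttributeError, OSError, RuntimeError, ValueError):
        return False
    return True


if __name__ == "__main__":
    enable_crash_tracebacks()
    for key, value in collect_diagnostics().items():
        print(f"{key}: {value!r}")
```

The same traceback dumping can be switched on without touching the script by starting the interpreter with `-X faulthandler`.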
msg390765 - (view) | Author: Jacob Walls (jacobtylerwalls) * | Date: 2021-04-11 03:38 | |
Thanks for this detailed reply. I reproduced on Python 3.9.4 on the same iMac from my original report running macOS 10.13.6, but with much lower frequency (I wouldn't use the word "consistently" anymore). I tried on a MacBook Pro with worn-out hardware running a newer OS (10.15.4) and could not reproduce the issue there. I also built CPython (Python 3.10.0a7+ (heads/master:ac05f82ad4, Apr 10 2021, 20:16:36) [Clang 10.0.0 (clang-1000.10.44.4)] on darwin) using --with-pydebug and ran the test case a few times on the good-hardware iMac, and observed the file parsing (predictably) slow to a crawl, but no reproduction of the segfault. This leads me to believe that, yes, this is a race condition I'm encountering on fast hardware. Possibly related to issue-25769, since music21 makes heavy use of weakrefs and since music21.metadata.caching.MetadataCachingJob.run() calls gc.collect(). Perhaps I can look into engineering a minimal test case based on that discussion, involving a deliberately expensive __eq__() call. To answer your original question: my first report on 3.9.2 was on this specific version: 3.9.2 (v3.9.2:1a79785e3e, Feb 19 2021, 09:06:10) \n[Clang 6.0 (clang-600.0.57)] |
|||
msg391345 - (view) | Author: Jacob Walls (jacobtylerwalls) * | Date: 2021-04-18 20:45 | |
Unfortunately, at the outset I should have tested this without multiprocessing. I can reproduce without multiprocessing[1], which meant I could more easily pinpoint the failure. There is an expensive O(nm) algorithm[2] in the music21 library that is overflowing. I appreciate your time looking into this. Closing. Regards, Jacob [1] In the provided script, after one call to lc.save(), call lc.rebuildMetadataCache(useMultiprocessing=False). [2] music21.analysis.discrete.Ambitus.getPitchRanges(), and I plan to do something about it. |
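The isolation step in footnote [1] generalizes: give the parallel code path a switch that runs the same jobs serially in the parent process, then check whether the crash survives. Below is a minimal sketch of that pattern, assuming a hypothetical `parse_one` worker and `run_jobs` helper. Neither is music21 API (music21's own switch is the `useMultiprocessing=False` argument mentioned above), and the sketch uses `multiprocessing.Pool` for brevity rather than the `JoinableQueue` from the original report.

```python
import multiprocessing


def parse_one(path):
    """Hypothetical stand-in for the per-file work (not music21 code)."""
    return path, len(path)


def run_jobs(paths, use_multiprocessing=True, processes=3):
    """Run parse_one over paths, in a process pool or serially."""
    if not use_multiprocessing:
        # Serial path: same work, same order, all in the parent process,
        # so a crash here cannot be blamed on the process pool.
        return [parse_one(p) for p in paths]
    with multiprocessing.Pool(processes=processes) as pool:
        return pool.map(parse_one, paths)


if __name__ == "__main__":
    # Placeholder file names, for illustration only.
    paths = ["a.mxl", "bb.mxl", "ccc.mxl"]
    # Start with the serial run; switch the flag to True to compare.
    print(run_jobs(paths, use_multiprocessing=False))
```

If the failure still occurs with `use_multiprocessing=False`, the fault lies in the per-file work rather than in the process pool, and an ordinary traceback or debugger session becomes practical.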
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:59:44 | admin | set | github: 87968 |
2021-04-18 20:45:25 | jacobtylerwalls | set | status: open -> closed resolution: not a bug messages: + msg391345 stage: resolved |
2021-04-11 03:38:25 | jacobtylerwalls | set | messages: + msg390765 |
2021-04-10 22:45:59 | ned.deily | set | nosy: + pitrou, davin messages: + msg390753 |
2021-04-10 16:17:02 | jacobtylerwalls | create |