classification
Title: Crash in Windows with unknown cause
Type: crash Stage: resolved
Components: Documentation Versions: Python 2.7
process
Status: closed Resolution: works for me
Dependencies: Superseder:
Assigned To: docs@python Nosy List: amorilia, brian.curtin, davin, docs@python, pitrou, rlibiez, santoso.wijaya, terry.reedy, vstinner, zach.ware
Priority: normal Keywords:

Created on 2011-09-30 21:21 by rlibiez, last changed 2018-09-03 21:50 by zach.ware. This issue is now closed.

Files
File name Uploaded Description
poolcrash.py amorilia, 2011-10-04 21:31 a script which *might* reproduce the crash (see comments below)
Messages (10)
msg144707 - (view) Author: Roger Libiez (rlibiez) Date: 2011-09-30 21:21
While using the application found at: https://sourceforge.net/tracker/?group_id=199269

Windows generates the following error dump while the application processes a batch of 3D mesh files. I do not know exactly what was being processed at the time, and I have no Python traceback to supply, but the crash can be reproduced reliably using the beta6 version of the application. The developer directed me here after I filed this bug: https://sourceforge.net/tracker/?func=detail&aid=3415495&group_id=199269&atid=968813

The only known method I have to reproduce this is to run a batch process against a large number of mesh files. Windows memory usage is not anywhere close to the 2GB process limit.

Windows 7 Home Premium 64 bit, using 32 bit Python 2.7.2.

Problem signature:
Problem Event Name: APPCRASH
Application Name: python.exe
Application Version: 0.0.0.0
Application Timestamp: 4df4ba7c
Fault Module Name: python27.dll
Fault Module Version: 2.7.2150.1013
Fault Module Timestamp: 4df4ba7c
Exception Code: c0000005
Exception Offset: 0002a33f
OS Version: 6.1.7601.2.1.0.768.3
Locale ID: 1033
Additional Information 1: 0a9e
Additional Information 2: 0a9e372d3b4ad19135b953a78882e789
Additional Information 3: 0a9e
Additional Information 4: 0a9e372d3b4ad19135b953a78882e789
msg144750 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-10-01 22:08
I suppose that the application uses extensions written in C and one of these extensions is buggy. Can you write a script to reproduce the bug without the application? If not, we cannot help you :-(

You may try the faulthandler to get more information:
https://github.com/haypo/faulthandler/wiki
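
For reference, a minimal sketch of how faulthandler might be enabled at the top of a batch script so that a Python traceback is written to a file if the interpreter crashes. It assumes the faulthandler module is importable (it is a separate package on Python 2.7 and part of the standard library from Python 3.3). The helper name `enable_crash_log` and the log path are hypothetical, chosen only for illustration.

```python
import faulthandler
import os
import tempfile


def enable_crash_log(path):
    """Write a Python traceback of every thread to `path` if the interpreter crashes."""
    log_file = open(path, "w")
    # faulthandler keeps writing to this file's descriptor, so the file
    # must stay open for as long as the handler is enabled.
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file


if __name__ == "__main__":
    # Hypothetical log location; any writable path would do.
    crash_log = enable_crash_log(
        os.path.join(tempfile.gettempdir(), "python_crash.log")
    )
    # ... run the batch job here ...
```

Writing to a file rather than stderr is a design choice for batch runs, where console output may be lost once the process dies.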
msg144835 - (view) Author: Amorilia (amorilia) Date: 2011-10-03 18:57
I'm the author of the application. The tool is written in pure Python, and only uses libraries from stdlib.

It would be really nice to have a simple standalone script to reproduce the crash, however I am still trying to reproduce it myself. So far no success.
msg144837 - (view) Author: Brian Curtin (brian.curtin) * (Python committer) Date: 2011-10-03 19:01
I recently created "minidumper" to write Visual Studio "MiniDump" files of interpreter crashes, but it's currently only available on 3.x. If I port it to 2.x, you could add "import minidumper;minidumper.enable()" to the top of your script, then we could probably get somewhere with it.

An additional example script, possibly including sample data to run through it, would be even better.
msg144847 - (view) Author: Santoso Wijaya (santoso.wijaya) * Date: 2011-10-03 23:08
Without the aforementioned minidump library, you can also kick off the Python interpreter using a debugger (or have a debugger break into an already-running one) [1]. When the crash happens--presumably the debugger will break at this point--you can export the mini dump into a file for us to look at [2].

[1] I like using windbg (http://msdn.microsoft.com/en-us/windows/hardware/gg463009).

[2] It would be something like, `.dump /ma C:\path\to\crash.DMP`
msg144926 - (view) Author: Amorilia (amorilia) Date: 2011-10-04 21:31
Quick update: apparently, fixing another, seemingly unrelated bug also fixed this crash for rlibiez. Here's the relevant commit:

https://github.com/amorilia/pyffi/commit/bd7886eefedfce8fb108c4701cf0467e2a707907

Basically, the problem was with multiprocessing.Pools not getting closed and joined.

I'm attaching a script (poolcrash.py) which, theoretically, ought to reproduce the crash - although it doesn't quite reproduce it on my machine; I'm running out of memory and my machine just hangs desperately accessing the swap file before anything happens...

Beware that running the bugged script may force you to perform a hard reboot of your system, particularly if you wait until all physical memory is used up by zombie processes.
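
For illustration, a minimal sketch of the cleanup pattern described above: every pool that is created is closed and joined, even if the work raises. This is not the code from the linked commit. The function name `run_batch` and its `pool_factory` parameter are hypothetical, and `abs` merely stands in for the real per-file work.

```python
import multiprocessing


def run_batch(func, items, processes=2, pool_factory=multiprocessing.Pool):
    """Map func over items in a worker pool that is always closed and joined."""
    pool = pool_factory(processes)
    try:
        return pool.map(func, items)
    finally:
        pool.close()  # no more tasks will be submitted
        pool.join()   # wait for the worker processes to exit


if __name__ == "__main__":
    # abs stands in for the real per-file processing function.
    print(run_batch(abs, [-3, -2, 1]))
```

The `if __name__ == "__main__":` guard matters on Windows, where worker processes re-import the main module.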
msg144935 - (view) Author: Brian Curtin (brian.curtin) * (Python committer) Date: 2011-10-05 02:27
I tried that script on 2.7 and like it did for you, it just ran until my machine became unusable.

On 3.x I think I got a RuntimeError after a while, but I forgot exactly what happened since the machine ended up being hosed later from the 2.7 run. In any event, it certainly didn't crash there and only went a short time before erroring out with some exception.
msg144966 - (view) Author: Amorilia (amorilia) Date: 2011-10-05 18:15
Thanks for also trying it out, Brian.

I feel there's little more I can do. Perhaps the multiprocessing documentation could state more clearly that join() ought to be called before the pool is deleted? Currently, the docs merely say:

"Wait for the worker processes to exit. One must call close() or terminate() before using join()." (http://docs.python.org/library/multiprocessing.html#multiprocessing.pool.multiprocessing.Pool.join)

Something along the following lines could be added:

"You must call join() when you no longer need the pool; otherwise, zombie processes may keep running."

I'm happy to provide a patch, if needed.
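
As an illustration of the ordering rule quoted from the docs, here is a small sketch, assuming standard multiprocessing.Pool behaviour: calling join() on a pool that is still running is refused, and only succeeds after close(). The helper name `close_then_join` is hypothetical. Which exception is raised depends on the Python version, so the sketch accepts either of the two it is likely to be.

```python
import multiprocessing


def close_then_join(pool):
    """Shut a pool down in the documented order.

    Returns True if an early join() on the still-running pool was refused.
    """
    try:
        pool.join()  # the pool is still running, so this is expected to raise
        early_join_refused = False
    except (ValueError, AssertionError):
        # Recent Python 3 raises ValueError; older versions used an assert.
        early_join_refused = True
    pool.close()  # stop accepting new work
    pool.join()   # now waits for the workers to exit
    return early_join_refused


if __name__ == "__main__":
    print(close_then_join(multiprocessing.Pool(2)))
```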
msg145152 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2011-10-08 01:41
If I understand the messages, this is more a program bug than a Python bug and the 'fix' for this tracker would be a doc patch. Adding one would be welcome.
msg324540 - (view) Author: Zachary Ware (zach.ware) * (Python committer) Date: 2018-09-03 21:50
As far as I can tell, this was an application bug in multiprocessing cleanup 7 years ago.  I'm not sure there's really even anything to add to the docs for this, but if anyone disagrees or produces a patch, please reopen.
History
Date User Action Args
2018-09-03 21:50:11 zach.ware set status: open -> closed; nosy: + pitrou, davin, zach.ware; messages: + msg324540; resolution: works for me; stage: resolved
2014-07-11 14:40:23 vstinner set messages: - msg222751
2014-07-11 14:39:59 vstinner set messages: + msg222751
2014-07-10 17:52:07 BreamoreBoy set assignee: docs@python; components: + Documentation, - Windows; nosy: + docs@python
2011-10-08 01:41:18 terry.reedy set nosy: + terry.reedy; messages: + msg145152
2011-10-05 18:15:37 amorilia set messages: + msg144966
2011-10-05 02:27:02 brian.curtin set messages: + msg144935
2011-10-04 21:31:50 amorilia set files: + poolcrash.py; messages: + msg144926
2011-10-03 23:09:02 santoso.wijaya set type: crash
2011-10-03 23:08:55 santoso.wijaya set nosy: + santoso.wijaya; messages: + msg144847
2011-10-03 19:01:03 brian.curtin set nosy: + brian.curtin; messages: + msg144837
2011-10-03 18:57:13 amorilia set nosy: + amorilia; messages: + msg144835
2011-10-01 22:08:42 vstinner set nosy: + vstinner; messages: + msg144750
2011-09-30 21:21:04 rlibiez create