classification
Title: Python script generating Segmentation Fault
Type: crash
Stage: resolved
Components:
Versions: Python 3.7
process
Status: closed
Resolution: third party
Dependencies:
Superseder:
Assigned To:
Nosy List: Daugeras, eric.smith, ned.deily, terry.reedy
Priority: normal
Keywords:

Created on 2018-12-27 15:28 by Daugeras, last changed 2018-12-29 03:48 by terry.reedy. This issue is now closed.

Files
File name Uploaded Description
SegfaultMinBugReplication.py Daugeras, 2018-12-27 21:02
Messages (7)
msg332592 - Author: Daugeras (Daugeras) Date: 2018-12-27 15:28
My Python script generates a segmentation fault and I cannot find the source of the problem. Is there a simple way to debug a segfault in Python? Are there recommended coding practices to avoid segmentation faults?

I wrote a script (1600 lines) to systematically download CSV files from a source and format the collected data. The script works very well for 100-200 files, but it systematically crashes with a segmentation fault message after a while.
- The crash always happens at the same spot in the script, with no understandable cause.
- I run it on Mac OSX, but the crash also happens on Ubuntu Linux and Debian 9.
- If I run the pandas routines that crash during my script on single files, they work properly. The crash only happens when I loop the script 100-200 times.
- I checked the content of every variable and the constructors (init), and they seem to be fine.

The code is too long to be pasted, but it is available on request.

The expected result is execution to the end. Instead, the script crashes after 100-200 iterations.
msg332604 - Author: Ned Deily (ned.deily) * (Python committer) Date: 2018-12-27 17:40
Sorry, but without either a full traceback or code that reproduces the problem, it is impossible for us to make an intelligent guess about what problem you are seeing, much less suggest a solution.
msg332606 - Author: Daugeras (Daugeras) Date: 2018-12-27 18:28
@Ned: Of course, I understand your feedback. I can provide a script to reproduce the crash. How can I do this?
msg332611 - Author: Ned Deily (ned.deily) * (Python committer) Date: 2018-12-27 20:03
You can either paste or upload a file (click on the "Choose File" button above in the web page) with code or with crash info.  For crash info, depending on the platform, there may be a Python traceback displayed in the shell session and there may be some sort of system-generated crash report, for example, on macOS, you might find them in ~/Library/Logs.  You should also include information about the platform (os, os version) and the Python version in use.  With recent versions of Python and if tests are installed, you can get all the configuration information with one command, for example:

python3.7 -m test.pythoninfo
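
If the test package is not installed, a much smaller subset of the same kind of information can be gathered with the standard library alone. The following is only a minimal sketch: the helper name `collect_env_info` is hypothetical, and its output is not equivalent to what `test.pythoninfo` prints.

```python
import platform
import sys


def collect_env_info():
    """Return a small dict describing the interpreter and the OS.

    Hypothetical helper for a bug report; not part of any library.
    """
    return {
        "python_version": platform.python_version(),
        "implementation": platform.python_implementation(),
        "executable": sys.executable,
        "system": platform.system(),
        "release": platform.release(),
        "machine": platform.machine(),
    }


if __name__ == "__main__":
    # Print one "key: value" line per field, ready to paste into a report.
    for key, value in collect_env_info().items():
        print(f"{key}: {value}")
```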
msg332619 - Author: Daugeras (Daugeras) Date: 2018-12-27 21:02
File attached. To replicate the bug, you have to create the directories for the files to load (cf. the config data at the start of the script). The bug happens after 100-200 files are downloaded.
msg332655 - Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2018-12-28 13:33
Thanks for the report, but that example is so large and complicated that it's difficult for someone not familiar with it to understand what's going on.

If you could simplify it down to the smallest example that duplicates the problem, then perhaps we could make more progress.

Short of that, one thing you should look at is Victor's faulthandler work: https://docs.python.org/3/library/faulthandler.html. It might give enough information to diagnose the problem.
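
For illustration, here is a minimal sketch of turning faulthandler on at the top of a script so that a Python traceback is written out when the process receives a fatal signal such as SIGSEGV. The helper name `enable_crash_tracebacks` and the log path are assumptions made for this example; only `faulthandler.enable()` itself comes from the standard library.

```python
import faulthandler


def enable_crash_tracebacks(log_path):
    """Enable faulthandler, writing tracebacks to log_path.

    Hypothetical helper. The file must stay open for as long as
    faulthandler is enabled, so the open handle is returned and the
    caller is expected to keep it alive.
    """
    log_file = open(log_path, "w")
    # On a fatal signal (SIGSEGV, SIGFPE, SIGABRT, SIGBUS, SIGILL),
    # dump the Python traceback of every thread into log_file.
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file
```

Calling `enable_crash_tracebacks("crash.log")` before the download loop starts would be one way to use it. The same handler can also be enabled without touching the code, by running the script with `python3 -X faulthandler` or by setting the `PYTHONFAULTHANDLER` environment variable, in which case the traceback goes to stderr.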
msg332698 - Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2018-12-29 03:48
The script has 14 imports from 10 external packages, perhaps half of which include C code.  In such cases, the crash is nearly always in the external package, and Daugeras has already identified a pandas routine.

Daugeras, you can re-open if you gain evidence that the problem is in the CPython code we are responsible for.  But you should start by stripping out as much as you can, and if there are crashes in pandas, submit a report to its authors.
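
One possible way to approach the stripping-down step is to run each candidate snippet in a fresh interpreter and check whether the child process was killed by SIGSEGV. This is a rough sketch under stated assumptions: the helper name `crashed_with_segfault` is hypothetical, and the check relies on the POSIX convention that `subprocess` reports death by signal as a negative return code, so it does not apply on Windows.

```python
import signal
import subprocess
import sys


def crashed_with_segfault(code):
    """Run `code` in a fresh interpreter; True if it died from SIGSEGV.

    Hypothetical helper, POSIX only: a child killed by signal N is
    reported by subprocess with returncode == -N.
    """
    proc = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,  # keep the child's output off the terminal
    )
    return proc.returncode == -signal.SIGSEGV
```

A reduced snippet that still makes this function return True is a much better basis for a report to the package authors than the full script.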
History
Date User Action Args
2018-12-29 03:48:45 terry.reedy set status: open -> closed; nosy: + terry.reedy; messages: + msg332698; resolution: third party; stage: resolved
2018-12-28 13:33:21 eric.smith set nosy: + eric.smith; messages: + msg332655
2018-12-27 21:02:16 Daugeras set files: + SegfaultMinBugReplication.py; messages: + msg332619
2018-12-27 20:03:08 ned.deily set messages: + msg332611
2018-12-27 18:28:20 Daugeras set messages: + msg332606
2018-12-27 17:40:47 ned.deily set nosy: + ned.deily; messages: + msg332604
2018-12-27 15:28:33 Daugeras create