classification
Title: Process finished with exit code -1073741819 (0xC0000005) when trying to access data from a pickled file
Type: crash
Stage: resolved
Components:
Versions: Python 3.7

process
Status: closed
Resolution: third party
Dependencies:
Superseder:
Assigned To:
Nosy List: furkanonder, gregory.p.smith, mapf
Priority: normal
Keywords:

Created on 2020-01-22 16:05 by mapf, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Files
temp.py, uploaded by mapf on 2020-04-22 10:53 (no description)
Messages (6)
msg360481 - Author: mapf (mapf) Date: 2020-01-22 16:05
I have a program where I create some relatively nested data, and within the same session I have no issues accessing the data. I then use pickle.dump() with pickle.HIGHEST_PROTOCOL to save the data so I can access it in a later session.
These files are usually over 2 GB in size since they contain many images in the form of numpy arrays, and I have never had any issues loading them. However, there is one data structure that is a structured numpy array of type "a" with currently 16 different fields. All of them can be accessed without any problems in the session where they were created, sometimes even after dumping and loading the data again. They can also all be accessed after they have been loaded in a different session, with the exception of one field.
This field contains rather nested data, which is why I thought that this might be the issue, but I honestly have no idea.
Each entry in this field is a list of len 20, whose entries are either None or a 1-d slice of "()"-shape from another structured array of type "b". This slice in turn has 37 different fields, most of which are either int, float or bool. But there is one entry which is a list that can contain several dicts. The entries of these dicts are floats, however one can be a slice of type "b" again, so there is some cross-referencing going on. As a test I already removed this entry, though, and it still crashed.
My point is, the data that is stored is not of some crazy custom type. All the data is either of type bool, int, float, list, dict or numpy.array. As I said, ALL the other stored data can be accessed without any problems. It is only this one field that can only be accessed during the same session in which it was created.
My program runs using a PyQt5 GUI and I use PyCharm as the editor. I have already read that in the past, these two in combination seem to have caused this error rather frequently, so maybe that has something to do with it.
I have already tried reinstalling my Python distribution as well as PyCharm, as well as running the code on a different machine, to no avail.
I am also pretty certain that this still worked just a week ago. I didn't change my code, but now it doesn't work anymore.


Relevant specs:

Windows 10 Home 64 bit
PyCharm 2019.3.1 Professional
Python 3.7.4 via Anaconda
Numpy 1.16.5
PyQt 5.9.2
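
For readers trying to picture the layout described in this message, it might look roughly like the sketch below. This is not the reporter's code: the field names ("index", "value", "flag", "id", "slices"), the array sizes and the in-memory buffer are all hypothetical stand-ins, and type "b" is cut down from 37 fields to three plain ones. On a healthy installation this round trip is expected to succeed, so it only illustrates the structure and is not a reproduction of the crash.

```python
import io
import pickle

import numpy as np

# Hypothetical stand-in for type "b" (the report mentions 37 fields).
dtype_b = np.dtype([("index", np.int64), ("value", np.float64), ("flag", np.bool_)])
b = np.zeros(5, dtype=dtype_b)
b["index"] = np.arange(5)

# Hypothetical stand-in for type "a": one object field holds the nested lists.
dtype_a = np.dtype([("id", np.int64), ("slices", object)])
a = np.zeros(3, dtype=dtype_a)
for i in range(len(a)):
    entry = [None] * 20  # list of len 20, mostly None
    entry[0] = b[i]      # a "()"-shaped record (numpy.void) taken from b
    a["id"][i] = i
    a["slices"][i] = entry

# Dump and load again; a BytesIO buffer stands in for the file on disk.
buffer = io.BytesIO()
pickle.dump(a, buffer, protocol=pickle.HIGHEST_PROTOCOL)
buffer.seek(0)
loaded = pickle.load(buffer)

# Access the nested field after loading.
first_record = loaded["slices"][1][0]
print(first_record["index"])
```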
msg360483 - Author: mapf (mapf) Date: 2020-01-22 16:08
I forgot to mention that sometimes, when I dump and load the data in the same session and try to access or use the data in question, I get the following error:

"Fatal Python error: GC object already tracked"
msg366957 - Author: Furkan Onder (furkanonder) Date: 2020-04-22 00:02
Hi,
Can you share the program's code so we can better understand the problem?
msg366992 - Author: mapf (mapf) Date: 2020-04-22 08:04
Hi, thanks for your interest! Since this was quite some time ago now, I eventually found a workaround (I think I made dicts out of the 1d slices and saved them instead) and the project moved on. I don't have the code from back then anymore, I'm sorry. But I can try to recreate it. I'm not sure if I will succeed though.
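
The workaround mentioned here (storing dicts instead of the 1-d slices) could be sketched along the following lines. The original code is not available, so this is only a guess at its shape; the helper name record_to_dict and the field names are hypothetical.

```python
import numpy as np


def record_to_dict(record):
    """Convert one structured-array record (numpy.void) into a plain dict.

    None entries are passed through unchanged.
    """
    if record is None:
        return None
    result = {}
    for name in record.dtype.names:
        value = record[name]
        # Turn numpy scalars into native Python values; leave anything else as is.
        result[name] = value.item() if isinstance(value, np.generic) else value
    return result


# Hypothetical stand-in for type "b" with three plain fields.
b = np.zeros(4, dtype=[("index", np.int64), ("value", np.float64), ("flag", np.bool_)])
b["index"] = np.arange(4)

entry = [b[2], None]  # shortened version of the len-20 list
converted = [record_to_dict(item) for item in entry]
```

The converted list contains only built-in Python types, so pickling it no longer involves numpy record scalars at all.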
msg366995 - Author: mapf (mapf) Date: 2020-04-22 10:53
Ok, I created a little something. It's not very pretty, but it works for me, meaning it causes the process to finish with exit code -1073741819 (0xC0000005).
msg404150 - Author: Gregory P. Smith (gregory.p.smith) (Python committer) Date: 2021-10-18 00:24
As your problem involves numpy and PyQt, both of which are very complicated third-party extension modules, chances are there is a bug within one of them that is leading to memory corruption.
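
One possible first step for narrowing down where such a crash happens, not something suggested in this thread, is the standard library's faulthandler module, which can write the Python traceback of all threads to a file when the interpreter hits a fatal error. A minimal sketch, assuming a hypothetical helper name and a log path chosen by the caller:

```python
import faulthandler


def enable_crash_traceback(log_path):
    """Write Python tracebacks to log_path if the interpreter crashes.

    The file object is returned because it must stay open for as long as
    faulthandler is enabled.
    """
    log_file = open(log_path, "w")
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file
```

Calling this at the very start of the program, before the pickled file is loaded, means the log would show which Python line was executing at the time of the crash. It does not identify the faulty extension module by itself.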
History
Date                 User             Action  Args
2022-04-11 14:59:25  admin            set     github: 83604
2021-10-18 00:24:01  gregory.p.smith  set     status: open -> closed; nosy: + gregory.p.smith; messages: + msg404150; resolution: third party; stage: resolved
2020-04-22 10:53:23  mapf             set     files: + temp.py; messages: + msg366995
2020-04-22 08:04:03  mapf             set     messages: + msg366992
2020-04-22 00:02:28  furkanonder      set     nosy: + furkanonder; messages: + msg366957
2020-01-22 16:08:46  mapf             set     messages: + msg360483
2020-01-22 16:05:55  mapf             create