classification
Title: Segmentation fault on shutdown with shelve & c pickle
Type: crash Stage: patch review
Components: ctypes, Interpreter Core, Library (Lib) Versions: Python 3.9, Python 3.8, Python 3.7, Python 3.6
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: dorosch, zd nex
Priority: normal Keywords: patch

Created on 2020-02-18 09:51 by zd nex, last changed 2020-06-29 13:48 by zd nex.

Files
File name Uploaded Description
test_crash_shelve.zip zd nex, 2020-02-18 09:51 files for repro crash
crash_pdb.txt zd nex, 2020-02-24 10:07
crash.txt zd nex, 2020-02-25 09:18
shelve.py.patch dorosch, 2020-02-25 09:53
Pull Requests
URL Status Linked
PR 18655 open dorosch, 2020-02-25 10:11
Messages (15)
msg362186 - (view) Author: zd nex (zd nex) Date: 2020-02-18 09:51
Hello,

I was porting some of our old code from Python 2.7 to Python 3 and found that the new version crashes quite often. After some digging (faulthandler was a great help) I think I tracked it down to the Shelf.__del__ method, which ends up in the C pickle module (not the Python one). The crash itself is shown below. The attached zip contains 3 files. When shelve.close is used, it does not seem to crash every time.

$python3.8 -X faulthandler ce_test_2.py
start
end
Fatal Python error: Segmentation fault

Current thread 0x00007fb22e299740 (most recent call first):
  File "/usr/lib/python3.8/shelve.py", line 124 in __setitem__
  File "/usr/lib/python3.8/shelve.py", line 168 in sync
  File "/usr/lib/python3.8/shelve.py", line 144 in close
  File "/usr/lib/python3.8/shelve.py", line 162 in __del__
Segmentation fault (SIGSEGV)


The code that triggers the crash is here:


import shelve
import material  # module from the attached zip

data = shelve.open("test3", flag="c", writeback=True)


def test_shelve(data):
    # Only reads the shelf; nothing is modified here.
    for k, v in data.items():
        pass


print("start")
test_shelve(data)

# data.close()  # uncommenting this fixes the SIGSEGV at shutdown
# The problem is actually in the C pickle module; with the Python pickle module it works.

print("end")
# The crash happens after this point, during interpreter shutdown.



The code just imports the module and shelve and opens the file. Then, in another function, it iterates over the data, and that causes the crash in the C pickle module at shutdown. The weird thing is that when the loop over the data is not inside a function, it does not crash. The crash can also be avoided by swapping the C pickle for the Python pickle.
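As noted in the code comment above, closing the shelf explicitly avoids the crash. A minimal sketch of that workaround, assuming Python 3.4+ where a shelf can be used as a context manager (the temporary directory and the stored dictionary are illustrative placeholders, not the data from the attached zip):

```python
import os
import shelve
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "test3")  # hypothetical location for the shelf

    # Create an entry; leaving the with block calls close() right away.
    with shelve.open(path, flag="c", writeback=True) as data:
        data["H1615"] = {"name": "example", "vars": 0}

    # Read it back the same way the report does. close() runs when the
    # block ends, so nothing is left for __del__ to sync at shutdown.
    with shelve.open(path, flag="c", writeback=True) as data:
        keys = [key for key, value in data.items()]

print(keys)
```

This does not fix the underlying problem; it only ensures that the sync to disk happens while the interpreter is still fully alive.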


In the REPL it is quite similar: just calling list() on shelve.items() and then exiting makes Python crash.

Python 3.8.1 (default, Dec 22 2019, 08:15:39) 
[GCC 7.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import shelve
>>> import material
>>> data = shelve.open("test3", flag="c",writeback=True)
>>> list(data.items())
[('H1615', Material(name='Třešeň Romana', code='H1615', vars=0))]
>>> exit()
Fatal Python error: Segmentation fault

Current thread 0x00007f14a2546740 (most recent call first):
  File "/usr/lib/python3.8/shelve.py", line 124 in __setitem__
  File "/usr/lib/python3.8/shelve.py", line 168 in sync
  File "/usr/lib/python3.8/shelve.py", line 144 in close
  File "/usr/lib/python3.8/shelve.py", line 162 in __del__
Segmentation fault (SIGSEGV)

Hopefully you can fix this.
msg362257 - (view) Author: zd nex (zd nex) Date: 2020-02-19 06:44
I was trying to figure out what the crash itself is, and it looks to me like it is related to import. Do you know how I can properly debug this crash?
msg362573 - (view) Author: Andrei Daraschenka (dorosch) * Date: 2020-02-24 09:26
Could you give more details on how to reproduce it? On the latest CPython from the master branch it works.
You can debug it with the help of pdb. Just set a breakpoint:

...
test_shelve(data)
breakpoint()
data.close()
...

Then try to run it step by step (enter 's' to go to the next step and 'll' to see where you are now)
https://docs.python.org/3/library/pdb.html#debugger-commands
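Besides pdb, the faulthandler module used in the original report can also be enabled from code rather than with -X faulthandler. A small sketch, assuming the traceback should go to a log file (the temporary log file is a hypothetical choice; by default faulthandler writes to sys.stderr):

```python
import faulthandler
import os
import tempfile

# Hypothetical log file; it must stay open for as long as faulthandler is enabled.
log_fd, log_path = tempfile.mkstemp(suffix=".log")
log_file = os.fdopen(log_fd, "w")

# Install the fatal-signal handlers (SIGSEGV among others), like -X faulthandler.
faulthandler.enable(file=log_file)
was_enabled = faulthandler.is_enabled()

# Dump the current traceback on demand to see what the report looks like.
faulthandler.dump_traceback(file=log_file)

faulthandler.disable()
log_file.close()

with open(log_path) as handle:
    report = handle.read()
os.remove(log_path)

print(report)
```

In a real debugging session the handler would stay enabled until the process exits, so that a crash during shutdown is still reported.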
msg362576 - (view) Author: zd nex (zd nex) Date: 2020-02-24 09:50
Hello,

So in 3.8 it does not crash for you? Is there a development build of 3.9 for Ubuntu that I can try?

I have tested it on 3.6, 3.7 and 3.8, and it always crashed when close was not called or when list was called in another function.
msg362577 - (view) Author: zd nex (zd nex) Date: 2020-02-24 09:52
OK, I will try pdb.
msg362580 - (view) Author: Andrei Daraschenka (dorosch) * Date: 2020-02-24 09:58
Yes, it works for me.

$ uname -a
Linux laptop 5.3.0-40-generic #32~18.04.1-Ubuntu SMP Mon Feb 3 14:05:59 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
$ python3.8 --version
Python 3.8.0
$ python3.8 ce_test_2.py 
start
end

Could you please report what you find with pdb?
msg362581 - (view) Author: zd nex (zd nex) Date: 2020-02-24 10:00
Okay, I have tried to run it with breakpoint(), but then it does not crash on 3.8.
msg362582 - (view) Author: zd nex (zd nex) Date: 2020-02-24 10:07
Okay, I have managed to crash it when exit() was called. I am also attaching the output with -v.

python3.8 -X faulthandler ce_test_2.py 
start
end
--Return--
> /home/fractal/workspace/test_py_crash/ce_test_2.py(19)<module>()->None
-> breakpoint()
(Pdb) exit()
Traceback (most recent call last):
  File "ce_test_2.py", line 19, in <module>
    breakpoint()
  File "/usr/lib/python3.8/bdb.py", line 92, in trace_dispatch
    return self.dispatch_return(frame, arg)
  File "/usr/lib/python3.8/bdb.py", line 154, in dispatch_return
    if self.quitting: raise BdbQuit
bdb.BdbQuit
Fatal Python error: Segmentation fault

Current thread 0x00007f976975f740 (most recent call first):
  File "/usr/lib/python3.8/shelve.py", line 124 in __setitem__
  File "/usr/lib/python3.8/shelve.py", line 168 in sync
  File "/usr/lib/python3.8/shelve.py", line 144 in close
  File "/usr/lib/python3.8/shelve.py", line 162 in __del__
Segmentation fault (SIGSEGV)
$ 


# destroy string
# cleanup[2] removing threading
# cleanup[2] removing atexit
# cleanup[2] removing logging
# cleanup[2] removing material
# cleanup[2] removing _dbm
# cleanup[2] removing dbm.ndbm
# cleanup[2] removing dbm
# destroy dbm
# cleanup[2] removing _gdbm
# cleanup[2] removing dbm.gnu
# cleanup[2] removing _ast
# cleanup[2] removing ast
# cleanup[2] removing dbm.dumb
# destroy _ast
Fatal Python error: Segmentation fault

Current thread 0x00007fd5477cb740 (most recent call first):
  File "/usr/lib/python3.8/shelve.py", line 124 in __setitem__
  File "/usr/lib/python3.8/shelve.py", line 168 in sync
  File "/usr/lib/python3.8/shelve.py", line 144 in close
  File "/usr/lib/python3.8/shelve.py", line 162 in __del__
Segmentation fault (SIGSEGV)
msg362628 - (view) Author: zd nex (zd nex) Date: 2020-02-25 09:18
I tried it again in Python 3.6.9 and 3.8.1 directly in the REPL, and it behaves the same. I have tried it on two different Linux boxes (both 64-bit) with different versions. On both of them it crashes in the same way: "destroy _ast" is printed, then it crashes, and faulthandler again points to shelve (pickle).

So I am attaching new crash reports taken directly from the REPL, where I just call list(data.items()) and then exit().

By the way, it seems to me that when pdb is active, the crash does not occur until exit() is called.
msg362629 - (view) Author: Andrei Daraschenka (dorosch) * Date: 2020-02-25 09:53
Hello
I was finally able to reproduce your problem.
The problem was in the Shelf.sync method of the Lib/shelve.py module. When Python shuts down, it calls the __del__ methods of objects; in our case __del__ called close(), which called sync(). sync() tried to synchronize the in-memory data with the disk storage, but if a key did not exist on disk, Python segfaulted.
I am attaching a patch for this problem and will prepare a PR soon.
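The call chain above (close() calling sync(), which calls __setitem__) also explains why merely reading a shelf opened with writeback=True ends up pickling data at shutdown. A small sketch illustrating this, assuming a plain dict as the backing store instead of a dbm file (CountingShelf and its writes attribute are hypothetical names introduced for illustration):

```python
import shelve


class CountingShelf(shelve.Shelf):
    """Shelf that counts how often __setitem__ is called (illustration only)."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.writes = 0

    def __setitem__(self, key, value):
        self.writes += 1
        super().__setitem__(key, value)


backing = {}  # a plain dict stands in for the dbm file

# Seed one entry through an ordinary shelf without writeback.
seed = shelve.Shelf(backing)
seed["H1615"] = {"vars": 0}
seed.close()

shelf = CountingShelf(backing, writeback=True)
for key, value in shelf.items():  # read-only loop, as in the report
    pass
writes_before_close = shelf.writes

shelf.close()  # sync() writes every cached entry back via __setitem__
writes_after_close = shelf.writes

print(writes_before_close, writes_after_close)
```

Every entry that was read is pickled again on close(), which is the point where the reported tracebacks end.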
msg362797 - (view) Author: zd nex (zd nex) Date: 2020-02-27 13:21
Hi,

I was looking at the failing tests on the attached pull request, and it seems to me that sync is intentionally supposed to create a new entry when saving.

Maybe that save should actually happen, but the problem should be fixed in a different way? From my small tests I thought the problem was actually in pickle, but maybe it is connected to shelve itself.
msg364649 - (view) Author: zd nex (zd nex) Date: 2020-03-20 06:57
Hello,

I was trying to figure out where the problem actually is, as I do not think it is in shelve itself. So I wrote a different __setitem__ method for shelve and found that the problem is actually in the pickler's dump call.

Here is the code I used:

import pickle
from io import BytesIO

# Debug replacement for Shelf.__setitem__ in Lib/shelve.py
def __setitem__(self, key, value):
    if self.writeback:
        self.cache[key] = value
    f = BytesIO()
    print("set")
    p = pickle.Pickler(f, self._protocol)
    try:
        print("before dump -> crash", value)
        p.dump(value)
        print("after dump -> will not be printed on crash")
    except Exception as e:
        print("error set", e)
    print("after dump", key)
    self.dict[key.encode(self.keyencoding)] = f.getvalue()
    print("saved")

When p.dump in this code is replaced with another method, the CPython crash does not happen. Can you please check whether you see the same?
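One way to try the "Python pickle instead of C pickle" variant mentioned earlier is a Shelf subclass whose __setitem__ uses the pure-Python pickler. This is only a sketch: pickle._Pickler is an underscore-prefixed CPython implementation detail rather than a documented API, PurePythonPickleShelf is a hypothetical name, and a plain dict stands in for the dbm file.

```python
import pickle
import shelve
from io import BytesIO


class PurePythonPickleShelf(shelve.Shelf):
    """Shelf that pickles with the pure-Python pickler (illustration only)."""

    def __setitem__(self, key, value):
        if self.writeback:
            self.cache[key] = value
        buffer = BytesIO()
        # pickle._Pickler is the Python implementation; pickle.Pickler is
        # normally the C accelerated one from the _pickle module.
        pickle._Pickler(buffer, self._protocol).dump(value)
        self.dict[key.encode(self.keyencoding)] = buffer.getvalue()


backing = {}  # a plain dict stands in for the dbm file
shelf = PurePythonPickleShelf(backing)
shelf["H1615"] = {"name": "example", "vars": 0}
loaded = shelf["H1615"]
shelf.close()

print(loaded)
```

The body mirrors the debug __setitem__ above, with only the pickler class exchanged, so it can be used to compare the two implementations in the same script.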
msg364838 - (view) Author: zd nex (zd nex) Date: 2020-03-23 06:03
I would like to debug this properly. How can I debug that dump() call in pickle? It does not seem to be possible. I guess I need to make a custom build?
msg372574 - (view) Author: Andrei Daraschenka (dorosch) * Date: 2020-06-29 13:05
Hello zd nex

After a little research, it became clear that this crash is due to loosely defined garbage collection behavior. As you know, the `__del__` method is called by the garbage collector, and when this happens there is no guarantee that the other objects have not already been cleaned up.

    def __del__(self):
        if not hasattr(self, 'writeback'):
            return
        self.close()

But the `close` method synchronizes with the disk, which creates objects using modules that are no longer in memory. As a result, the `dump` method failed, because the `pickle` module it tried to access was already gone (cleaned up by the garbage collector).

Modules/_pickle.c
....
4353     PickleState *st = _Pickle_GetGlobalState();
....

But `_Pickle_GetGlobalState` cannot return the right result, because the pickle module has already been removed from memory by the garbage collector.

In short, you encountered a case where C code tried to use a module that was no longer in memory, since the garbage collector had already deleted it.
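Given that explanation, one possible workaround is to make sure close() runs before module teardown starts, for example through an atexit handler. This is a sketch under that assumption, not the fix proposed in the pull request; open_shelf is a hypothetical helper and the temporary directory is a placeholder.

```python
import atexit
import os
import shelve
import shutil
import tempfile


def open_shelf(path, **kwargs):
    """Open a shelf and register close() to run at normal interpreter exit."""
    shelf = shelve.open(path, **kwargs)
    # atexit handlers run before modules are torn down, so pickle is still
    # fully available when sync() writes the cached entries back.
    atexit.register(shelf.close)
    return shelf


tmpdir = tempfile.mkdtemp()
# Registered first, so it runs last (atexit is last in, first out).
atexit.register(shutil.rmtree, tmpdir, ignore_errors=True)

data = open_shelf(os.path.join(tmpdir, "test3"), flag="c", writeback=True)
data["H1615"] = {"name": "example", "vars": 0}
keys = [key for key, value in data.items()]

print(keys)
```

Calling close() more than once is harmless, so an explicit data.close() elsewhere in the program does not conflict with the registered handler.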
msg372575 - (view) Author: zd nex (zd nex) Date: 2020-06-29 13:48
Hello, 
OK, but it seems to me that this segfault always happens (it is not random). So I guess there should be a way to fix the C pickle, no? Or something else should be done with the __del__ method of shelve, because in Python 2 this worked normally. It seems to me that this can happen whenever the user just reads data from a shelf and then exits.

Why did this work in Python 2?
History
Date User Action Args
2020-06-29 13:48:46 zd nex set messages: + msg372575
2020-06-29 13:05:04 dorosch set messages: + msg372574
2020-03-24 09:37:12 zd nex set title: SIGSEGV crash on shutdown with shelve & c pickle -> Segmentation fault on shutdown with shelve & c pickle
2020-03-23 06:03:39 zd nex set messages: + msg364838
2020-03-20 06:57:26 zd nex set messages: + msg364649; components: + Interpreter Core, ctypes
2020-02-27 13:21:11 zd nex set messages: + msg362797
2020-02-25 10:11:12 dorosch set stage: patch review; pull_requests: + pull_request18016
2020-02-25 09:53:49 dorosch set files: + shelve.py.patch; keywords: + patch; messages: + msg362629
2020-02-25 09:18:28 zd nex set files: + crash.txt; messages: + msg362628
2020-02-24 10:07:42 zd nex set files: + crash_pdb.txt; messages: + msg362582
2020-02-24 10:00:06 zd nex set messages: + msg362581
2020-02-24 09:58:40 dorosch set messages: + msg362580
2020-02-24 09:52:38 zd nex set messages: + msg362577
2020-02-24 09:50:23 zd nex set messages: + msg362576
2020-02-24 09:26:03 dorosch set nosy: + dorosch; messages: + msg362573
2020-02-24 06:30:15 zd nex set versions: + Python 3.9
2020-02-19 06:44:08 zd nex set messages: + msg362257
2020-02-18 09:51:02 zd nex create