classification
Title: segmentation fault appeared in python 3.10.0b2
Type: crash Stage: resolved
Components: Interpreter Core Versions: Python 3.10
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: erlendaasland, miss-islington, pablogsal, terry.reedy, vstinner, zzzeek
Priority: release blocker Keywords: patch

Created on 2021-06-03 18:23 by zzzeek, last changed 2021-06-05 23:13 by pablogsal. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 26545 merged pablogsal, 2021-06-05 01:35
PR 26547 merged miss-islington, 2021-06-05 02:50
PR 26551 merged pablogsal, 2021-06-05 20:09
PR 26552 merged miss-islington, 2021-06-05 22:41
Messages (16)
msg395028 - (view) Author: mike bayer (zzzeek) * Date: 2021-06-03 18:23
segmentation fault related to object deallocation and traceback objects, is extremely difficult to reproduce and definitely appeared as of 3.10.0b2, does not occur in 3.10.0b1.   linux and osx platforms are affected.

The issue requires "greenlet==1.1.0" to be installed, as well as that for me to reproduce it I have to use sqlite3 with some special APIs also, so while this issue might be in greenlet, or sqlite3, I have a feeling these are all factors that are combining together in a very random way to reveal something that's actually happening in the interpreter, but in any case would be great if core devs can see why it happens.  The below script has been tested on various linux platforms, and the overall bug was revealed by an interaction in the SQLAlchemy test suite that's occurring in many places including OSX, linux.   As noted, a whole bunch of very random things are needed for me to reproduce it.


import greenlet
import sqlite3


class _ErrorContainer(object):
    error = None


def _expect_raises_fn(fn):
    ec = _ErrorContainer()
    try:
        fn()
    except Exception as err:
        assert str(err) == "this is a test"

        # assign the exception context outside of the except
        # is necessary
        ec.error = err

    # don't del the exception context is necessary
#    del ec


def greenlet_spawn(fn, *args, **kwargs):

    # spawning a greenlet is necessary
    context = greenlet.greenlet(fn, greenlet.getcurrent())

    # assignment to "result" is necessary
    result = context.switch(*args, **kwargs)

    # raising exception is necessary
    raise Exception("this is a test")


class OuterConnectionWrapper:
    def __init__(self, connection):
        self.connection = connection

    def go(self, stmt):
        sqlite_connection = self.connection
        cursor = sqlite_connection.cursor()
        cursor.execute("select 1")
        return cursor

    def execute(self, stmt):
        return greenlet_spawn(self.go, stmt)

    def _do_close(self):
        self.connection.close()
        self.connection = None

    def close(self):
        self._do_close()


class InnerConnectionWrapper:
    def __init__(self, connection):
        self.connection = connection

    def create_function(self, *arg, **kw):
        self.connection.create_function(*arg, **kw)

    def cursor(self):
        return self.connection.cursor()

    def close(self):
        self.connection = None


class ConnectionPool:
    def __init__(self):
        self.conn = sqlite3.connect(":memory:")

        def regexp(a, b):
            return None

        self.conn.create_function("regexp", 2, regexp)

    def connect(self):
        return InnerConnectionWrapper(self.conn)


def do_test():
    pool = ConnectionPool()

    def go():
        c1 = pool.connect()
        conn = OuterConnectionWrapper(c1)
        try:
            conn.execute("test")
        finally:
            conn.close()

    _expect_raises_fn(go)


do_test()
msg395029 - (view) Author: mike bayer (zzzeek) * Date: 2021-06-03 18:25
if the issue is in greenlet this can be bounced back to https://github.com/python-greenlet/greenlet/issues/242
msg395121 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2021-06-04 21:13
Coredevs are likely to assume that the problem is in greenlets, especially if it has any C code (I don't know).  I suggest you first try to find the commit between releases that resulted in the problem.  If it is not deterministicly reproducible, you likely need to do a manual binary search.  (This is as much as I know.)

If you get an OS 'traceback' longer than a dozen line or so, please attach it as a file rather than pasting in a message.
msg395128 - (view) Author: mike bayer (zzzeek) * Date: 2021-06-04 22:07
yes, if I have time I will begin to undertake that, wanted to put it up here in case anyone has git bisect on speed dial for cpython.
msg395145 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-06-05 00:51
I did the bisection:

ff359d735f1a60878975d1c5751bfd2361e84067 is the first bad commit
commit ff359d735f1a60878975d1c5751bfd2361e84067
Author: Miss Islington (bot) <31488909+miss-islington@users.noreply.github.com>
Date:   Mon May 31 02:12:27 2021 -0700

    bpo-42972: Fix sqlite3 traverse/clear functions (GH-26452) (GH-26461)

    (cherry picked from commit d1124b09e8251061dc040cbd396f35ae57783f4a)

    Co-authored-by: Erlend Egeberg Aasland <erlend.aasland@innova.no>

:040000 040000 701f24eacbdf2b52b612dc33ae9a7dcf24a73826 100af9271a7e7c038a860388587b1cb096b8494f M      Modules
bisect run success
msg395146 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-06-05 01:36
The problem is that this is wrong:

https://github.com/python/cpython/blob/f171877ebe276749f31386baed5841ce37cbee2e/Modules/_sqlite/statement.c#L411-L423

1) tp_clear should *not* do anything other than cleaning refs to python objects
2) sqlite3_finalize is called without the GIL but that can trigger a call to the destructor of the session (_destructor()) that executes Python code
msg395147 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-06-05 01:39
Using the reproducer code by Mike, without this fix:

❯ ./python ../lel.py
[1]    25344 segmentation fault  ./python ../lel.py

With this fix:

❯ ./python ../problem.py
❯ echo $?
0
msg395148 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-06-05 02:09
This is a simpler reproducer:

        dest = sqlite.connect(':memory:')
        def md5sum(t):
            return

        dest.create_function("md5", 1, md5sum)
        x = dest("create table lang (name, first_appeared)")
        del md5sum, dest

        y = [x]
        y.append(y)

        del x,y
        gc.collect()
msg395150 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-06-05 02:50
New changeset fa106a685c1f199aca5be5c2d0277a14cc9057bd by Pablo Galindo in branch 'main':
bpo-44304: Fix crash in the sqlite3 module when the GC clears Statement objects (GH-26545)
https://github.com/python/cpython/commit/fa106a685c1f199aca5be5c2d0277a14cc9057bd
msg395151 - (view) Author: miss-islington (miss-islington) Date: 2021-06-05 03:09
New changeset ad2f3b74b5615aa36a82d1fdbc45bb7468aa1d72 by Miss Islington (bot) in branch '3.10':
bpo-44304: Fix crash in the sqlite3 module when the GC clears Statement objects (GH-26545)
https://github.com/python/cpython/commit/ad2f3b74b5615aa36a82d1fdbc45bb7468aa1d72
msg395155 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2021-06-05 05:48
Thanks Mike for the report and reproducer, and Pablo for the fix!
msg395162 - (view) Author: mike bayer (zzzeek) * Date: 2021-06-05 13:15
great news!   

Based on how many random factors were needed to reproduce as well as that it seemed to be gc related and appeared very suddenly, I had an intuition this was on the cpython side, thanks so much for doing this Pablo!
msg395175 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2021-06-05 17:53
sqlit3.Cursor is prone to the same bug. Do you want me to create a new issue for it, or can I reuse this issue number, Pablo? I'll see if I can create a reproducer for that as well.
msg395181 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-06-05 19:43
> sqlit3.Cursor is prone to the same bug.

No, is not: it doesn't drop the GIL in tp_clear. There is technically no bug, but is true that tp_clear should only clean Python references. Although in this case there are some sematics with the cleanup that are not clear.
msg395186 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-06-05 22:41
New changeset 6e3b7cf3af3ed7758b2c2193c1d393feb8ab8f72 by Pablo Galindo in branch 'main':
bpo-44304: Ensure the sqlite3 destructor callback is always called with the GIL held (GH-26551)
https://github.com/python/cpython/commit/6e3b7cf3af3ed7758b2c2193c1d393feb8ab8f72
msg395187 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-06-05 23:13
New changeset 317e9ed4363a86b1364573c5a5e30011a080ce6d by Miss Islington (bot) in branch '3.10':
bpo-44304: Ensure the sqlite3 destructor callback is always called with the GIL held (GH-26551) (GH_26552)
https://github.com/python/cpython/commit/317e9ed4363a86b1364573c5a5e30011a080ce6d
History
Date User Action Args
2021-06-05 23:13:31pablogsalsetmessages: + msg395187
2021-06-05 22:41:20miss-islingtonsetpull_requests: + pull_request25139
2021-06-05 22:41:20pablogsalsetmessages: + msg395186
2021-06-05 20:09:10pablogsalsetpull_requests: + pull_request25138
2021-06-05 19:43:20pablogsalsetmessages: + msg395181
2021-06-05 17:53:12erlendaaslandsetmessages: + msg395175
2021-06-05 13:15:01zzzeeksetmessages: + msg395162
2021-06-05 05:48:27erlendaaslandsetmessages: + msg395155
2021-06-05 03:24:53pablogsalsetstatus: open -> closed
resolution: fixed
stage: patch review -> resolved
2021-06-05 03:09:47miss-islingtonsetmessages: + msg395151
2021-06-05 02:50:47miss-islingtonsetnosy: + miss-islington

pull_requests: + pull_request25136
stage: patch review
2021-06-05 02:50:46pablogsalsetmessages: + msg395150
2021-06-05 02:09:13pablogsalsetmessages: + msg395148
2021-06-05 01:39:23pablogsalsetmessages: + msg395147
2021-06-05 01:36:39pablogsalsetmessages: + msg395146
stage: patch review -> (no value)
2021-06-05 01:35:47pablogsalsetkeywords: + patch
stage: patch review
pull_requests: + pull_request25135
2021-06-05 00:56:18pablogsalsetnosy: + vstinner
2021-06-05 00:52:24pablogsalsetpriority: normal -> release blocker
2021-06-05 00:51:59pablogsalsetnosy: + pablogsal
messages: + msg395145
2021-06-04 22:09:29erlendaaslandsetnosy: + erlendaasland
2021-06-04 22:07:17zzzeeksetmessages: + msg395128
2021-06-04 21:13:13terry.reedysettype: crash

messages: + msg395121
nosy: + terry.reedy
2021-06-03 18:25:35zzzeeksetmessages: + msg395029
2021-06-03 18:23:07zzzeekcreate