This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Py2.x int free list can grow without bounds
Type: crash Stage: resolved
Components: Interpreter Core Versions: Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: serhiy.storchaka Nosy List: barry, gvanrossum, python-dev, scoder, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2015-06-19 06:31 by scoder, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
int-tp_free-2.7.patch serhiy.storchaka, 2016-11-28 21:15 review
Messages (9)
msg245492 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2015-06-19 06:31
A Cython user noticed a memory leak when C-inheriting from "int".

http://thread.gmane.org/gmane.comp.python.cython.devel/15689

The Cython code to reproduce this is simply this:

    cdef class ExtendedInt(int): pass
 
    for j in xrange(10000000):
        ExtendedInt(j)

The problem is due to the free-list of the int type. It uses this code for deallocation:

"""
static void
int_dealloc(PyIntObject *v)
{
    if (PyInt_CheckExact(v)) {
        Py_TYPE(v) = (struct _typeobject *)free_list;
        free_list = v;
    }
    else
        Py_TYPE(v)->tp_free((PyObject *)v);
}

static void
int_free(PyIntObject *v)
{
    Py_TYPE(v) = (struct _typeobject *)free_list;
    free_list = v;
}
"""

Now, when C-inheriting from PyInt_Type without providing an own tp_free implementation, PyType_Ready() will inherit the supertype's tp_free slot, which means that int_dealloc() above will end up calling int_free() in all cases, not only for the exact-int case. Thus, whether or not it's exactly "int" or a subtype, the object will always be added to the free-list on deallocation.

However, in the subtype case, the free-list is actually ignored by int_new() and int_subtype_new(), so that as long as the user code only creates tons of int subtype instances and not plain int instances, the freelist will keep growing without bounds by pushing dead subtype objects onto it that are never reused.

There are two problems here:

1) int subtypes should not be added to the freelist, or at least not unless they have exactly the same struct size as PyIntObject (which the ExtendedInt type above does but other subtypes may not)

2) if a free-list is used, it should be used in all instantiation cases, not just in PyInt_FromLong().

Fixing 1) by adding a type check to int_free() would be enough to fix the overall problem. Here's a quickly hacked up change that seems to work for me:

"""
static void
int_free(PyIntObject *v)
{
    if (PyInt_CheckExact(v)) {
        Py_TYPE(v) = (struct _typeobject *)free_list;
        free_list = v;
    }
    else if (Py_TYPE(v)->tp_flags & Py_TPFLAGS_HAVE_GC)
        PyObject_GC_Del(v);  // untested by probably necessary
    else
        PyObject_Del(v);
}
"""
msg245503 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2015-06-19 13:38
Now that's weird. Why is the current int_free() doing the same as int_dealloc()? Because some people call tp_free directly?
msg245504 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2015-06-19 13:42
Apparently, this is because of 200559fcc664. The fix is weird, though: why duplicate code instead of moving it into tp_free?

I'd rather let Guido and Barry discuss this, I'm not willing to touch the 2.x int type anymore.
msg245505 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2015-06-19 14:16
1) The intended solution is to require that int subclasses override tp_free.

2) I don't see any constructors that don't call PyInt_FromLong() -- what am I missing?
msg245508 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2015-06-19 16:56
> 1) The intended solution is to require that int subclasses override tp_free.

In the way I showed? By calling either PyObject_GC_Del() or PyObject_Del()
directly? I'll happily do that (Py2.7 and Py2 "int" being dead ends makes
that pretty future proof), but it's neither obvious nor very clean.

> 2) I don't see any constructors that don't call PyInt_FromLong() -- what am I missing?

Not sure what you mean. int_new() immediately calls int_subtype_new(),
which then calls type->tp_alloc(). No call to PyInt_FromLong() involved.

"""
static PyObject *
int_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
{
    PyObject *x = NULL;
    int base = -909;
    static char *kwlist[] = {"x", "base", 0};

    if (type != &PyInt_Type)
        return int_subtype_new(type, args, kwds); /* Wimp out */
...

static PyObject *
int_subtype_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
{
    ...
    tmp = int_new(&PyInt_Type, args, kwds);
    if (tmp == NULL)
        return NULL;
    if (!PyInt_Check(tmp)) {
        ival = PyLong_AsLong(tmp);
        if (ival == -1 && PyErr_Occurred()) {
            Py_DECREF(tmp);
            return NULL;
        }
    } else {
        ival = ((PyIntObject *)tmp)->ob_ival;
    }

    newobj = type->tp_alloc(type, 0);
    if (newobj == NULL) {
        Py_DECREF(tmp);
        return NULL;
    }
    ((PyIntObject *)newobj)->ob_ival = ival;
    Py_DECREF(tmp);
    return newobj;
}
"""
msg245513 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2015-06-19 19:12
(1) Just look at the examples of other builtin types with a similar structure but no free_list.

(2) I guess the intention is for classes that subclass int to also override tp_alloc.

Note that much of this code is written pretty much assuming that subclasses are created using class statements (in a regular Python module, not Cython) which take care of all these details. That doesn't mean Cython is wrong to try this, but it does mean there isn't a lot of documentation, and it also means I don't think the thing you reported qualifies as a bug in CPython.
msg281916 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-11-28 21:15
It seems to me, that all builtin and extension types set tp_free to PyObject_Del, PyObject_GC_Del, or 0. The int class is the only exception.

int_free() was introduced in 200559fcc664:

> Make sure that tp_free frees the int the same way as tp_dealloc would.
> This fixes the problem that Barry reported on python-dev:
>    >>> 23000 .__class__ = bool
> crashes in the deallocator.  This was because int inherited tp_free
> from object, which uses the default allocator.

The above example no longer works:

>>> 23000 .__class__ = bool
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: __class__ assignment: only for heap types

I think int_free should be removed and tp_free should be reverted to 0 (as in float type).
msg281917 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2016-11-28 21:34
OK, SGTM.
msg282029 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2016-11-29 18:49
New changeset fd0842f34602 by Serhiy Storchaka in branch '2.7':
Issue #24469: Fixed memory leak caused by int subclasses without overridden
https://hg.python.org/cpython/rev/fd0842f34602
History
Date User Action Args
2022-04-11 14:58:18adminsetgithub: 68657
2016-11-29 18:50:20serhiy.storchakasetstatus: open -> closed
resolution: fixed
stage: patch review -> resolved
2016-11-29 18:49:37python-devsetnosy: + python-dev
messages: + msg282029
2016-11-28 21:34:08gvanrossumsetmessages: + msg281917
2016-11-28 21:15:12serhiy.storchakasetfiles: + int-tp_free-2.7.patch
messages: + msg281916

assignee: serhiy.storchaka
keywords: + patch
stage: patch review
2015-06-19 19:12:52gvanrossumsetmessages: + msg245513
2015-06-19 16:56:33scodersetmessages: + msg245508
2015-06-19 14:16:37gvanrossumsetmessages: + msg245505
2015-06-19 13:42:53pitrousetnosy: - pitrou
2015-06-19 13:42:49pitrousetnosy: + gvanrossum, barry
messages: + msg245504
2015-06-19 13:38:26pitrousetmessages: + msg245503
2015-06-19 09:42:18scodersetnosy: + pitrou, serhiy.storchaka
2015-06-19 06:31:18scodercreate