
Hunt memory allocations in addition to reference leaks #57599

Closed
pitrou opened this issue Nov 12, 2011 · 31 comments
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage tests Tests in the Lib/test dir

Comments

@pitrou
Member

pitrou commented Nov 12, 2011

BPO 13390
Nosy @tim-one, @jcea, @ncoghlan, @pitrou, @vstinner, @benjaminp, @Trundle, @skrah, @davidmalcolm, @meadori
Files
  • debugblocks.patch
  • debugblocks2.patch
  • debugblocks3.patch
  • debugblocks5.patch
  • debugblocks6.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = <Date 2012-12-09.13:30:47.585>
    created_at = <Date 2011-11-12.23:24:29.012>
    labels = ['interpreter-core', 'tests', 'performance']
    title = 'Hunt memory allocations in addition to reference leaks'
    updated_at = <Date 2012-12-17.22:07:47.372>
    user = 'https://github.com/pitrou'

    bugs.python.org fields:

    activity = <Date 2012-12-17.22:07:47.372>
    actor = 'python-dev'
    assignee = 'none'
    closed = True
    closed_date = <Date 2012-12-09.13:30:47.585>
    closer = 'pitrou'
    components = ['Interpreter Core', 'Tests']
    creation = <Date 2011-11-12.23:24:29.012>
    creator = 'pitrou'
    dependencies = []
    files = ['23671', '23680', '23681', '28216', '28227']
    hgrepos = []
    issue_num = 13390
    keywords = ['patch']
    message_count = 31.0
    messages = ['147531', '147539', '147555', '147561', '147656', '147661', '147672', '148731', '148734', '148750', '148769', '148776', '148944', '148945', '149308', '149309', '149311', '177005', '177053', '177054', '177055', '177152', '177214', '177215', '177628', '177639', '177640', '177642', '177646', '177647', '177672']
    nosy_count = 13.0
    nosy_names = ['tim.peters', 'jcea', 'ncoghlan', 'pitrou', 'vstinner', 'techtonik', 'benjamin.peterson', 'Trundle', 'skrah', 'dmalcolm', 'meador.inge', 'neologix', 'python-dev']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'resource usage'
    url = 'https://bugs.python.org/issue13390'
    versions = ['Python 3.4']

    @pitrou
    Member Author

    pitrou commented Nov 12, 2011

    This patch adds counting of the number of allocated memory blocks (through the PyObject_Malloc API). Together with -R, it can help chase those memory leaks which aren't reference leaks (see c6dafa2e2594).

    The sys.getallocedblocks() function is also available in non-debug mode. This is meant to help 3rd party extension writers, who rarely have access to debug builds.

    To avoid too many false positives, bpo-13389 is a prerequisite (at least for the "test -R" part of the patch). Even after it, there are still a couple "test -R" failures; we'd have to investigate them.

    @pitrou pitrou added interpreter-core (Objects, Python, Grammar, and Parser dirs) tests Tests in the Lib/test dir performance Performance or resource usage labels Nov 12, 2011
    @ncoghlan
    Contributor

    I added some review comments to the patch, but I'm not sure how usable this is going to be in practice. Reference counts generally stay fairly stable while using the interactive interpreter, but the new block accounting jumps around all over the place due to the internal free lists (which *don't* show up in 'gettotalrefcount', but *do* count in the new block accounting). The following interpreter session has both this patch and the bpo-13389 patch applied:

    >>> a = sys.getallocedblocks()
    [76652 refs, 21773 blocks]
    >>> a
    21779
    [76652 refs, 21774 blocks]
    >>> x = [None]*10000
    [76652 refs, 21776 blocks]
    >>> del x
    [66650 refs, 21775 blocks]
    >>> gc.collect(); gc.collect(); gc.collect()
    0
    0
    0
    [66650 refs, 21756 blocks]
    >>> b = sys.getallocedblocks()
    [66652 refs, 21772 blocks]
    >>> b - a
    -2
    [66652 refs, 21772 blocks]

    So, generally +1 on the idea, but I think we should hide at least the initial implementation behind PY_REF_DEBUG until we're sure we've worked the kinks out of it.

    @pitrou
    Member Author

    pitrou commented Nov 13, 2011

    Thanks for the comments, here is a new patch addressing them.
    I've kept the C API available in all builds (since it's private), but sys.getallocatedblocks() is only available in debug builds.

    As for the memory leak run results, I think we may have to devise a heuristic where results like [1,0,1] (or similar variation of 0s, 1s, and -1s) are considered a success, but others like [1,1,1] are a failure.
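One plausible shape for that heuristic (a hypothetical sketch, not the code that eventually landed in regrtest): only flag a leak when every repetition shows growth, so occasional 0s or -1s from caches and free lists are forgiven:

```python
def blocks_look_leaky(deltas):
    """Hypothetical heuristic: a genuine leak should show up in *every*
    repetition, while free-list jitter produces occasional zeros or
    negative deltas."""
    return bool(deltas) and all(d > 0 for d in deltas)

print(blocks_look_leaky([1, 0, 1]))  # jitter -> success (prints False)
print(blocks_look_leaky([1, 1, 1]))  # consistent growth -> failure (prints True)
```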

    @pitrou
    Member Author

    pitrou commented Nov 13, 2011

    And the latest patch (debugblocks3.patch) adds said heuristic.

    @neologix
    Mannequin

    neologix mannequin commented Nov 15, 2011

    Together with -R, it can help chase those memory leaks which aren't
    reference leaks (see c6dafa2e2594).

    Valgrind does a much better job at this: it will also show you where the leaked blocks were allocated.
    OTOH, Valgrind is Linux-only and slow, but since I haven't used the '-R' option much, I don't know how usable this will be in practice (detecting memory leaks is one thing, identifying them is even better :-).

    @ncoghlan
    Contributor

    This will likely be a decent "you have a problem" indicator, but you may still need tools like Valgrind to actually track down the *cause* of that problem.

    @pitrou
    Member Author

    pitrou commented Nov 15, 2011

    Valgrind does a much better job at this: it will also show you where
    the leaked blocks were allocated.
    OTOH, Valgrind is Linux-only and slow, but since I haven't used the
    '-R' option much, I don't know how usable this will be in practice
    (detecting memory leaks is one thing, identifying them is even
    better :-).

    Yes, Valgrind is much more exhaustive and precise, but you have to build
    Python --with-valgrind, to find the correct options to pass to Valgrind,
    and also to parse the output. For example, OpenSSL seems to cause lots
    of errors such as:

    ==20599== Conditional jump or move depends on uninitialised value(s)
    ==20599== at 0xA2BB0B3: BN_div (in /usr/lib64/libcrypto.so.1.0.0)
    ==20599== by 0xA2C125E: BN_nnmod (in /usr/lib64/libcrypto.so.1.0.0)
    ==20599== by 0xA2C15CD: BN_mod_mul (in /usr/lib64/libcrypto.so.1.0.0)
    ==20599== by 0xA2C37F0: BN_BLINDING_convert_ex
    (in /usr/lib64/libcrypto.so.1.0.0)
    ==20599== by 0xA2E15D6: ??? (in /usr/lib64/libcrypto.so.1.0.0)
    ==20599== by 0x9FE6363: ssl3_get_client_key_exchange
    (in /usr/lib64/libssl.so.1.0.0)
    ==20599== by 0x9FE83A7: ssl3_accept (in /usr/lib64/libssl.so.1.0.0)
    ==20599== by 0xF804190: PySSL_SSLdo_handshake (_ssl.c:390)
    ==20599== by 0x47C0E9: PyEval_EvalFrameEx (ceval.c:3985)
    ==20599== by 0x47CAC3: PyEval_EvalCodeEx (ceval.c:3376)
    ==20599== by 0x47B3F1: PyEval_EvalFrameEx (ceval.c:4099)
    ==20599== by 0x47C1DB: PyEval_EvalFrameEx (ceval.c:4089)

    Right now this patch makes it possible to enrich the daily refleak runs
    (those that send an e-mail to python-checkins). If someone finds a way to
    sanitize Valgrind output, a daily Valgrind run would be cool as well.

    @neologix
    Mannequin

    neologix mannequin commented Dec 1, 2011

    If someone finds a way to sanitize Valgrind output,

    Did you use the valgrind suppression file (Misc/valgrind-python.supp)?
    If yes, then one could simply use --gen-suppressions=yes to add those spurious warnings to the suppression file.

    a daily Valgrind run would be cool as well.

    Daily, or rather weekly, since running under Valgrind is so much slower ;-)
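The workflow being discussed might look like this (standard Valgrind options; the paths assume a CPython source checkout):

```shell
# Run a test under Valgrind with CPython's suppression file;
# --gen-suppressions=yes offers to emit a suppression for each new
# spurious warning, which can then be appended to the .supp file.
valgrind --leak-check=full \
         --suppressions=Misc/valgrind-python.supp \
         --gen-suppressions=yes \
         ./python -m test test_ssl
```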

    @pitrou
    Member Author

    pitrou commented Dec 1, 2011

    > If someone finds a way to sanitize Valgrind output,

    Did you use the valgrind suppression file (Misc/valgrind-python.supp)?

    Ah, I hadn't. But using it doesn't seem to make much of a difference.
    Also, the suppressions file seems quite inflexible (it has hardcoded
    library version numbers for third-party libs).

    @vstinner
    Member

    vstinner commented Dec 2, 2011

    The overhead on PyObject_Malloc() is just an increment of an integer, so it is very low (if not zero).

    The feature is interesting, but I'm not convinced that a very simple counter is enough to track memory leaks. It may help the CPython test suite, but what about real-world applications?

    I think we should hide at least the initial implementation
    behind PY_REF_DEBUG until we're sure we've worked the kinks
    out of it.

    Programs do not always behave exactly the same in debug and release modes. Sometimes, bugs disappear in debug mode :-(

    If the feature is written to track memory leaks in the CPython test suite, it's ok to only expose the function in debug mode.

    Right now this patch will allow to enrich the daily refleak runs

    Have you already found real leaks using your hack^Wpatch? (was c6dafa2e2594 found by your tool?)

    @pitrou
    Member Author

    pitrou commented Dec 2, 2011

    The feature is interesting, but I'm not convinced that a very simple
    counter is enough to track memory leaks. It may help the CPython test
    suite, but what about real-world applications?

    Good question. A simple counter is the only thing we can enable by default, though. Anything else would require recompiling Python, which is probably a barrier for most users.

    Have you already found real leaks using your hack^Wpatch? (was
    c6dafa2e2594 found by your tool?)

    Yes, c6dafa2e2594 was found with this patch. It's the only one, though (there's also a leak in test_ctypes but I don't want to investigate :-)).

    @meadori
    Member

    meadori commented Dec 2, 2011

    On Fri, Dec 2, 2011 at 12:12 PM, Antoine Pitrou <report@bugs.python.org> wrote:

    yes, c6dafa2e2594 was found with this patch. It's the only one, though (there's also a leak in test_ctypes but I don't want to investigate :-)).

    I'll take a look at the ctypes leak.

    @meadori
    Member

    meadori commented Dec 6, 2011

    I looked at the 'ctypes' "leak" a bit. I haven't determined exactly what
    is going on, but the leak has something to do with a change in the patch that
    runs 'dash_R_cleanup' twice instead of once. The new behavior can be reduced
    to something like:

       import sys, ctypes, gc
    
       ctypes._reset_cache()
       gc.collect()
    
       for i in range(0, 5):
           ctypes._reset_cache()
           gc.collect()
           print("%d: start refs = %s" % (i, sys.gettotalrefcount()))
           proto = ctypes.CFUNCTYPE(ctypes.POINTER(ctypes.c_char))
           ctypes._reset_cache()
           gc.collect()
           print("%d: after refs = %s" % (i, sys.gettotalrefcount()))

    which prints:

    0: start refs = 71395
    0: after refs = 71462
    1: start refs = 71463
    1: after refs = 71493
    2: start refs = 71465
    2: after refs = 71494
    3: start refs = 71465
    3: after refs = 71494
    4: start refs = 71465
    4: after refs = 71494

    Note that the start/after refs converge on a difference of 29 references.

    The existing version 'regrtest.py' does something like:

       import sys, ctypes, gc
    
       ctypes._reset_cache()
       gc.collect()
    
       for i in range(0, 5):
           print("%d: start refs = %s" % (i, sys.gettotalrefcount()))
           proto = ctypes.CFUNCTYPE(ctypes.POINTER(ctypes.c_char))
           ctypes._reset_cache()
           gc.collect()
           print("%d: after refs = %s" % (i, sys.gettotalrefcount()))

    which prints:

    0: start refs = 71391
    0: after refs = 71458
    1: start refs = 71458
    1: after refs = 71489
    2: start refs = 71489
    2: after refs = 71490
    3: start refs = 71490
    3: after refs = 71490
    4: start refs = 71490
    4: after refs = 71490

    This one converges on a difference of zero.

    So, I am not sure whether there really is a leak, whether this is just
    a very sensitive area of 'regrtest.py', or something else I am missing.

    @pitrou
    Member Author

    pitrou commented Dec 6, 2011

    So, I am not sure whether there really is a leak, whether this is just
    a very sensitive area of 'regrtest.py', or something else I am missing.

    Well, test_ctypes seems to be the only test exhibiting that behaviour.
    And since your script reproduces it, it's probably not regrtest.py in
    itself.

    @techtonik
    Mannequin

    techtonik mannequin commented Dec 12, 2011

    How different is the performance cost of this solution compared to inserting a DTrace probe for the same purpose?

    @vstinner
    Member

    How different is the performance cost of this solution compared
    to inserting a DTrace probe for the same purpose?

    DTrace is only available on some platforms (Solaris and maybe FreeBSD?).

    @techtonik
    Mannequin

    techtonik mannequin commented Dec 12, 2011

    On Mon, Dec 12, 2011 at 6:11 PM, STINNER Victor <report@bugs.python.org> wrote:

    > How different is the performance cost of this solution compared
    > to inserting a DTrace probe for the same purpose?

    DTrace is only available on some platforms (Solaris and maybe FreeBSD?).

    Solaris, Mac OS X, FreeBSD, NetBSD, and Oracle Linux, according to
    http://en.wikipedia.org/wiki/DTrace#Supported_platforms, but that doesn't
    relate to the question.

    @pitrou
    Member Author

    pitrou commented Dec 5, 2012

    Here is an updated patch.
    test_ctypes still leaks memory blocks with it:

    $ ./python -m test -R 3:8 test_ctypes
    [1/1] test_ctypes
    beginning 11 repetitions
    12345678901
    ...........
    test_ctypes leaked [2, 2, 1, 1, 1, 1, 1, 1] memory blocks, sum=10

    @pitrou
    Member Author

    pitrou commented Dec 6, 2012

    As it turns out, ctypes does leak memory: see bpo-16628.

    @pitrou
    Member Author

    pitrou commented Dec 6, 2012

    Updated patch with doc. If no one objects, I will commit soon.

    @davidmalcolm
    Member

    Minor bikeshedding/spelling nit: should
    "_Py_AllocedBlocks"
    be changed to
    "_Py_AllocatedBlocks"

    (and s/_Py_GetAllocedBlocks/_Py_GetAllocatedBlocks/)?

    @pitrou
    Member Author

    pitrou commented Dec 8, 2012

    Minor bikeshedding/spelling nit: should
    "_Py_AllocedBlocks"
    be changed to
    "_Py_AllocatedBlocks"
    (and s/_Py_GetAllocedBlocks/_Py_GetAllocatedBlocks/)?

    Right indeed. I'll do the change.

    @python-dev
    Mannequin

    python-dev mannequin commented Dec 9, 2012

    New changeset c40f4c19d20b by Antoine Pitrou in branch 'default':
    Issue bpo-13390: New function :func:`sys.getallocatedblocks()` returns the number of memory blocks currently allocated.
    http://hg.python.org/cpython/rev/c40f4c19d20b

    @pitrou
    Member Author

    pitrou commented Dec 9, 2012

    Committed and pushed!

    @pitrou pitrou closed this as completed Dec 9, 2012
    @skrah
    Mannequin

    skrah mannequin commented Dec 16, 2012

    ``./configure --without-pymalloc'' fails here:

    gcc -pthread -Xlinker -export-dynamic -o python Modules/python.o libpython3.4.a -lpthread -ldl -lutil -lm
    libpython3.4.a(sysmodule.o): In function `sys_getallocatedblocks':
    /home/stefan/hg/cpython/./Python/sysmodule.c:900: undefined reference to `_Py_GetAllocatedBlocks'
    collect2: error: ld returned 1 exit status
    make: *** [python] Error 1

    @pitrou
    Member Author

    pitrou commented Dec 17, 2012

    ``./configure --without-pymalloc'' fails here:

    gcc -pthread -Xlinker -export-dynamic -o python Modules/python.o libpython3.4.a -lpthread -ldl -lutil -lm
    libpython3.4.a(sysmodule.o): In function `sys_getallocatedblocks':
    /home/stefan/hg/cpython/./Python/sysmodule.c:900: undefined reference to `_Py_GetAllocatedBlocks'
    collect2: error: ld returned 1 exit status
    make: *** [python] Error 1

    Hmm, interesting. When built --without-pymalloc, we could make
    sys.getallocatedblocks() always return 0, or we could not define it at all
    (which would make life a bit harder for regrtest). What do you think?

    @vstinner
    Member

    sys.gettotalrefcount() is only defined when Python is compiled in
    debug mode. sys.getallocatedblocks() should only be available when the
    right debug option is present. This function is specific to CPython
    anyway, Python modules should not rely on it (too much) ;-)

    @techtonik
    Mannequin

    techtonik mannequin commented Dec 17, 2012

    Memory control over the stuff that Python creates is a practical feature that compensates for the OS's inability to track memory usage. If all Python scripts could measure their memory usage, we could see more memory-efficient and adaptive programs around.

    @pitrou
    Member Author

    pitrou commented Dec 17, 2012

    sys.gettotalrefcount() is only defined when Python is compiled in
    debug mode. sys.getallocatedblocks() should only be available when
    the
    right debug option is present. This function is specific to CPython
    anyway, Python module should not rely on this (too much) ;-)

    On the contrary, the aim is precisely to provide a memory statistics
    function which is available for everyone, not only CPython developers.
    It is simply not practical right now for a C extension developer to
    check for memory leaks.
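To illustrate what this enables on a regular build (sys.getallocatedblocks() shipped in Python 3.4; the helper, its warm-up counts, and the suspect() stand-in are all hypothetical):

```python
import sys

def block_deltas(func, warmup=3, runs=5):
    """Record the change in allocated blocks across repeated calls.

    The warm-up iterations let interpreter caches and free lists reach
    a steady state before measuring; persistent positive deltas after
    that suggest a real leak in func.
    """
    for _ in range(warmup):
        func()
    deltas = []
    for _ in range(runs):
        before = sys.getallocatedblocks()
        func()
        deltas.append(sys.getallocatedblocks() - before)
    return deltas

def suspect():
    # Stand-in for a call into the C extension under test.
    data = [object() for _ in range(1000)]
    del data

print(block_deltas(suspect))  # mostly zeros if nothing leaks
```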

    @skrah
    Mannequin

    skrah mannequin commented Dec 17, 2012

    Antoine Pitrou <report@bugs.python.org> wrote:

    Hmm, interesting. When built --without-pymalloc, we could make
    sys.getallocatedblocks() always return 0, or we could not define it at all
    (which would make life a bit harder for regrtest). What do you think?

    Given the name getallocatedblocks(), it would seem reasonable to return
    0 in this case and document the fact. I don't think many people use
    --without-pymalloc anyhow.

    I'm using the option only for Valgrind testing, that's how I found
    the build error.

    @python-dev
    Mannequin

    python-dev mannequin commented Dec 17, 2012

    New changeset a85673b55177 by Antoine Pitrou in branch 'default':
    Following issue bpo-13390, fix compilation --without-pymalloc, and make sys.getallocatedblocks() return 0 in that situation.
    http://hg.python.org/cpython/rev/a85673b55177

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022