classification
Title: Possible refleak in _asynciomodule.c future_add_done_callback()
Type: resource usage Stage: resolved
Components: C API, Extension Modules Versions: Python 3.9, Python 3.8, Python 3.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: aeros Nosy List: ZackerySpytz, aeros, asvetlov, miss-islington, pitrou, yselivanov
Priority: high Keywords: patch

Created on 2020-03-25 09:36 by aeros, last changed 2020-05-30 08:25 by aeros. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 19748 merged ZackerySpytz, 2020-04-28 05:37
Messages (6)
msg364985 - (view) Author: Kyle Stanley (aeros) * (Python committer) Date: 2020-03-25 09:36
When using the `test-with-buildbots` label in GH-19149 (which involved no C changes), a failure occurred in test_asyncio for several of the refleak buildbots. Here's the output of a few:

AMD64 Fedora Stable Refleaks PR:

test_asyncio leaked [3, 3, 27] references, sum=33
test_asyncio leaked [3, 3, 28] memory blocks, sum=34
2 tests failed again:
    test__xxsubinterpreters test_asyncio
== Tests result: FAILURE then FAILURE ==

AMD64 RHEL8 Refleaks PR:

test_asyncio leaked [3, 3, 3] references, sum=9
test_asyncio leaked [3, 3, 3] memory blocks, sum=9
2 tests failed again:
    test__xxsubinterpreters test_asyncio
== Tests result: FAILURE then FAILURE ==

RHEL7 Refleaks PR:

test_asyncio leaked [3, 3, 3] references, sum=9
test_asyncio leaked [3, 3, 3] memory blocks, sum=9
2 tests failed again:
    test__xxsubinterpreters test_asyncio
== Tests result: FAILURE then FAILURE ==


I'm unable to replicate it locally, but I think I may have located a subtle, uncommon refleak in `future_add_done_callback()`, within _asynciomodule.c. Specifically:

```
            PyObject *tup = PyTuple_New(2);
            if (tup == NULL) {
                return NULL;
            }
            Py_INCREF(arg);
            PyTuple_SET_ITEM(tup, 0, arg);
            Py_INCREF(ctx);
            PyTuple_SET_ITEM(tup, 1, (PyObject *)ctx);

            if (fut->fut_callbacks != NULL) {
                int err = PyList_Append(fut->fut_callbacks, tup);
                if (err) {
                    Py_DECREF(tup);
                    return NULL;
                }
                Py_DECREF(tup);
            }
            else {
                fut->fut_callbacks = PyList_New(1);
                if (fut->fut_callbacks == NULL) {
                    // Missing ``Py_DECREF(tup);`` ?
                    return NULL;
                }
```

(The above code is located at: https://github.com/python/cpython/blob/7668a8bc93c2bd573716d1bea0f52ea520502b28/Modules/_asynciomodule.c#L664-L685)

In the above conditional for "if (fut->fut_callbacks == NULL)", it appears that `tup` is pointing to a non-NULL new reference at this point, and thus should be decref'd prior to returning NULL. Otherwise, it seems like it could be leaked.

But, I would appreciate it if someone could double check this (the C-API isn't an area I'm experienced); particularly since this code has been in place for a decent while (since 3.7). I _suspect_ it's gone undetected and only failed intermittently because this specific ``return NULL`` path is rather uncommon.

I'd be glad to open a PR to address the issue, assuming I'm not missing something with the above refleak. Otherwise, feel free to correct me.
msg365029 - (view) Author: Kyle Stanley (aeros) * (Python committer) Date: 2020-03-25 23:06
> When using the `test-with-buildbots` label in GH-19149 (which involved no C changes), a failure occurred in test_asyncio for several of the refleak buildbots.

To clarify, the refleak failures occurred when i applied the label to the following commit: https://github.com/python/cpython/pull/19149/commits/21a12e9340644c81e758ddf20fc9034f265d1930
msg367507 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2020-04-28 07:56
You're right that a Py_DECREF is missing there.  However, it seems unlikely that this is triggering the test failure, because `PyList_New(1)` will practically never fail in normal conditions (and even making it deliberately fail would be quite difficult).
msg367511 - (view) Author: Kyle Stanley (aeros) * (Python committer) Date: 2020-04-28 09:01
Antoine Pitrou wrote:
> You're right that a Py_DECREF is missing there.  However, it seems unlikely that this is triggering the test failure, because `PyList_New(1)` will practically never fail in normal conditions (and even making it deliberately fail would be quite difficult).

Thanks for clarifying; spotting refleaks is something that I've only recently started to get the hang of, so I wanted to double check first. :-)

I should have updated this issue, but I resolved the main test_asyncio failure that I saw from GH-19149 in GH-19282. Either way though, I think this could still be fixed (even if it would rarely be an issue in most situations).
msg370365 - (view) Author: miss-islington (miss-islington) Date: 2020-05-30 08:22
New changeset 7b78e7f9fd77bb3280ee39fb74b86772a7d46a70 by Zackery Spytz in branch 'master':
bpo-40061: Fix a possible refleak in _asynciomodule.c (GH-19748)
https://github.com/python/cpython/commit/7b78e7f9fd77bb3280ee39fb74b86772a7d46a70
msg370366 - (view) Author: Kyle Stanley (aeros) * (Python committer) Date: 2020-05-30 08:25
Thanks for the PR, Zackery.
History
Date User Action Args
2020-05-30 08:25:36aerossetstatus: open -> closed
resolution: fixed
messages: + msg370366

stage: patch review -> resolved
2020-05-30 08:22:06miss-islingtonsetnosy: + miss-islington
messages: + msg370365
2020-04-28 09:01:16aerossetmessages: + msg367511
2020-04-28 07:56:16pitrousetnosy: + pitrou
messages: + msg367507
2020-04-28 05:37:02ZackerySpytzsetkeywords: + patch
nosy: + ZackerySpytz

pull_requests: + pull_request19069
stage: patch review
2020-03-25 23:06:06aerossetmessages: + msg365029
2020-03-25 09:36:53aeroscreate