When using the `test-with-buildbots` label in GH-19149 (which involved no C changes), test_asyncio failed on several of the refleak buildbots. Here's the output from a few of them:
AMD64 Fedora Stable Refleaks PR:
test_asyncio leaked [3, 3, 27] references, sum=33
test_asyncio leaked [3, 3, 28] memory blocks, sum=34
2 tests failed again:
test__xxsubinterpreters test_asyncio
== Tests result: FAILURE then FAILURE ==
AMD64 RHEL8 Refleaks PR:
test_asyncio leaked [3, 3, 3] references, sum=9
test_asyncio leaked [3, 3, 3] memory blocks, sum=9
2 tests failed again:
test__xxsubinterpreters test_asyncio
== Tests result: FAILURE then FAILURE ==
RHEL7 Refleaks PR:
test_asyncio leaked [3, 3, 3] references, sum=9
test_asyncio leaked [3, 3, 3] memory blocks, sum=9
2 tests failed again:
test__xxsubinterpreters test_asyncio
== Tests result: FAILURE then FAILURE ==
I'm unable to replicate it locally, but I think I may have located a subtle, uncommon refleak in `future_add_done_callback()`, within _asynciomodule.c. Specifically:
```
    PyObject *tup = PyTuple_New(2);
    if (tup == NULL) {
        return NULL;
    }
    Py_INCREF(arg);
    PyTuple_SET_ITEM(tup, 0, arg);
    Py_INCREF(ctx);
    PyTuple_SET_ITEM(tup, 1, (PyObject *)ctx);
    if (fut->fut_callbacks != NULL) {
        int err = PyList_Append(fut->fut_callbacks, tup);
        if (err) {
            Py_DECREF(tup);
            return NULL;
        }
        Py_DECREF(tup);
    }
    else {
        fut->fut_callbacks = PyList_New(1);
        if (fut->fut_callbacks == NULL) {
            // Missing ``Py_DECREF(tup);`` ?
            return NULL;
        }
```
(The above code is located at: https://github.com/python/cpython/blob/7668a8bc93c2bd573716d1bea0f52ea520502b28/Modules/_asynciomodule.c#L664-L685)
In the `if (fut->fut_callbacks == NULL)` branch above, it appears that `tup` still holds a new, non-NULL reference that nothing else owns yet, so it should be decref'd before returning NULL. Otherwise, it seems like it would be leaked.
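To make the ownership argument concrete, here is a toy model in plain C. It is only a sketch: `ToyObj`, `toy_new()`, `toy_decref()`, `register_callback()` and `live_objects` are hypothetical names invented for illustration, not part of the C-API, and the refcounting is deliberately simplified. `release_on_error` switches between what the current code appears to do (0) and what I'm proposing (1).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for a refcounted object. NOT CPython's PyObject. */
typedef struct ToyObj {
    int refcnt;
    struct ToyObj *item;    /* one owned reference, or NULL */
} ToyObj;

static int live_objects = 0;    /* objects allocated and not yet freed */

/* Returns a new reference (refcnt == 1), or NULL if malloc fails. */
static ToyObj *toy_new(void)
{
    ToyObj *o = malloc(sizeof *o);
    if (o == NULL) {
        return NULL;
    }
    o->refcnt = 1;
    o->item = NULL;
    live_objects++;
    return o;
}

/* Drops one reference; frees the object (and its item) at zero. */
static void toy_decref(ToyObj *o)
{
    assert(o->refcnt > 0);
    if (--o->refcnt == 0) {
        if (o->item != NULL) {
            toy_decref(o->item);
        }
        live_objects--;
        free(o);
    }
}

/* Models only the else-branch of the snippet above.
 * list_alloc_fails != 0 simulates PyList_New(1) returning NULL.
 * release_on_error == 0 mimics the current error path,
 * release_on_error != 0 mimics the path with the extra decref. */
static ToyObj *register_callback(int list_alloc_fails, int release_on_error)
{
    ToyObj *tup = toy_new();                 /* models PyTuple_New(2) */
    if (tup == NULL) {
        return NULL;
    }
    ToyObj *list = list_alloc_fails ? NULL : toy_new();
    if (list == NULL) {
        if (release_on_error) {
            toy_decref(tup);                 /* the decref in question */
        }
        return NULL;
    }
    list->item = tup;    /* the list takes over the reference to tup */
    return list;
}
```

In this model, `register_callback(1, 0)` returns NULL and leaves `live_objects` at 1 (the tuple is never freed), while `register_callback(1, 1)` returns NULL and leaves it at 0. It says nothing about whether this path is what the buildbots actually hit.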
But I would appreciate it if someone could double-check this (the C-API isn't an area I'm experienced in), particularly since this code has been in place for a decent while (since 3.7). I _suspect_ it's gone undetected and only failed intermittently because this specific `return NULL` path is rather uncommon: it is only reached when `PyList_New(1)` fails.
I'd be glad to open a PR to address the issue, assuming I'm not missing something with the above refleak. Otherwise, feel free to correct me.