This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Title: Py_BuildValue may leak 'N' arguments on PyTuple_New failure
Type: crash Stage: resolved
Components: Interpreter Core Versions: Python 3.6, Python 3.5, Python 2.7
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: serhiy.storchaka Nosy List: josh.r, martin.panter, mwh, python-dev, rhettinger, serhiy.storchaka, shredwheat, squidevil
Priority: normal Keywords: patch

Created on 2016-01-20 18:26 by squidevil, last changed 2022-04-11 14:58 by admin. This issue is now closed.

File name Uploaded Description Edit
pybuildvalue_leak.patch serhiy.storchaka, 2016-01-29 11:08 review
pybuildvalue_leak2.patch serhiy.storchaka, 2016-01-30 11:23 review
pybuildvalue_leak3.patch serhiy.storchaka, 2016-02-08 13:05 review
pybuildvalue_leak4.patch serhiy.storchaka, 2016-05-15 12:52 review
Messages (17)
msg258708 - (view) Author: (squidevil) Date: 2016-01-20 18:26
Expected behavior:
  Calling Py_BuildValue with a 'N' argument should take ownership of the N object, and on failure (returns NULL), call Py_DECREF() on any N argument.  The documentation explicitly says that this is the intended usage:
    "N": Same as "O", except it doesn't increment the reference count on the object. Useful when the object is created by a call to an object constructor in the argument list.

Actual behavior:
  N objects appear to be abandoned/leaked in some cases.

Example: PyBuildValue("iN", 0, obj);

* calls _Py_BuildValue_SizeT via macro
* calls va_build_value (in modsupport.c)
* calls do_mktuple [0]
* [0] first calls v = PyTuple_New(n=2). If this fails, it returns NULL, leaking obj.
* if [0] creates the tuple v, then it goes on to populate the values in the tuple.
* [0] calls do_mkvalue() to create the "i=0" object.  If this fails, obj is never Py_DECREF()'ed.

Many other leaks are possible, as long as at least one allocation occurs prior to the processing of the N arguments.
msg259204 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-01-29 11:08
Current code already decref obj if creating previous argument is failed. But a leak if PyTuple_New() fails is possible.

Proposed patch fixes this leak and adds a test for the case when creating previous argument is failed. Testing the failure of PyTuple_New() is problematic.
msg259222 - (view) Author: (squidevil) Date: 2016-01-29 21:40
It looks like this patch is against the "default" cpython (3.x) branch.  The 2.7 branch is missing the if(itemfailed) code in do_mktuple whose else clause was modified by the patch.

It looks like some of this needs to be backported to 2.7?
msg259224 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-01-29 23:57
Left a review.

Squidevil: You say if do_mkvalue() fails, the N object is not released. But looking at the code, I think it gets stored in the tuple, and then the tuple is released: <>. So the only leak I can see is if the tuple construction fails.

Serhiy’s patch fixes do_mktuple(), but I think the same problem would affect do_mkdict() and do_mklist() as well.
msg259226 - (view) Author: (squidevil) Date: 2016-01-30 00:22
Martin Panter: You're right.  The DECREF on v when itemfailed will decref the N object and prevent the leak.  My original analysis was wrong on that count.

You're right, do_mklist and do_mkdict (in 2.7.11 at least) have similar problems, bailing after list or dict creation failure, without continuing to process the rest of the items.
msg259242 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-01-30 07:39
Rather rare reference leak is not the worst bug here.

Following example

    const char *s = ...;
    PyObject *b = PyBytes_New(...);
    return PyBuildValue("(Ns)s", b, s, PyBytes_AS_STRING(b));

works if s is correct UTF-8 encoded string. But if it is not correct UTF-8 encoded string, the first "s" is failed and the inner tuple is deallocated together with the borrowed reference to b. The second "s" then reads freed memory.
msg259252 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-01-30 11:23
Previous attempt to fix reference leaks was in issue984722.

Proposed patch uses different approach. It correctly fixes reference leaks after error in processing dict items (like "{()s(())N"), fixes leaksafter memory errors on creating tuple, list or dict, fixes issue20024 for lists and dicts, fixes use-after-free issue described in msg259242.

But the patch has a disadvantage in comparison with current code. If user converter ("O&") accepts Python objects and steal its reference (as "N"), this reference will be leaked. This is a price for avoiding potential use-after-free issues. Current stdlib code doesn't use converters that steal Python object reference. May be we should document this limitation.
msg259254 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-01-30 11:36
I was going to suggest just documenting your use-after-free case as a limitation of the N format. I doubt there is much demand to give away a borrowed reference with N _and_ rely on that borrowed reference staying alive for other parameters. It seems a lot of churn just to avoid a theoretical problem.
msg259841 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-02-08 13:05
Following patch is written in the style of current code for tuples.
msg265504 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-05-14 04:27
Patch 3 looks good to me
msg265515 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-05-14 10:51
Thank you Martin.
msg265612 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-05-15 12:52
I'm not happy with pybuildvalue_leak3.patch. For failed keys it saves values with the same key (None). This means that old value can be deallocated before the end of building all dict.

Following patch collects all values after error in a tuple. This not only fixes the issue with building dict, but makes the code for building tuple, list and dict cleaner. It no longer contains the code for processing after error, it is moved in separate function.
msg265658 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-05-16 02:06
Is it really a problem if the old value is deallocated? It sounds like a similar case to <>, and would only be a problem if you passed a borrowed reference, and relied on the reference staying alive for another argument.

I do like the separate do_ignore() function in patch 4, but I don’t think it is worthwhile allocating a temporary tuple to save values in. The allocation can fail. Also, I understand do_mktuple() etc are recursive, so nested borrowed references would still be released before the outer do_ignore() function releases the outer tuple.
msg265679 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-05-16 07:58
Currently do_mktuple(), do_mklist() and do_mkdict() save references in a collection. I think that there are reasons to do this, and third-party code can be broken if just deallocate references. pybuildvalue_leak4.patch just implements this strategy more reliably. It guaranties (unless tuple allocation fails) that borrowed references in the same container or upper levels are kept until all content of the container is proceeded.

The more perfect solution is to allocate one list on error, save all references to it and deallocate it at the end of PyBuildValue(). But this leads to more complicated code.
msg265684 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-05-16 08:19
Yeah okay, I agree that the code was already saving the references in a container, so it makes sense to keep doing that.
msg265954 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2016-05-20 19:32
New changeset 4e605f809551 by Serhiy Storchaka in branch '3.5':
Issue #26168: Fixed possible refleaks in failing Py_BuildValue() with the "N"

New changeset 0285173d81b4 by Serhiy Storchaka in branch '2.7':
Issue #26168: Fixed possible refleaks in failing Py_BuildValue() with the "N"

New changeset 03b32f854680 by Serhiy Storchaka in branch 'default':
Issue #26168: Fixed possible refleaks in failing Py_BuildValue() with the "N"
msg265955 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-05-20 19:33
Thank you for your review Martin.
Date User Action Args
2022-04-11 14:58:26adminsetgithub: 70356
2016-05-20 19:33:34serhiy.storchakasetstatus: open -> closed
resolution: fixed
messages: + msg265955

stage: commit review -> resolved
2016-05-20 19:32:26python-devsetnosy: + python-dev
messages: + msg265954
2016-05-16 08:19:20martin.pantersetmessages: + msg265684
2016-05-16 07:58:03serhiy.storchakasetmessages: + msg265679
2016-05-16 02:06:24martin.pantersetmessages: + msg265658
2016-05-15 12:52:00serhiy.storchakasetfiles: + pybuildvalue_leak4.patch

messages: + msg265612
2016-05-14 10:51:04serhiy.storchakasetmessages: + msg265515
stage: patch review -> commit review
2016-05-14 04:27:29martin.pantersetmessages: + msg265504
2016-02-08 13:05:29serhiy.storchakasetfiles: + pybuildvalue_leak3.patch

messages: + msg259841
2016-01-30 11:36:59martin.pantersetmessages: + msg259254
2016-01-30 11:23:39serhiy.storchakasetfiles: + pybuildvalue_leak2.patch
nosy: + mwh, rhettinger, shredwheat
messages: + msg259252

2016-01-30 07:39:25serhiy.storchakasettype: resource usage -> crash
messages: + msg259242
2016-01-30 00:22:59squidevilsetmessages: + msg259226
2016-01-29 23:57:13martin.pantersetnosy: + martin.panter
messages: + msg259224
2016-01-29 21:40:30squidevilsetmessages: + msg259222
2016-01-29 11:08:51serhiy.storchakasetfiles: + pybuildvalue_leak.patch
messages: + msg259204

assignee: serhiy.storchaka
keywords: + patch
stage: needs patch -> patch review
2016-01-20 19:33:40josh.rsetnosy: + josh.r
2016-01-20 18:27:52serhiy.storchakasetnosy: + serhiy.storchaka
stage: needs patch

versions: + Python 3.5, Python 3.6
2016-01-20 18:26:25squidevilcreate