classification
Title: test_type_comments leaks references
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.8
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: gvanrossum Nosy List: gvanrossum, lukasz.langa, pablogsal
Priority: normal Keywords: patch, patch, patch

Created on 2019-02-01 11:30 by vstinner, last changed 2019-02-01 23:31 by gvanrossum. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 11728 merged gvanrossum, 2019-02-01 23:03
PR 11728 merged gvanrossum, 2019-02-01 23:03
PR 11728 merged gvanrossum, 2019-02-01 23:03
Messages (13)
msg334671 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-02-01 11:30
https://buildbot.python.org/all/#/builders/1/builds/490

https://buildbot.python.org/all/#/builders/80/builds/499

test_type_comments leaked [37, 37, 37] references, sum=111
test_type_comments leaked [11, 12, 11] memory blocks, sum=34

Example:

vstinner@apu$ ./python -m test -R 3:3 test_type_comments -m test.test_type_comments.TypeCommentTests.test_asyncdef
Run tests sequentially
0:00:00 load avg: 0.56 [1/1] test_type_comments
beginning 6 repetitions
123456
......
test_type_comments leaked [2, 2, 2] references, sum=6
test_type_comments leaked [2, 3, 2] memory blocks, sum=7
test_type_comments failed

== Tests result: FAILURE ==

1 test failed:
    test_type_comments

Total duration: 129 ms
Tests result: FAILURE


I bet that the regression comes from:

commit dcfcd146f8e6fc5c2fc16a4c192a0c5f5ca8c53c
Author: Guido van Rossum <guido@python.org>
Date:   Thu Jan 31 03:40:27 2019 -0800

    bpo-35766: Merge typed_ast back into CPython (GH-11645)


See also bpo-35878.
msg334682 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2019-02-01 14:12
From the duplicate (https://bugs.python.org/issue35881) :

After some hours of debugging, I have identified the (possible) cause of the majority of the refleaks. 

The type_comments created in ast.c with new_type_comment() never reaches 0 reference counts. This **seems** to be because the cleanup for the stmt_ty elements where the type_comments are included never call DECREF on these elements when destroyed unless I am missing something fundamental.
msg334702 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2019-02-01 16:42
I can confirm it, but I don't understand it yet. Somehow the type_comment fields get an extra reference count and are never freed. How are other string fields handled?
msg334708 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2019-02-01 17:22
The extra referenced happen here in Python-ast.c :

value = ast2obj_string(o->type_comment);
    if (!value) goto failed;
    if (_PyObject_SetAttrId(result, &PyId_type_comment, value) == -1)
        goto failed;
    Py_DECREF(value);

ast2obj_string increments it once and the setattr does it again and then there is only one Py_DECREF.
msg334709 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2019-02-01 17:35
OK, let me make a PR, I found another leak for type:ignore.

On Fri, Feb 1, 2019 at 9:22 AM Pablo Galindo Salgado <report@bugs.python.org>
wrote:

>
> Pablo Galindo Salgado <pablogsal@gmail.com> added the comment:
>
> The extra referenced happen here in Python-ast.c :
>
> value = ast2obj_string(o->type_comment);
>     if (!value) goto failed;
>     if (_PyObject_SetAttrId(result, &PyId_type_comment, value) == -1)
>         goto failed;
>     Py_DECREF(value);
>
> ast2obj_string increments it once and the setattr does it again and then
> there is only one Py_DECREF.
>
> ----------
>
> _______________________________________
> Python tracker <report@bugs.python.org>
> <https://bugs.python.org/issue35879>
> _______________________________________
>
msg334713 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2019-02-01 19:36
> ast2obj_string increments it once and the setattr does it again and then there is only one Py_DECREF.

Actually I don't understand how this is the leak. Adding another Py_DECREF(value) causes an immediate crash. I've been trying to follow other paths but haven't found the root yet.
msg334717 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2019-02-01 20:21
Hummmm.... I may be missing something but these is what I am seeing:

BASELINE
--------

❯ ./python.exe -m test test_type_comments -R 10:10
Run tests sequentially
0:00:00 load avg: 2.30 [1/1] test_type_comments
beginning 20 repetitions
12345678901234567890
....................
test_type_comments leaked [37, 37, 37, 37, 37, 37, 37, 37, 37, 37] references, sum=370
test_type_comments leaked [11, 12, 11, 11, 11, 11, 11, 11, 11, 11] memory blocks, sum=111
test_type_comments failed

== Tests result: FAILURE ==

1 test failed:
    test_type_comments

Total duration: 1 sec 28 ms
Tests result: FAILURE


EXTRA_DECREF
------------
❯ git --no-pager diff
diff --git a/Python/Python-ast.c b/Python/Python-ast.c
index 1a56e90bca..cb3036dbba 100644
--- a/Python/Python-ast.c
+++ b/Python/Python-ast.c
@@ -2791,6 +2791,7 @@ ast2obj_stmt(void* _o)
         if (_PyObject_SetAttrId(result, &PyId_type_comment, value) == -1)
             goto failed;
         Py_DECREF(value);
+        Py_DECREF(value);
         break;
     case AsyncFunctionDef_kind:
         result = PyType_GenericNew(AsyncFunctionDef_type, NULL, NULL);

~/github/cpython master ✗
❯ ./python.exe -m test test_type_comments -R 10:10
Run tests sequentially
0:00:00 load avg: 2.12 [1/1] test_type_comments
beginning 20 repetitions
12345678901234567890
....................
test_type_comments leaked [8, 9, 8, 8, 8, 8, 8, 8, 8, 8] memory blocks, sum=81
test_type_comments failed

== Tests result: FAILURE ==

1 test failed:
    test_type_comments

Total duration: 976 ms
Tests result: FAILURE


So my interpretation is that there are still leaks somewhere or, but the extra-decref resolves the reference leaks somehow. Probably the extra DECREF here is not the proper fix, but is as far as I went when I started debugging this morning. Also, if you change one test to do the parsing in a loop:


     def test_funcdef(self):
-        tree = self.parse(funcdef)
+        for _ in range(100000):
+            tree = self.parse(funcdef)
         self.assertEqual(tree.body[0].type_comment, "() -> int")


And you ran test_type_comments.py with PYTHONDUMPREFS=1:

WITHOUT EXTRA DECREF
--------------------

  65 wrapper_descriptor
  75 str
  82 str
  85 str
  86 str
  92 str
 110 str
 416 str
200015 str


WITH EXTRA DECREF
-----------------

PYTHONDUMPREFS=1 ./python.exe -m test test_type_comments |& cut -d " " -f 3 | uniq -c | sort -n
....
  85 str
  86 str
  92 str
 110 str
 416 str
msg334718 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2019-02-01 20:24
Note:

the extra DECREF will fail in other tests because it will deallocate None if ast2obj_object returns Py_None because the comment is NULL.
msg334721 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2019-02-01 20:43
I'm almost sure this is not the correct fix, but I think these findings point to that object as the one being leaked. I do not understand how this object should be deallocated and where, I am trying to follow how the other PyObject* (name) is deallocated but I am not finding anything :(
msg334722 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2019-02-01 21:26
I have been trying to do the same thing and likewise, no luck understanding yet how type comments are different here.
msg334723 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2019-02-01 21:33
In the meantime here's the fix for the '# type: ignore' leak. https://github.com/gvanrossum/cpython/commit/ef3b9e952906aab8f5452ebcf3b9e359d35615b4
msg334730 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2019-02-01 23:09
Found it! You need to call PyArena_AddPyObject(). I racked my brain and finally found the example code I needed at the end of new_identifier(). Please review quickly so we won't ship 3.8.0a1 with a memory leak.
msg334731 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2019-02-01 23:28
New changeset d2b4c19d53f5f021fb1c7c32d48033a92ac4fe49 by Guido van Rossum in branch 'master':
bpo-35879: Fix type comment leaks (GH-11728)
https://github.com/python/cpython/commit/d2b4c19d53f5f021fb1c7c32d48033a92ac4fe49
History
Date User Action Args
2019-02-01 23:31:43gvanrossumsetstatus: open -> closed
type: behavior
assignee: gvanrossum
keywords: patch, patch, patch
resolution: fixed
stage: patch review -> resolved
2019-02-01 23:28:21gvanrossumsetmessages: + msg334731
2019-02-01 23:09:48gvanrossumsetkeywords: patch, patch, patch

messages: + msg334730
2019-02-01 23:04:07gvanrossumsetkeywords: + patch
stage: patch review
pull_requests: + pull_request11616
2019-02-01 23:04:00gvanrossumsetkeywords: + patch
stage: (no value)
pull_requests: + pull_request11615
2019-02-01 23:03:52gvanrossumsetkeywords: + patch
stage: (no value)
pull_requests: + pull_request11614
2019-02-01 21:33:29gvanrossumsetmessages: + msg334723
2019-02-01 21:26:37gvanrossumsetmessages: + msg334722
2019-02-01 20:43:17pablogsalsetmessages: + msg334721
2019-02-01 20:24:22pablogsalsetmessages: + msg334718
2019-02-01 20:21:21pablogsalsetmessages: + msg334717
2019-02-01 19:36:11gvanrossumsetmessages: + msg334713
2019-02-01 17:35:35gvanrossumsetmessages: + msg334709
2019-02-01 17:22:50pablogsalsetmessages: + msg334708
2019-02-01 16:43:46vstinnersetnosy: - vstinner
2019-02-01 16:42:32gvanrossumsetmessages: + msg334702
2019-02-01 14:12:24pablogsalsetmessages: + msg334682
2019-02-01 14:10:04xtreaksetnosy: + pablogsal
2019-02-01 14:09:03vstinnerlinkissue35881 superseder
2019-02-01 11:30:23vstinnersetnosy: + gvanrossum
2019-02-01 11:30:04vstinnercreate