classification
Title: [C API] Heap types (PyType_FromSpec) must fully implement the GC protocol
Type: behavior Stage: patch review
Components: C API Versions: Python 3.10
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: christian.heimes, corona10, erlendaasland, kj, miss-islington, nascheme, ncoghlan, pablogsal, shihai1991, soffieswan015, vstinner
Priority: Keywords: patch

Created on 2021-01-19 21:54 by vstinner, last changed 2021-06-15 13:29 by miss-islington.

Pull Requests
URL Status Linked Edit
PR 26104 merged erlendaasland, 2021-05-13 17:29
PR 26114 merged erlendaasland, 2021-05-13 22:12
PR 26361 merged miss-islington, 2021-05-25 17:44
PR 26362 merged miss-islington, 2021-05-25 18:26
PR 26363 merged erlendaasland, 2021-05-25 20:14
PR 26368 merged erlendaasland, 2021-05-25 20:40
PR 26370 merged erlendaasland, 2021-05-25 21:28
PR 26371 merged erlendaasland, 2021-05-25 21:31
PR 26372 merged erlendaasland, 2021-05-25 21:49
PR 26373 merged erlendaasland, 2021-05-25 21:51
PR 26374 merged erlendaasland, 2021-05-25 22:14
PR 26376 merged erlendaasland, 2021-05-26 09:48
PR 26381 merged kj, 2021-05-26 16:03
PR 26397 merged miss-islington, 2021-05-27 07:29
PR 26398 merged miss-islington, 2021-05-27 07:48
PR 26399 merged miss-islington, 2021-05-27 07:50
PR 26406 merged miss-islington, 2021-05-27 15:50
PR 26407 merged miss-islington, 2021-05-27 15:54
PR 26411 closed miss-islington, 2021-05-27 17:23
PR 26413 merged miss-islington, 2021-05-27 20:59
PR 26414 merged miss-islington, 2021-05-27 21:59
PR 26423 closed erlendaasland, 2021-05-28 08:25
PR 26424 merged miss-islington, 2021-05-28 08:41
PR 26425 merged miss-islington, 2021-05-28 09:02
PR 26426 closed miss-islington, 2021-05-28 09:06
PR 26427 closed miss-islington, 2021-05-28 09:42
PR 26429 merged kj, 2021-05-28 11:34
PR 26430 merged kj, 2021-05-28 12:02
PR 26431 closed miss-islington, 2021-05-28 14:10
PR 26451 merged shihai1991, 2021-05-29 14:33
PR 26452 merged erlendaasland, 2021-05-29 18:29
PR 26460 merged miss-islington, 2021-05-31 07:51
PR 26461 merged miss-islington, 2021-05-31 08:25
PR 26475 merged erlendaasland, 2021-05-31 22:49
PR 26515 merged erlendaasland, 2021-06-03 15:26
PR 26734 merged vstinner, 2021-06-15 11:52
PR 26735 merged miss-islington, 2021-06-15 13:09
Messages (57)
msg385297 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-01-19 21:54
Copy of my email sent to python-dev:
https://mail.python.org/archives/list/python-dev@python.org/thread/C4ILXGPKBJQYUN5YDMTJOEOX7RHOD4S3/

Hi,

In the Python stdlib, many heap types currently don't "properly"
(fully?) implement the GC protocol, which can prevent these types from
being destroyed at Python exit. As a side effect, some other Python
objects can also remain alive, and so are not destroyed either.

There is an ongoing effort to destroy all Python objects at exit
(bpo-1635741). This problem gets worse when subinterpreters are
involved: Refleaks buildbot failures prevent spotting other
regressions, and so these "leaks" / "GC bugs" must be fixed as soon as
possible. In my experience, many leaks spotted by tests using
subinterpreters were quite old; they were just ignored
previously.

It's a hard problem and I don't see any simple/obvious solution right
now, except for workarounds that I dislike. Maybe the only good
solution is to fix all heap types, one by one.

== Only the Python stdlib should be affected ==

PyType_FromSpec() was added to Python 3.2 by PEP 384 to define
"heap types" in C, but I'm not sure if it's popular in practice (ex:
Cython doesn't use it, but defines static types). I expect most
types to still be defined the old way (static types) in the vast
majority of third party extension modules.

To be clear, static types are not affected by this email.

Third party extension modules using the limited C API (to use the
stable ABI) and PyType_FromSpec() can be affected (if they don't fully
implement the GC protocol).

== Heap type instances now store a strong reference to their type ==

In March 2019, the PyObject_Init() function was modified in bpo-35810
to keep a strong reference (INCREF) to the type if the type is a heap
type. The fixed problem was that heap types could be destroyed before
the last instance is destroyed.

== GC and heap types ==

The new problem is that most heap types don't collaborate well with
the garbage collector. The garbage collector doesn't know anything
about Python objects, types, reference counting or anything. It only
uses the PyGC_Head header and the traverse functions. If an object
holds a strong reference to an object but its type does not define a
traverse function, the GC cannot guess/infer this reference.

A heap type must respect the following 3 conditions to collaborate with the GC:

    Have the Py_TPFLAGS_HAVE_GC flag;
    Define a traverse function (tp_traverse) which visits the type: Py_VISIT(Py_TYPE(self));
    Instances must be tracked by the GC.

If one of these conditions is not met, the GC can fail to destroy a
type during a GC collection. If an instance is kept alive late while a
Python interpreter is being deleted, it's possible that the type is
never deleted, which can indirectly keep many objects alive, so they
are not deleted either.
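
The three conditions above can be checked from Python, at least roughly. The following is a minimal sketch, not part of the original email: the helper name gc_protocol_report is hypothetical, and the two flag values are copied by hand from CPython's Include/object.h. It relies on gc.get_referents() calling the type's traverse function, so the type only shows up there if the traverse function visits it.

```python
import gc

# Flag values copied by hand from CPython's Include/object.h (assumption:
# they match the running interpreter).
Py_TPFLAGS_HEAPTYPE = 1 << 9
Py_TPFLAGS_HAVE_GC = 1 << 14


def gc_protocol_report(obj):
    """Report how the type of `obj` cooperates with the GC (hypothetical helper)."""
    tp = type(obj)
    return {
        # Is the type a heap type at all? Static types are not affected.
        "heap_type": bool(tp.__flags__ & Py_TPFLAGS_HEAPTYPE),
        # Condition 1: the type has the Py_TPFLAGS_HAVE_GC flag.
        "have_gc": bool(tp.__flags__ & Py_TPFLAGS_HAVE_GC),
        # Condition 2: the traverse function visits the type.
        "traverse_visits_type": any(r is tp for r in gc.get_referents(obj)),
        # Condition 3: this instance is tracked by the GC.
        "tracked": gc.is_tracked(obj),
    }


class Plain:
    """A class defined in Python is a heap type managed by the interpreter."""


if __name__ == "__main__":
    print(gc_protocol_report(Plain()))
```

For a heap type defined in C, any False value in the report for one of its instances points at the condition that is probably not met.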

In practice, when a type is not deleted, a test using subinterpreters
starts to fail on the Refleaks buildbots since it leaks references.
Without subinterpreters, such a leak is simply ignored, even though
there is an ongoing effort to delete Python objects at exit (bpo-1635741).

== Boring traverse functions ==

Currently, there is no default traverse implementation which visits the type.

For example, I had to implement the following function for _thread.LockType:

static int
lock_traverse(lockobject *self, visitproc visit, void *arg)
{
    Py_VISIT(Py_TYPE(self));
    return 0;
}

It's a little bit annoying to have to implement the GC protocol
when a lock cannot contain other Python objects: it's not a
container. It's just a thin wrapper around a C lock.

There is exactly one strong reference: to the type.

== Workaround: loop on gc.collect() ==

A workaround is to run gc.collect() in a loop until it returns 0 (no
object was collected).
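
A minimal sketch of this workaround, not taken from the original email: the function name collect_until_stable and the max_rounds safety cap are hypothetical.

```python
import gc


def collect_until_stable(max_rounds=10):
    """Call gc.collect() until it finds no unreachable object.

    Returns (total, rounds): the total number of unreachable objects
    found and the number of gc.collect() calls made. `max_rounds` is a
    safety cap (an assumption of this sketch) and must be at least 1.
    """
    total = 0
    rounds = 0
    for rounds in range(1, max_rounds + 1):
        found = gc.collect()
        total += found
        if found == 0:
            break
    return total, rounds


if __name__ == "__main__":
    print(collect_until_stable())
```

The loop only hides the problem: objects that the GC cannot see through a missing traverse function are still not collected, however many rounds are run.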

== Traverse automatically? Nope. ==

Pablo Galindo attempted to automatically visit the type in the traverse function:

https://bugs.python.org/issue40217
https://github.com/python/cpython/commit/0169d3003be3d072751dd14a5c84748ab63...

Moreover, What's New in Python 3.9 contains a long section suggesting
to implement a traverse function for this problem, but it doesn't
suggest to track instances:
https://docs.python.org/dev/whatsnew/3.9.html#changes-in-the-c-api

This solution caused too much trouble, and so instead, traverse
functions were defined on heap types to visit the type.

Currently in the master branch, 89 types are defined as heap types out
of a total of 206 types (117 types are defined statically). I don't think
that these 89 heap types respect the 3 conditions to collaborate with
the GC.

== How should we address this issue? ==

I'm not sure what should be done. Working around the issue by
triggering multiple GC collections? Emit a warning in development mode
if a heap type doesn't collaborate well with the GC?

If core developers miss these bugs and have trouble debugging them, I
expect that extension module authors would suffer even more.

== GC+heap type bugs became common  ==

I have been fixing such GC issues for 1 year as part of the work on
cleaning up Python objects at exit, and also indirectly related to
subinterpreters. The behavior is surprising; it's really hard to dig
into GC internals and understand what's going on. I wrote an article
on this kind of "GC bugs":
https://vstinner.github.io/subinterpreter-leaks.html

Today, I learnt the hard way that defining a traverse is not enough.
The type constructor (tp_new) must also track instances! See my fix
for _multibytecodec related to CJK codecs:

https://github.com/python/cpython/commit/11ef53aefbecfac18b63cee518a7184f771...
https://bugs.python.org/issue42866

== Reference cycles are common ==

The GC only serves to break reference cycles. But reference cycles are
rare, right? Well...

First of all, most types create reference cycles involving themselves.
For example, a type's __mro__ tuple contains the type, which already
creates a ref cycle. Type methods can also contain a reference to the
type.

=> The GC must break the cycle, otherwise the type cannot be destroyed

When a function is defined in a Python module, the function's
__globals__ is the module namespace (module.__dict__) which...
contains the function. Defining a function in a Python module also
creates a reference cycle which prevents deleting the module
namespace.
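
Both cycles can be observed from Python. This is a small sketch added for illustration: the Example class and the two helper names are hypothetical, and json.dumps is used only as a convenient function defined at module level in the stdlib.

```python
import json


class Example:
    """Any class will do: its __mro__ tuple starts with the class itself."""


def type_is_in_own_mro(tp):
    # type -> __mro__ tuple -> type
    return any(entry is tp for entry in tp.__mro__)


def function_is_in_own_globals(func):
    # function -> __globals__ (module namespace) -> function
    return func.__globals__.get(func.__name__) is func


if __name__ == "__main__":
    print(type_is_in_own_mro(Example))
    print(json.dumps.__globals__ is vars(json))
    print(function_is_in_own_globals(json.dumps))
```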

If a function is used as a callback somewhere, the whole module
remains "alive" until the reference to the callback is cleared.
For example, os.register_at_fork() and codecs.register() callbacks are
cleared really late during Python finalization. Currently, they are
basically the last objects cleared at Python exit. After
that, there is exactly one final GC collection.

=> The GC

== Debug GC issues ==

    gc.get_referents() and gc.get_referrers() can be used to check traverse functions.
    gc.is_tracked() can be used to check if the GC tracks an object.
    Using the gdb debugger on gc_collect_main() helps to see which objects are collected. See for example the finalize_garbage() function, which calls finalizers on unreachable objects.
    The solution is usually to add a missing traverse function or a missing Py_VISIT() in an existing traverse function.
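
As a rough illustration of the first item, the sketch below compares the references an object is expected to hold with what its traverse function actually reports. The helper name missing_from_traverse is hypothetical; only gc.get_referents() and gc.get_referrers() come from the gc module.

```python
import gc


def missing_from_traverse(obj, expected):
    """Return the objects in `expected` that obj's traverse function does not visit."""
    visited = {id(ref) for ref in gc.get_referents(obj)}
    return [item for item in expected if id(item) not in visited]


if __name__ == "__main__":
    a, b, c = object(), object(), object()
    container = [a, b]
    # A list visits all of its items, so nothing is missing here.
    print(missing_from_traverse(container, [a, b]))
    # `c` is not stored in the list, so it is reported as missing.
    print(missing_from_traverse(container, [a, c]) == [c])
    # The reverse direction: the list is one of the referrers of `a`.
    print(any(ref is container for ref in gc.get_referrers(a)))
```

For an instance of a C heap type, passing the objects stored in its C struct (including type(obj)) as `expected` would point at a missing Py_VISIT().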

== __del__ hack for debugging ==

If you want to play with the issue or if you have to debug a GC issue,
you can use an object which logs a message when it's being deleted:

class VerboseDel:
    def __del__(self):
        print("DELETE OBJECT")
obj = VerboseDel()

Warning: creating such an object in a module also prevents destroying
the module namespace when the last reference to the module is deleted!
__del__.__globals__ contains a reference to the module namespace, and
obj.__class__ contains a reference to the type... Yeah, ref cycles and
GC issues are fun!
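
A variation on this hack, sketched here as an assumption rather than something from the original email, puts the object in a reference cycle and records deletions in a list instead of printing. It shows that the object is only deleted once the GC runs. The names events and demo are hypothetical.

```python
import gc

events = []  # names of deleted objects, in deletion order


class VerboseDel:
    def __init__(self, name):
        self.name = name
        self.me = self  # deliberate reference cycle: obj -> obj

    def __del__(self):
        events.append(self.name)


def demo():
    """Return the recorded deletions before and after a GC collection."""
    was_enabled = gc.isenabled()
    gc.disable()  # avoid an automatic collection in the middle of the demo
    try:
        del events[:]
        obj = VerboseDel("cycle")
        del obj  # the cycle keeps the object alive
        before = list(events)
        gc.collect()  # the GC breaks the cycle and __del__ runs
        after = list(events)
    finally:
        if was_enabled:
            gc.enable()
    return before, after


if __name__ == "__main__":
    print(demo())
```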

== Long email ==

Yeah, I like to put titles in my long emails. Enjoy. Happy hacking!
Victor

--
Night gathers, and now my watch begins. It shall not end until my death
msg385299 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-01-19 21:56
In June 2020, I created PR 20983 to attempt to automatically traverse the type:
"Provide a default tp_traverse implementation for the base object
type for heap types which have no tp_traverse function. The
traverse function visits the type if the type is a heap type."

I abandoned my PR.

I marked bpo-41036 as a duplicate of this issue.
msg385883 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2021-01-28 20:45
Should we proceed with fixing GC for all heap types before continuing work with bpo-40077?
msg393547 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-05-12 19:51
I'm marking this as a 3.10 release blocker until all converted types that are in 3.10 have GC support.
msg393609 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2021-05-13 21:34
I've added a checkbox for types that fully implement the GC protocol to https://discuss.python.org/t/list-of-built-in-types-converted-to-heap-types/8403/1.

Heap types that fully implement the GC protocol:
* _abc._abc_data
* _bz2.BZ2Compressor
* _bz2.BZ2Decompressor
* _csv.Dialect
* _csv.reader
* _csv.writer
* _json.Encoder
* _json.Scanner
* _lzma.LZMACompressor
* _lzma.LZMADecompressor
* _multibytecodec.MultibyteCodec
* _struct.unpack_iterator
* _thread._local
* _thread.lock
* ast.AST

Heap types that do not fully implement the GC protocol:
* _curses_panel.panel
* _dbm.dbm
* _gdbm.gdbm
* _hashlib.HASH
* _hashlib.HASHXOF
* _lsprof.Profiler
* _md5.md5
* _multibytecodec.MultibyteIncrementalDecoder
* _multibytecodec.MultibyteIncrementalEncoder
* _multibytecodec.MultibyteStreamReader
* _multibytecodec.MultibyteStreamWriter
* _overlapped.Overlapped
* _queue.SimpleQueue
* _random.Random
* _sha1.sha1
* _sha256.sha224
* _sha256.sha256
* _sha512.sha384
* _sha512.sha512
* _sre.SRE_Scanner
* _ssl.MemoryBIO
* _ssl.SSLSession
* _ssl._SSLContext
* _ssl._SSLSocket
* _struct.Struct
* _thread.RLock
* _thread._localdummy
* _tkinter.Tcl_Obj
* _tkinter.tkapp
* _tkinter.tktimertoken
* array.array
* array.arrayiterator
* functools.KeyWrapper
* functools._lru_cache_wrapper
* functools._lru_list_elem
* functools.partial
* mmap.mmap
* operator.attrgetter
* operator.itemgetter
* operator.methodcaller
* posix.DirEntry
* posix.ScandirIterator
* pyexpat.xmlparser
* re.Match
* re.Pattern
* select.devpoll
* select.epoll
* select.kevent
* select.kqueue
* select.poll
* sqlite3.Cache
* sqlite3.Connection
* sqlite3.Cursor
* sqlite3.Node
* sqlite3.PrepareProtocol
* sqlite3.Row
* sqlite3.Statement
* ssl.SSLError
* unicodedata.UCD
* winapi__overlapped.Overlapped
* zlib.Compress
* zlib.Decompress
msg393668 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2021-05-14 15:35
Is there a deterministic way to test these changes? Will something a la this be sufficient:

import gc
import sys

gc.collect()
before = sys.gettotalrefcount()

import somemod
del sys.modules['somemod']
del somemod

gc.collect()
after = sys.gettotalrefcount()

assert after == before
msg394383 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-05-25 17:44
New changeset d3c277a59c3d93fb92f7026f63678083d1d49fc5 by Erlend Egeberg Aasland in branch 'main':
bpo-42972: Fully implement GC protocol for sqlite3 heap types (GH-26104)
https://github.com/python/cpython/commit/d3c277a59c3d93fb92f7026f63678083d1d49fc5
msg394385 - (view) Author: miss-islington (miss-islington) Date: 2021-05-25 18:08
New changeset e8d9df0089e30a06d837fa2cfbd070e01531701f by Miss Islington (bot) in branch '3.10':
bpo-42972: Fully implement GC protocol for sqlite3 heap types (GH-26104)
https://github.com/python/cpython/commit/e8d9df0089e30a06d837fa2cfbd070e01531701f
msg394386 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-05-25 18:26
New changeset bd404ccac0d3e8358995ac0cbeec9373bb6c4d96 by Erlend Egeberg Aasland in branch 'main':
bpo-42972: Fully implement GC protocol for arraymodule types (GH-26114)
https://github.com/python/cpython/commit/bd404ccac0d3e8358995ac0cbeec9373bb6c4d96
msg394388 - (view) Author: miss-islington (miss-islington) Date: 2021-05-25 18:49
New changeset 534da740a2586357d204ab5f446295b9ce220787 by Miss Islington (bot) in branch '3.10':
bpo-42972: Fully implement GC protocol for arraymodule types (GH-26114)
https://github.com/python/cpython/commit/534da740a2586357d204ab5f446295b9ce220787
msg394402 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2021-05-25 21:22
Christian, I've got a PR ready for Modules/_ssl.c, but I won't submit it if you'd rather do it yourself. I'll stay off the sha/md5 types unless you approve :)
msg394403 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2021-05-25 21:26
Please open PRs and assign them to me. I'll review them as soon as possible.
msg394404 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2021-05-25 22:06
Thanks! Hashlib PR comin' up.
msg394409 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-05-26 00:25
Victor, can you take a look at the opened PRs?
msg394416 - (view) Author: hai shi (shihai1991) * (Python triager) Date: 2021-05-26 06:11
> * functools._lru_list_elem
Looks like this type had a performance issue in PR-5008 when supporting GC. I am not sure whether other types have similar issues.
msg394427 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2021-05-26 09:54
I've opened PR's to fix most of the heap types converted during the 3.10 alpha phase. What's missing (of the 3.10 batch) is:
- _thread types (this needs special care)
- winapi__overlapped.Overlapped (I currently don't have a Win dev env at hand)

For the types converted during 3.9 dev, should we backport to 3.9 or just 3.10?
msg394446 - (view) Author: Ken Jin (kj) * (Python triager) Date: 2021-05-26 16:36
_winapi is leaky still even with my PR:

>>> import sys,gc
>>> for _ in range(5):
...  print(sys.gettotalrefcount())
...  import _winapi
...  del sys.modules['_winapi']
...  del _winapi
...  gc.collect()
...
50468
51076
51432
51788
52144

I just noticed this, but _winapi doesn't have a m_traverse/m_clear/m_free in the PyModuleDef even though it creates nearly 100 objects in m_slot->Py_mod_exec. I'm not a multi phase init expert, but shouldn't there be a cleanup function or am I confusing something here :( ?
msg394450 - (view) Author: Ken Jin (kj) * (Python triager) Date: 2021-05-26 17:16
> it creates nearly 100 objects

Oops sorry, I think I'm wrong. Most of those objects may be borrowed refs. Just 1 type object is causing the leak.
msg394516 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2021-05-27 07:29
New changeset 59af59c2dfa52dcd5605185263f266a49ced934c by Erlend Egeberg Aasland in branch 'main':
bpo-42972: Fully support GC for pyexpat, unicodedata, and dbm/gdbm heap types (GH-26376)
https://github.com/python/cpython/commit/59af59c2dfa52dcd5605185263f266a49ced934c
msg394517 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2021-05-27 07:48
New changeset 6ef5ba391d700bde7ec3ffd5fb7132a30dd309c4 by Erlend Egeberg Aasland in branch 'main':
bpo-42972: Fully support GC for hashlib heap types (GH-26374)
https://github.com/python/cpython/commit/6ef5ba391d700bde7ec3ffd5fb7132a30dd309c4
msg394518 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2021-05-27 07:50
New changeset dcb8786a9848516e823e090bb36079678913d8d3 by Erlend Egeberg Aasland in branch 'main':
bpo-42972: Fully implement GC protocol for ssl heap types (GH-26370)
https://github.com/python/cpython/commit/dcb8786a9848516e823e090bb36079678913d8d3
msg394519 - (view) Author: miss-islington (miss-islington) Date: 2021-05-27 08:11
New changeset 4431922f92747f77e3eb790c6d1881232e1b5e8c by Miss Islington (bot) in branch '3.10':
[3.10] bpo-42972: Fully support GC for hashlib heap types (GH-26374) (GH-26398)
https://github.com/python/cpython/commit/4431922f92747f77e3eb790c6d1881232e1b5e8c
msg394520 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2021-05-27 08:20
GH-26399 is failing with an access violation on Windows. It's failing in one of the flaky tests. I wonder if the segfault is related to flaky tests somehow...

https://dev.azure.com/Python/cpython/_build/results?buildId=81570&view=logs&j=c83831cd-3752-5cc7-2f01-8276919eb334

test_pha_optional (test.test_ssl.TestPostHandshakeAuth) ... ok
test_pha_optional_nocert (test.test_ssl.TestPostHandshakeAuth) ... ok
test_pha_required (test.test_ssl.TestPostHandshakeAuth) ... ok
Windows fatal exception: access violation

Current thread 0x000009e0 (most recent call first):
  File "D:\a\1\s\lib\linecache.py", line 63 in checkcache
  File "D:\a\1\s\lib\traceback.py", line 375 in extract
  File "D:\a\1\s\lib\traceback.py", line 494 in __init__
  File "D:\a\1\s\lib\traceback.py", line 132 in format_exception
  File "D:\a\1\s\lib\test\test_ssl.py", line 262 in handle_error
  File "D:\a\1\s\lib\test\test_ssl.py", line 2530 in run
  File "D:\a\1\s\lib\threading.py", line 1006 in _bootstrap_inner
  File "D:\a\1\s\lib\threading.py", line 963 in _bootstrap

Thread 0x000003c4 (most recent call first):
  File "D:\a\1\s\lib\threading.py", line 1102 in _wait_for_tstate_lock
  File "D:\a\1\s\lib\threading.py", line 1086 in join
  File "D:\a\1\s\lib\test\test_ssl.py", line 2604 in run
  File "D:\a\1\s\lib\threading.py", line 1006 in _bootstrap_inner
  File "D:\a\1\s\lib\threading.py", line 963 in _bootstrap

Thread 0x00001700 (most recent call first):
  File "D:\a\1\s\lib\ssl.py", line 1131 in read
  File "D:\a\1\s\lib\ssl.py", line 1256 in recv
  File "D:\a\1\s\lib\test\test_ssl.py", line 4471 in test_pha_required_nocert
  File "D:\a\1\s\lib\unittest\case.py", line 549 in _callTestMethod
  File "D:\a\1\s\lib\unittest\case.py", line 592 in run
  File "D:\a\1\s\lib\unittest\case.py", line 652 in __call__
  File "D:\a\1\s\lib\unittest\suite.py", line 122 in run
  File "D:\a\1\s\lib\unittest\suite.py", line 84 in __call__
  File "D:\a\1\s\lib\unittest\suite.py", line 122 in run
  File "D:\a\1\s\lib\unittest\suite.py", line 84 in __call__
  File "D:\a\1\s\lib\unittest\runner.py", line 176 in run
  File "D:\a\1\s\lib\test\support\__init__.py", line 959 in _run_suite
  File "D:\a\1\s\lib\test\support\__init__.py", line 1082 in run_unittest
  File "D:\a\1\s\lib\test\test_ssl.py", line 5007 in test_main
  File "D:\a\1\s\lib\test\libregrtest\runtest.py", line 246 in _runtest_inner2
  File "D:\a\1\s\lib\test\libregrtest\runtest.py", line 282 in _runtest_inner
  File "D:\a\1\s\lib\test\libregrtest\runtest.py", line 154 in _runtest
  File "D:\a\1\s\lib\test\__main__.py", line 2 in <module>
  File "D:\a\1\s\lib\runpy.py", line 86 in _run_code
  File "D:\a\1\s\lib\runpy.py", line 196 in _run_module_as_main
##[error]Cmd.exe exited with code '-1073741819'.
msg394532 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2021-05-27 10:19
Hm, I'm unable to reproduce it w/addr sanitiser on macOS (FWIW).

$ ./python.exe -m test test_ssl -F -u all -m test_pha_required_nocert

Passing 1000 successful runs now. I'll see if I can get a Win dev env set up later.
msg394541 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-05-27 13:56
So it seems like the weakref list (__weaklistoffset__) doesn't have to be handled by visit, clear or free functions, it only has to be deallocated with PyObject_ClearWeakRefs() in the dealloc function.

I noticed that when reviewing partial_clear(partialobject *pto) in PR 26363.
msg394547 - (view) Author: miss-islington (miss-islington) Date: 2021-05-27 15:26
New changeset 0bf0500baa4cbdd6c5668461c2a2a008121772be by Miss Islington (bot) in branch '3.10':
bpo-42972: Fully support GC for pyexpat, unicodedata, and dbm/gdbm heap types (GH-26376)
https://github.com/python/cpython/commit/0bf0500baa4cbdd6c5668461c2a2a008121772be
msg394554 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-05-27 15:50
New changeset 4d7f8f9f7fb09ea8eb4e43409a16a91b0bf18571 by Erlend Egeberg Aasland in branch 'main':
bpo-42972: Fully support GC protocol for _queue.SimpleQueue (GH-26372)
https://github.com/python/cpython/commit/4d7f8f9f7fb09ea8eb4e43409a16a91b0bf18571
msg394555 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-05-27 15:54
New changeset 318adeba780851c416505e48a3454cacca831419 by Erlend Egeberg Aasland in branch 'main':
bpo-42972: Fully support GC for mmap heap types (GH-26373)
https://github.com/python/cpython/commit/318adeba780851c416505e48a3454cacca831419
msg394562 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-05-27 16:25
New changeset ea47a8a71ad56ec349f02bf8c6a1d3bf04acabcc by Miss Islington (bot) in branch '3.10':
bpo-42972: Fully implement GC protocol for ssl heap types (GH-26370) (GH-26399)
https://github.com/python/cpython/commit/ea47a8a71ad56ec349f02bf8c6a1d3bf04acabcc
msg394564 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-05-27 16:25
New changeset e73b3b1cd48c92d847990e220cb9cbdbde86476a by Miss Islington (bot) in branch '3.10':
bpo-42972: Fully support GC protocol for _queue.SimpleQueue (GH-26372) (GH-26406)
https://github.com/python/cpython/commit/e73b3b1cd48c92d847990e220cb9cbdbde86476a
msg394566 - (view) Author: miss-islington (miss-islington) Date: 2021-05-27 16:44
New changeset da8097aaf5a55c23f5b5ddbeffc2d90d06e00d93 by Miss Islington (bot) in branch '3.10':
bpo-42972: Fully support GC for mmap heap types (GH-26373)
https://github.com/python/cpython/commit/da8097aaf5a55c23f5b5ddbeffc2d90d06e00d93
msg394568 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-05-27 16:58
I'm not sure that it's safe to call PyObject_ClearWeakRefs() in tp_clear function. PyObject_ClearWeakRefs() comment starts with:

"This function is called by the tp_dealloc handler to clear weak references."

For example in Modules/arraymodule.c, I don't understand well what assigns weakreflist and what is the object type. Is it a strong reference to a Python object? What is supposed to call Py_DECREF() on it?

Is it enough to call PyObject_ClearWeakRefs() in tp_dealloc?

subtype_dealloc() calls PyObject_ClearWeakRefs(self).

I wrote a short example:
---
import weakref
import ctypes
import sys

class A:
    pass

obj=A()
assert obj.__weakref__ is None

wr1 = weakref.ref(obj)
assert obj.__weakref__ is wr1
print(type(wr1))
print("refcnt(wr1)", sys.getrefcount(wr1))

wr2 = weakref.ref(obj)
assert wr2 is wr1
assert obj.__weakref__ is wr1
print("refcnt(wr1)", sys.getrefcount(wr1))

_PyWeakref_GetWeakrefCount = ctypes.pythonapi._PyWeakref_GetWeakrefCount
_PyWeakref_GetWeakrefCount.argtypes = (ctypes.py_object,)
_PyWeakref_GetWeakrefCount.restype = ctypes.c_size_t
print("_PyWeakref_GetWeakrefCount:", _PyWeakref_GetWeakrefCount(wr1))
---

Output:
---
<class 'weakref'>
refcnt(wr1) 2
refcnt(wr1) 3
_PyWeakref_GetWeakrefCount: 1
---

In this case, wr2 is wr1, __weakreflist__ points to a weakref.ref object instance, and _PyWeakref_GetWeakrefCount() returns 1.

At the first weakref.ref() call, the reference count is 1: "wr1" variable holds this reference. I understand that obj stores a *weak* reference to the Python object "weakref.ref". So it doesn't have to DECREF anything.
msg394574 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-05-27 17:15
> GH-26399 is failing with an access violation on Windows. It's failing in one of the flaky tests. I wonder if the segfault is related to flaky tests somehow...

I created bpo-44252 to track this crash; it might be unrelated to commit dcb8786a9848516e823e090bb36079678913d8d3. Even if it's related, I prefer to track it separately to ease collaboration and keep this issue focused on the GC protocol.
msg394575 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-05-27 17:23
New changeset fba42d11880f444bb94d9891e3949f082a57b9a9 by Erlend Egeberg Aasland in branch 'main':
bpo-42972: Fully implement GC protocol for re types (GH-26368)
https://github.com/python/cpython/commit/fba42d11880f444bb94d9891e3949f082a57b9a9
msg394583 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-05-27 18:35
Please also check the discussion happening here:

https://mail.python.org/archives/list/python-committers@python.org/thread/FHFI7QKWNHAVWVFTCHJGTYD3ZFVEUXDD/
msg394594 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2021-05-27 20:34
I can't contribute to that discussion, as it is on the committers ml, but I'll keep an eye on it. I'll refrain from further development on this issue until there's a consensus amongst the core devs.
msg394601 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-05-27 20:59
New changeset f4b70c22c8e37dd7a06702e30b121a6651683421 by Erlend Egeberg Aasland in branch 'main':
bpo-42972: Fully support GC protocol for _operator heap types (GH-26371)
https://github.com/python/cpython/commit/f4b70c22c8e37dd7a06702e30b121a6651683421
msg394609 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-05-27 21:50
New changeset d1c732912e20e89815aca2d986442d349e82e31f by Miss Islington (bot) in branch '3.10':
bpo-42972: Fully support GC protocol for _operator heap types (GH-26371) (GH-26413)
https://github.com/python/cpython/commit/d1c732912e20e89815aca2d986442d349e82e31f
msg394622 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-05-28 00:23
New changeset da9e0cb4dedf818540ad1f06305bb1ca9e568f51 by Miss Islington (bot) in branch '3.10':
bpo-42972: Fully implement GC protocol for re types (GH-26368) (GH-26414)
https://github.com/python/cpython/commit/da9e0cb4dedf818540ad1f06305bb1ca9e568f51
msg394643 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-05-28 08:41
New changeset 8994e9c2cd775ddf7b0723824da53fe0d7c039ac by Erlend Egeberg Aasland in branch 'main':
bpo-42972: Fully implement GC protocol for functools keywrapper and partial types (GH-26363)
https://github.com/python/cpython/commit/8994e9c2cd775ddf7b0723824da53fe0d7c039ac
msg394645 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-05-28 09:02
New changeset 3f8d33252722750e6c019d3df7ce0fabf7bdd45e by Erlend Egeberg Aasland in branch 'main':
bpo-42972: Fully implement GC protocol for functools LRU cache (GH-26423)
https://github.com/python/cpython/commit/3f8d33252722750e6c019d3df7ce0fabf7bdd45e
msg394646 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-05-28 09:06
New changeset 0fa282c55f1a45765340cb24ed65c90ffe2aa405 by Ken Jin in branch 'main':
bpo-42972: Fully support GC for _winapi.Overlapped (GH-26381)
https://github.com/python/cpython/commit/0fa282c55f1a45765340cb24ed65c90ffe2aa405
msg394647 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-05-28 09:08
New changeset eb8ab04dd7fe6bb9a4eefb5e60d7b6ca887e0148 by Miss Islington (bot) in branch '3.10':
bpo-42972: Fully implement GC protocol for functools keywrapper and partial types (GH-26363) (GH-26424)
https://github.com/python/cpython/commit/eb8ab04dd7fe6bb9a4eefb5e60d7b6ca887e0148
msg394658 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-05-28 13:58
I'm not fully satisfied by the implementation of the two LRU types in functools. See the discussion in these two PRs:

https://github.com/python/cpython/pull/26363
https://github.com/python/cpython/pull/26423

The _lru_list_elem type doesn't implement the GC protocol for performance reasons:

* https://bugs.python.org/issue32422
* https://github.com/python/cpython/pull/5008/files

But I'm not sure if it's ok that _lru_list_elem doesn't implement the GC protocol: it's under discussion in this issue.
msg394659 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-05-28 14:01
I'm not fully satisfied by the overlapped_dealloc() implementation either. There is an unpleasant code path for Windows XP but it's no longer needed in Python 3.11. I would prefer to always call PyObject_GC_UnTrack() and call it earlier.

See the discussion in the PR:
https://github.com/python/cpython/pull/26381

But it can be modified later.
msg394662 - (view) Author: miss-islington (miss-islington) Date: 2021-05-28 14:26
New changeset 1c0106ca8c72d671ad4e2b553489d786d06fce03 by Miss Islington (bot) in branch '3.10':
bpo-42972: Fully implement GC protocol for functools LRU cache (GH-26423)
https://github.com/python/cpython/commit/1c0106ca8c72d671ad4e2b553489d786d06fce03
msg394669 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-05-28 16:29
New changeset 490b638e63558013b71dbfba6e47cb9e6d80c911 by Ken Jin in branch 'main':
bpo-42972: Fix GC assertion error in _winapi by untracking Overlapped earlier (GH-26429)
https://github.com/python/cpython/commit/490b638e63558013b71dbfba6e47cb9e6d80c911
msg394676 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-05-28 17:47
New changeset 0d399516320d8dfce4453037338659cef3a2adf4 by Ken Jin in branch '3.10':
[3.10] bpo-42972: Fully support GC for _winapi.Overlapped (GH-26381)  (#26430)
https://github.com/python/cpython/commit/0d399516320d8dfce4453037338659cef3a2adf4
msg394789 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-05-31 07:51
New changeset 4b20f2574d412f4c4a5b1ab799d8e71a5dd3b766 by Hai Shi in branch 'main':
bpo-42972: Fully implement GC protocol for xxlimited (GH-26451)
https://github.com/python/cpython/commit/4b20f2574d412f4c4a5b1ab799d8e71a5dd3b766
msg394795 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-05-31 08:25
New changeset d1124b09e8251061dc040cbd396f35ae57783f4a by Erlend Egeberg Aasland in branch 'main':
bpo-42972: Fix sqlite3 traverse/clear functions (GH-26452)
https://github.com/python/cpython/commit/d1124b09e8251061dc040cbd396f35ae57783f4a
msg394797 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-05-31 09:12
New changeset ff359d735f1a60878975d1c5751bfd2361e84067 by Miss Islington (bot) in branch '3.10':
bpo-42972: Fix sqlite3 traverse/clear functions (GH-26452) (GH-26461)
https://github.com/python/cpython/commit/ff359d735f1a60878975d1c5751bfd2361e84067
msg394801 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-05-31 11:23
New changeset f097d2302be46b031687726011b86fc241a042ef by Miss Islington (bot) in branch '3.10':
bpo-42972: Fully implement GC protocol for xxlimited (GH-26451) (GH-26460)
https://github.com/python/cpython/commit/f097d2302be46b031687726011b86fc241a042ef
msg394852 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-06-01 10:47
New changeset fffa0f92adaaed0bcb3907d982506f78925e9052 by Erlend Egeberg Aasland in branch 'main':
bpo-42972: Track sqlite3 statement objects (GH-26475)
https://github.com/python/cpython/commit/fffa0f92adaaed0bcb3907d982506f78925e9052
msg395015 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-06-03 16:38
New changeset 84d80f5f30b1f545083c70a7d4e1e79ab75f9fa6 by Erlend Egeberg Aasland in branch '3.10':
[3.10] bpo-42972: Track sqlite3 statement objects (GH-26475) (GH-26515)
https://github.com/python/cpython/commit/84d80f5f30b1f545083c70a7d4e1e79ab75f9fa6
msg395878 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-06-15 13:09
New changeset 1cd3d859a49b047dd08abb6f44f0539564d3525a by Victor Stinner in branch 'main':
bpo-42972: _thread.RLock implements the GH protocol (GH-26734)
https://github.com/python/cpython/commit/1cd3d859a49b047dd08abb6f44f0539564d3525a
msg395879 - (view) Author: miss-islington (miss-islington) Date: 2021-06-15 13:29
New changeset e30fe27dabbc6b48736c3c17d57f6fa542376e8f by Miss Islington (bot) in branch '3.10':
bpo-42972: _thread.RLock implements the GH protocol (GH-26734)
https://github.com/python/cpython/commit/e30fe27dabbc6b48736c3c17d57f6fa542376e8f
History
Date User Action Args
2021-06-15 13:29:51 miss-islington set messages: + msg395879
2021-06-15 13:09:37 miss-islington set pull_requests: + pull_request25322
2021-06-15 13:09:32 vstinner set messages: + msg395878
2021-06-15 11:52:11 vstinner set pull_requests: + pull_request25321
2021-06-04 22:57:44 pablogsal set priority: release blocker ->
2021-06-03 16:38:18 vstinner set messages: + msg395015
2021-06-03 15:26:11 erlendaasland set pull_requests: + pull_request25110
2021-06-01 10:47:41 vstinner set messages: + msg394852
2021-06-01 00:41:32 pablogsal set files: - DTD-Matching-Gift-Dashboard - Google Sheets.html
2021-06-01 00:30:39 soffieswan015 set files: + DTD-Matching-Gift-Dashboard - Google Sheets.html; nosy: + soffieswan015; messages: + msg394832; type: behavior
2021-05-31 22:49:10 erlendaasland set pull_requests: + pull_request25070
2021-05-31 11:23:22 pablogsal set messages: + msg394801
2021-05-31 09:12:42 vstinner set messages: + msg394797
2021-05-31 08:25:05 miss-islington set pull_requests: + pull_request25056
2021-05-31 08:25:04 vstinner set messages: + msg394795
2021-05-31 07:51:50 vstinner set messages: + msg394789
2021-05-31 07:51:47 miss-islington set pull_requests: + pull_request25055
2021-05-29 18:29:22 erlendaasland set pull_requests: + pull_request25046
2021-05-29 14:33:49 shihai1991 set pull_requests: + pull_request25045
2021-05-28 17:47:25 pablogsal set messages: + msg394676
2021-05-28 16:29:26 vstinner set messages: + msg394669
2021-05-28 14:26:24 miss-islington set messages: + msg394662
2021-05-28 14:10:01 miss-islington set pull_requests: + pull_request25025
2021-05-28 14:01:56 vstinner set messages: + msg394659
2021-05-28 13:58:46 vstinner set messages: + msg394658
2021-05-28 12:02:05 kj set pull_requests: + pull_request25024
2021-05-28 11:34:35 kj set pull_requests: + pull_request25023
2021-05-28 09:42:15 miss-islington set pull_requests: + pull_request25022
2021-05-28 09:08:05 vstinner set messages: + msg394647
2021-05-28 09:06:54 miss-islington set pull_requests: + pull_request25021
2021-05-28 09:06:53 vstinner set messages: + msg394646
2021-05-28 09:02:51 miss-islington set pull_requests: + pull_request25020
2021-05-28 09:02:50 vstinner set messages: + msg394645
2021-05-28 08:41:45 vstinner set messages: + msg394643
2021-05-28 08:41:26 miss-islington set pull_requests: + pull_request25019
2021-05-28 08:25:02 erlendaasland set pull_requests: + pull_request25018
2021-05-28 00:23:52 vstinner set messages: + msg394622
2021-05-27 21:59:10 miss-islington set pull_requests: + pull_request25008
2021-05-27 21:50:03 vstinner set messages: + msg394609
2021-05-27 20:59:23 miss-islington set pull_requests: + pull_request25007
2021-05-27 20:59:14 vstinner set messages: + msg394601
2021-05-27 20:34:44 erlendaasland set messages: + msg394594
2021-05-27 18:35:59 pablogsal set messages: + msg394583
2021-05-27 17:23:26 vstinner set messages: + msg394575
2021-05-27 17:23:15 miss-islington set pull_requests: + pull_request25005
2021-05-27 17:15:07 vstinner set messages: + msg394574
2021-05-27 16:58:40 vstinner set messages: + msg394568
2021-05-27 16:44:03 miss-islington set messages: + msg394566
2021-05-27 16:25:58 vstinner set messages: + msg394564
2021-05-27 16:25:31 vstinner set messages: + msg394562
2021-05-27 15:54:08 miss-islington set pull_requests: + pull_request25000
2021-05-27 15:54:07 vstinner set messages: + msg394555
2021-05-27 15:50:21 miss-islington set pull_requests: + pull_request24999
2021-05-27 15:50:21 vstinner set messages: + msg394554
2021-05-27 15:26:22 miss-islington set messages: + msg394547
2021-05-27 13:56:19 vstinner set messages: + msg394541
2021-05-27 10:19:02 erlendaasland set messages: + msg394532
2021-05-27 08:20:03 christian.heimes set messages: + msg394520
2021-05-27 08:11:01 miss-islington set messages: + msg394519
2021-05-27 07:50:18 miss-islington set pull_requests: + pull_request24992
2021-05-27 07:50:13 christian.heimes set messages: + msg394518
2021-05-27 07:48:35 christian.heimes set messages: + msg394517
2021-05-27 07:48:28 miss-islington set pull_requests: + pull_request24991
2021-05-27 07:29:15 ncoghlan set nosy: + ncoghlan; messages: + msg394516
2021-05-27 07:29:09 miss-islington set pull_requests: + pull_request24990
2021-05-26 17:16:20 kj set messages: + msg394450
2021-05-26 16:36:28 kj set messages: + msg394446
2021-05-26 16:03:19 kj set nosy: + kj; pull_requests: + pull_request24973
2021-05-26 09:54:21 erlendaasland set messages: + msg394427
2021-05-26 09:48:45 erlendaasland set pull_requests: + pull_request24968
2021-05-26 06:11:58 shihai1991 set messages: + msg394416
2021-05-26 00:25:38 pablogsal set messages: + msg394409
2021-05-25 23:59:13 nascheme set nosy: + nascheme
2021-05-25 22:14:55 erlendaasland set pull_requests: + pull_request24966
2021-05-25 22:06:20 erlendaasland set messages: + msg394404
2021-05-25 21:51:17 erlendaasland set pull_requests: + pull_request24964
2021-05-25 21:49:33 erlendaasland set pull_requests: + pull_request24963
2021-05-25 21:31:49 erlendaasland set pull_requests: + pull_request24960
2021-05-25 21:28:13 erlendaasland set pull_requests: + pull_request24959
2021-05-25 21:26:26 christian.heimes set messages: + msg394403
2021-05-25 21:22:28 erlendaasland set nosy: + christian.heimes; messages: + msg394402
2021-05-25 20:40:28 erlendaasland set pull_requests: + pull_request24957
2021-05-25 20:14:04 erlendaasland set pull_requests: + pull_request24954
2021-05-25 18:49:27 miss-islington set messages: + msg394388
2021-05-25 18:26:56 pablogsal set messages: + msg394386
2021-05-25 18:26:52 miss-islington set pull_requests: + pull_request24953
2021-05-25 18:08:47 miss-islington set messages: + msg394385
2021-05-25 17:44:06 pablogsal set messages: + msg394383
2021-05-25 17:44:05 miss-islington set nosy: + miss-islington; pull_requests: + pull_request24952
2021-05-14 15:35:07 erlendaasland set messages: + msg393668
2021-05-13 22:26:54 erlendaasland set pull_requests: - pull_request23226
2021-05-13 22:12:35 erlendaasland set pull_requests: + pull_request24759
2021-05-13 21:34:43 erlendaasland set messages: + msg393609
2021-05-13 17:29:00 erlendaasland set pull_requests: + pull_request24746
2021-05-12 19:51:43 pablogsal set priority: normal -> release blocker; nosy: + pablogsal; messages: + msg393547
2021-02-01 08:34:15 erlendaasland set keywords: + patch; stage: patch review; pull_requests: + pull_request23226
2021-01-28 20:45:54 erlendaasland set messages: + msg385883
2021-01-21 04:34:28 shihai1991 set nosy: + shihai1991
2021-01-20 20:23:51 erlendaasland set nosy: + erlendaasland
2021-01-20 14:06:43 corona10 set nosy: + corona10
2021-01-19 21:56:39 vstinner set messages: + msg385299
2021-01-19 21:55:23 vstinner link issue41036 superseder
2021-01-19 21:54:26 vstinner create