classification
Title: Better support for finalization with weakrefs
Type: enhancement Stage: resolved
Components: Versions: Python 3.4
process
Status: closed Resolution:
Dependencies: Superseder:
Assigned To: sbt Nosy List: asvetlov, haypo, jcea, lukasz.langa, pitrou, python-dev, sbt, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2012-08-01 16:18 by sbt, last changed 2013-06-08 16:54 by sbt. This issue is now closed.

Files
File name Uploaded Description Edit
finalize.patch sbt, 2012-08-01 16:18 review
finalize.patch sbt, 2012-08-05 13:55 review
finalize.patch sbt, 2012-08-06 12:35 review
finalize.patch sbt, 2012-08-17 13:07 review
finalize.patch sbt, 2012-08-19 19:38 review
finalize.patch sbt, 2013-05-05 19:33 review
Messages (28)
msg167142 - (view) Author: Richard Oudkerk (sbt) * (Python committer) Date: 2012-08-01 16:18
A patch with docs and tests for the idea I suggested on python-ideas:

    http://comments.gmane.org/gmane.comp.python.ideas/15863

To repeat what I wrote there, the current issues with weakref callbacks include:

1. They are rather low level, and working out how to use them
   correctly requires a bit of head scratching.  One must find
   somewhere to store the weakref till after the referent is dead, and
   without accidentally keeping the referent alive.  Then one must
   ensure that the callback frees the weakref (without leaving any
   remnant ref-cycles).

   When it is an option, using a __del__ method is far less hassle.

2. Callbacks invoked during interpreter shutdown are troublesome.  For
   instance they can throw exceptions because module globals have been
   replaced by None.

3. Sometimes you want the callback to be called at program exit even
   if the referent is still alive.

4. Exceptions thrown in callbacks do not produce a traceback.  This
   can make debugging rather awkward.  (Maybe this is because printing
   tracebacks is problematic during interpreter shutdown?)

(Note that problems 2-4 also apply to using __del__ methods.)

Trivial usage of the class looks like

    >>> class Kenny: pass
    ...
    >>> kenny = Kenny()
    >>> finalize(kenny, print, "you killed kenny!")
    <finalize.finalize object at 0xffed4f2c>
    >>> del kenny
    you killed kenny!
msg167419 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-08-04 16:39
I don't quite understand the purpose of your suggestions. What can you do with it help, what you can not do with contextlib.ExitStack, atexit, __del__ method, weakref.WeakKeyDictionary or weakref.ref? I read the documentation, but the meaning eludes me.
msg167431 - (view) Author: Richard Oudkerk (sbt) * (Python committer) Date: 2012-08-04 19:24
> I don't quite understand the purpose of your suggestions. What can you do 
> with it help, what you can not do with contextlib.ExitStack, atexit, 
> __del__ method, weakref.WeakKeyDictionary or weakref.ref? I read the 
> documentation, but the meaning eludes me.

finalize does not "compete" with contextlib.ExitStack, atexit and
weakref.WeakKeyDictionary.  It only competes with __del__ and weakref
callbacks.

Points 1 and 2 in my first message are the main points.  Also, read the
warning at

  http://docs.python.org/py3k/reference/datamodel.html#object.__del__

which also applies to weakref callbacks.

Other problems with __del__:

* Ref cycles which contain an object with a __del__ method are immortal

* __del__ methods can "ressurect" the object.

There was actually a proposal to remove or replace __del__ methods in
Python 3000.  See the "Removing __del__" thread(s):

  #3797">http://mail.python.org/pipermail/python-3000/2006-September/thread.html#3797

As for weakref callbacks, I think they are just too difficult to use correctly
unless you are very familiar with them.
msg167437 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-08-04 20:36
> finalize does not "compete" with contextlib.ExitStack, atexit and
> weakref.WeakKeyDictionary.  It only competes with __del__ and weakref
> callbacks.

What kind of problems you solve with __del__  and weakref callbacks? For most 
of them contextlib.ExitStack and atexit are safer and better fit.

> Points 1 and 2 in my first message are the main points.  Also, read the
> warning at

For point 1: global weakref.WeakKeyDictionary is good store for weak refs with 
callbacks. 

global weakdict
weakdict[kenny] = weakref.ref(kenny, lambda _: print("you killed kenny!"))

If you do not want to work at a low level, in most cases fit the above-
mentioned high-level tools.

For point 2 Antoine has noted that it is a known issue and there is a solution 
(issue812369).

>   http://docs.python.org/py3k/reference/datamodel.html#object.__del__
> 
> which also applies to weakref callbacks.

Is this your point 2?

> Other problems with __del__:
> 
> * Ref cycles which contain an object with a __del__ method are immortal

It is use case for weakrefs which break reference loops.

> * __del__ methods can "ressurect" the object.

What you are meaning?

Relying on GC is dangerous thing for resource releasing. And it even more 
dangerous for alternative implementations without reference counting. That is 
why in recent times in Python there has been a tendency to RAII strategy 
(context managers).

Can you give examples, where the use of finalize necessary or more convenient 
the other ways?
msg167458 - (view) Author: Richard Oudkerk (sbt) * (Python committer) Date: 2012-08-04 22:58
> For point 1: global weakref.WeakKeyDictionary is good store for weak refs with 
> callbacks. 
>
> global weakdict
> weakdict[kenny] = weakref.ref(kenny, lambda _: print("you killed kenny!"))

That depends on kenny being hashable.

It also surprises me a bit that it works.  It seems to depend on unguaranteed
implementation details: PyObject_ClearWeakRefs() copies all weakrefs with 
callbacks to a tuple before invoking callbacks.  If this were not the case
then I think the weakref you created (which is scheduled to fire second) would
be garbage collected before its callback could be invoked.

I think users should have a documented way to schedule callbacks without having to
explicitly save the weakref.

> For point 2 Antoine has noted that it is a known issue and there is a solution 
> (issue812369).

That has been languishing for 9 years.  I would not hold your breath.

> >   http://docs.python.org/py3k/reference/datamodel.html#object.__del__
> > 
> > which also applies to weakref callbacks.
> 
> Is this your point 2?

Yes.

> Relying on GC is dangerous thing for resource releasing. And it even more 
> dangerous for alternative implementations without reference counting. That is 
> why in recent times in Python there has been a tendency to RAII strategy 
> (context managers).

I agree that explicit clean up using context managers to be strongly encouraged.
But resources should still be released if the object is garbage collected.  Or
do you think that file objects should stop bothering to close their fds when they
are deallocated?

> Can you give examples, where the use of finalize necessary or more convenient 
> the other ways?

multiprocessing (which I orginally wrote) makes extensive use of finalizers, 
(although with a different implementation and api).  In some cases that is because
at the time I wanted to support versions of Python which did not have the with
statement.

But there are various things there that depend on finalizers, and on being able
to guarantee that they get called on exit (in a predictable order).
msg167503 - (view) Author: Richard Oudkerk (sbt) * (Python committer) Date: 2012-08-05 13:55
Updated patch.

get() and put() replaced by peek() and detach().  Added isalive().  Now finalizer class has no state, i.e. __slots__ == ().
msg167507 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-08-05 17:03
I don't understand why you have two names: finalize and Finalizer. Let's keep only one of them.

isalive() could be a read-only property named "alive" instead.

In _make_callback, I still think the default error reporting mechanism should be kept. It can be improved separately.
msg167556 - (view) Author: Richard Oudkerk (sbt) * (Python committer) Date: 2012-08-06 12:35
> In _make_callback, I still think the default error reporting mechanism 
> should be kept. It can be improved separately.

New patch.  This time I have got rid of _make_callback, and just given __call__ an ignored optional argument.
msg167952 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-08-11 12:24
In the latest patch, what are "broken" and "priority" for?

Also, I would question why atexit is false by default. I would find it personally less surprising to be true.
msg167998 - (view) Author: Richard Oudkerk (sbt) * (Python committer) Date: 2012-08-11 19:17
> In the latest patch, what are "broken" and "priority" for?

They are for a subclass used by multiprocessing.  But it is premature to worry about subclassing, so I will take them out.

> Also, I would question why atexit is false by default. I would find it 
> personally less surprising to be true.

OK.

One thing I do worry about is having the loop in the exit function to run any finalizers created during the exit function.  The current implementation will run these extra finalizers at the wrong time.  Fixing this could probably be done by using a dirty flag and disabling gc while running the finalizers.

I wonder if it would be better to not call finalizers created during the exit function.  We cannot guarantee that every finalizer created during shutdown is run, so is a best effort attempt really worth the effort?
msg168442 - (view) Author: Richard Oudkerk (sbt) * (Python committer) Date: 2012-08-17 13:07
Updated patch.
msg168592 - (view) Author: Richard Oudkerk (sbt) * (Python committer) Date: 2012-08-19 19:38
New patch.
msg172077 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-10-05 13:03
In:

+                    except:
+                        sys.excepthook(*sys.exc_info())

I would write "except Exception" instead. You don't want to trap e.g. KeyboardInterrupt.

For clarity, I would also add "_dirty = False" at the finalize top-level.
Otherwise, looks fine to me.
msg182083 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-02-14 09:21
Richard, do you still want to push this forward? Otherwise I'd like to finalize the patch (in the other sense ;-).
msg182092 - (view) Author: Richard Oudkerk (sbt) * (Python committer) Date: 2013-02-14 12:01
> Richard, do you still want to push this forward? Otherwise I'd like to 
> finalize the patch (in the other sense ;-).

I started to worry a bit about daemon threads.  I think they can still run while atexit functions are being run.*  So if a daemon thread creates an atexit finalizer during shutdown it may never be run.

I am not sure how much to worry about this potential race.  Maybe a lock could be used to cause any daemon threads which try to create finalizers to block.

* Is it necessary/desirable to allow daemon threads to run during shutdown.  Maybe blocking thread switching at shutdown could cause deadlocks?
msg182100 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-02-14 15:16
> Is it necessary/desirable to allow daemon threads to run during
> shutdown.  Maybe blocking thread switching at shutdown could cause
> deadlocks?

Mmmh... thread switching is already blocked at shutdown:
http://hg.python.org/cpython/file/0f827775f7b7/Python/ceval.c#l434
msg182104 - (view) Author: Richard Oudkerk (sbt) * (Python committer) Date: 2013-02-14 15:45
On 14/02/2013 3:16pm, Antoine Pitrou wrote:
> Mmmh... thread switching is already blocked at shutdown:
> http://hg.python.org/cpython/file/0f827775f7b7/Python/ceval.c#l434

But in Py_Finalize(), call_py_exitfuncs() is called *before* _Py_Finalizing is set to a non-NULL value.

    http://hg.python.org/cpython/file/0f827775f7b7/Python/pythonrun.c#l492
msg182123 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-02-14 22:32
> But in Py_Finalize(), call_py_exitfuncs() is called *before*
> _Py_Finalizing is set to a non-NULL value.
> 
> 
> http://hg.python.org/cpython/file/0f827775f7b7/Python/pythonrun.c#l492

Ah, right. But is it any different from, e.g., registering an atexit
handler from a daemon thread?
In any case, I think it's just something we'll have to live with. Daemon
threads are not a terrific idea in general.
msg182124 - (view) Author: Richard Oudkerk (sbt) * (Python committer) Date: 2013-02-14 23:21
> In any case, I think it's just something we'll have to live with. Daemon
> threads are not a terrific idea in general.

I agree.
msg188455 - (view) Author: Richard Oudkerk (sbt) * (Python committer) Date: 2013-05-05 19:33
Here is an updated patch.  It is only really the example in the docs which is different, plus a note about daemon threads.

Antoine, do think this is ready to be committed?
msg188457 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-05-05 19:38
Well, assuming there are no significant changes in the code (I haven't checked), +1 for committing.
msg188459 - (view) Author: Richard Oudkerk (sbt) * (Python committer) Date: 2013-05-05 19:43
The only (non-doc, non-comment) changes were the two one-liners you suggested in msg172077.  So I will commit.
msg188465 - (view) Author: Roundup Robot (python-dev) Date: 2013-05-05 21:03
New changeset 2e446e87ac5b by Richard Oudkerk in branch 'default':
Issue #15528: Add weakref.finalize to support finalization using
http://hg.python.org/cpython/rev/2e446e87ac5b
msg188466 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-05-05 21:10
The changeset 2e446e87ac5b broke some buildbots at compile step.

./python -E -S -m sysconfig --generate-posix-vars
Could not find platform dependent libraries <exec_prefix>
Consider setting $PYTHONHOME to <prefix>[:<exec_prefix>]
Fatal Python error: Py_Initialize: can't initialize sys standard streams
Traceback (most recent call last):
  File "/home/buildbot/buildarea/3.x.coghlan-redhat/build/Lib/locale.py", line 17, in <module>
    import re
  File "/home/buildbot/buildarea/3.x.coghlan-redhat/build/Lib/re.py", line 124, in <module>
    import functools
  File "/home/buildbot/buildarea/3.x.coghlan-redhat/build/Lib/functools.py", line 18, in <module>
    from collections import namedtuple
  File "/home/buildbot/buildarea/3.x.coghlan-redhat/build/Lib/collections/__init__.py", line 8, in <module>
    __all__ += collections.abc.__all__
AttributeError: 'module' object has no attribute 'abc'

http://buildbot.python.org/all/builders/x86%20RHEL%206%203.x/builds/1963
msg188474 - (view) Author: Roundup Robot (python-dev) Date: 2013-05-05 22:12
New changeset 186cf551dae5 by Richard Oudkerk in branch 'default':
Issue #15528: Add weakref.finalize to support finalization using
http://hg.python.org/cpython/rev/186cf551dae5
msg190798 - (view) Author: Ɓukasz Langa (lukasz.langa) (Python committer) Date: 2013-06-08 09:46
Your patch leaks references with subinterpreters, which was exposed by functools.singledispatch using weakref. While the actual bug is in atexit (which doesn't properly unregister on subinterpreter termination), now all uses of weakref leak.

Context: http://mail.python.org/pipermail/python-dev/2013-June/126755.html

PJE suggests importing atexit and registering finalize only when it's actually used. I guess this would be the easiest workaround.
msg190802 - (view) Author: Roundup Robot (python-dev) Date: 2013-06-08 15:53
New changeset d6ad9d7468f7 by Richard Oudkerk in branch 'default':
Issue #15528: Delay importing atexit until weakref.finalize() used.
http://hg.python.org/cpython/rev/d6ad9d7468f7
msg190809 - (view) Author: Richard Oudkerk (sbt) * (Python committer) Date: 2013-06-08 16:54
> PJE suggests importing atexit and registering finalize only when it's 
> actually used. I guess this would be the easiest workaround.

Done.
History
Date User Action Args
2013-06-08 16:54:33sbtsetstatus: open -> closed

messages: + msg190809
2013-06-08 15:53:59python-devsetmessages: + msg190802
2013-06-08 09:46:05lukasz.langasetstatus: closed -> open

nosy: + lukasz.langa
messages: + msg190798

assignee: sbt
resolution: fixed ->
2013-05-05 23:42:53sbtsetstatus: open -> closed
resolution: fixed
stage: patch review -> resolved
2013-05-05 22:12:09python-devsetmessages: + msg188474
2013-05-05 21:10:19hayposetnosy: + haypo
messages: + msg188466
2013-05-05 21:03:00python-devsetnosy: + python-dev
messages: + msg188465
2013-05-05 19:43:02sbtsetmessages: + msg188459
2013-05-05 19:38:07pitrousetmessages: + msg188457
2013-05-05 19:33:25sbtsetfiles: + finalize.patch

messages: + msg188455
2013-02-14 23:21:00sbtsetmessages: + msg182124
2013-02-14 22:32:57pitrousetmessages: + msg182123
2013-02-14 15:45:23sbtsetmessages: + msg182104
2013-02-14 15:16:59pitrousetmessages: + msg182100
2013-02-14 12:01:34sbtsetmessages: + msg182092
2013-02-14 09:21:10pitrousetmessages: + msg182083
2012-10-05 13:03:11pitrousetmessages: + msg172077
2012-10-04 10:47:18jceasetnosy: + jcea
2012-08-19 19:38:40sbtsetfiles: + finalize.patch

messages: + msg168592
2012-08-17 13:07:56sbtsetfiles: + finalize.patch

messages: + msg168442
2012-08-11 19:17:38sbtsetmessages: + msg167998
2012-08-11 12:24:18pitrousetmessages: + msg167952
2012-08-10 13:32:00asvetlovsetnosy: + asvetlov
2012-08-06 12:35:35sbtsetfiles: + finalize.patch

messages: + msg167556
2012-08-05 17:03:17pitrousetnosy: + pitrou
messages: + msg167507
2012-08-05 13:55:04sbtsetfiles: + finalize.patch

messages: + msg167503
2012-08-04 22:58:30sbtsetmessages: + msg167458
2012-08-04 20:36:08serhiy.storchakasetmessages: + msg167437
2012-08-04 19:24:30sbtsetmessages: + msg167431
2012-08-04 16:39:26serhiy.storchakasetnosy: + serhiy.storchaka
messages: + msg167419
2012-08-01 16:18:46sbtcreate