classification
Title: Refleak tests: test_doctest and test_gc are failing
Type: behavior Stage: patch review
Components: Interpreter Core, Tests, Windows Versions: Python 3.3
process
Status: closed Resolution: out of date
Dependencies: Superseder:
Assigned To: tim.peters Nosy List: BreamoreBoy, Jeremy.Hylton, amaury.forgeotdarc, brian.curtin, christian.heimes, ezio.melotti, gvanrossum, pitrou, tim.peters
Priority: normal Keywords: patch

Created on 2007-12-02 14:41 by christian.heimes, last changed 2013-10-23 14:40 by pitrou. This issue is now closed.

Files
File name Uploaded Description
gc_bug.py amaury.forgeotdarc, 2007-12-05 16:51
gcmodule.c.patch Jeremy.Hylton, 2010-02-19 20:58 Restructure GC to look for resurrected objects
Messages (17)
msg58089 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2007-12-02 14:41
I've seen the problem on Windows only. test_doctest fails and the
problem also causes test_gc to fail when it is run after test_doctest.
Without a prior run of test_doctest, test_gc doesn't fail.

File "c:\dev\python\py3k\lib\test\test_doctest.py", line 1570, in test.test_doctest.test_debug
Failed example:
    try: doctest.debug_src(s)
    finally: sys.stdin = real_stdin
Expected:
    > <string>(1)<module>()
    (Pdb) next
    12
    --Return--
    > <string>(1)<module>()->None
    (Pdb) print(x)
    12
    (Pdb) continue
Got:
    > c:\dev\python\py3k\lib\io.py(281)__del__()
    -> try:
    (Pdb) next
    > c:\dev\python\py3k\lib\io.py(282)__del__()
    -> self.close()
    (Pdb) print(x)
    *** NameError: NameError("name 'x' is not defined",)
    (Pdb) continue
    12
**********************************************************************
1 items had failures:
   1 of   4 in test.test_doctest.test_debug
***Test Failed*** 1 failures.
test test_doctest failed -- 1 of 418 doctests failed
test_gc
test test_gc failed -- Traceback (most recent call last):
  File "c:\dev\python\py3k\lib\test\test_gc.py", line 193, in test_saveall
    self.assertEqual(gc.garbage, [])
AssertionError: [<io.BytesIO object at 0x01237968>] != []

2 tests failed:
    test_doctest test_gc
msg58205 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2007-12-05 00:34
After some hard debugging:
- doctest.debug_src() is unlucky enough to trigger a garbage collection
just when compiling the given code.
- gc collects unreachable objects, among them is an instance of the
class doctest._SpoofOut, which derives from io.StringIO.
- The debugger steps into io.IOBase.__del__

Some possible directions:
- Change the gc thresholds. A very temporary workaround to make the test
pass.
- Find the cycle involving the SpoofOut object, and try to break it in
doctest.py.
- Find a way to disable pdb tracing when the gc is running finalizers.
(this is what happens in 2.5: pdb does not step into a C function)
- Forget everything, and wait for the io.py object to be rewritten in C.
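The first direction (changing the gc thresholds) could be sketched as below. This is only an illustration of the workaround, not something taken from doctest.py or the test suite; the helper name gc_threshold and the value 100000 are made up for the example.

```python
import gc
from contextlib import contextmanager

@contextmanager
def gc_threshold(threshold0):
    """Temporarily raise the generation-0 threshold (hypothetical helper)."""
    old = gc.get_threshold()              # tuple of the current thresholds
    gc.set_threshold(threshold0, *old[1:])
    try:
        yield
    finally:
        gc.set_threshold(*old)            # always restore the old values

if __name__ == "__main__":
    # Arbitrary large value: an automatic collection becomes unlikely
    # while the block runs, which is all the workaround relies on.
    with gc_threshold(100000):
        print("inside:", gc.get_threshold())
    print("restored:", gc.get_threshold())
```

As noted in the list, this only hides the symptom: the cycle involving the _SpoofOut object would still be there.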
msg58218 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2007-12-05 16:51
Finally I found a potential problem with the garbage collector in a
specific case:
- Some object C participates in a reference cycle, and contains a
reference to another object X.
- X.__del__ 'resurrects' the object, by saving its 'self' somewhere else.
- X contains a reference to Y, which has a __del__ method.
When collecting all this, gc.garbage == [Y]!
This is not true garbage: if you clean gc.garbage, then the next
gc.collect() clears everything.


Now, try to follow my explanation (if I correctly understand the gc
internals):
- When the cycle is garbage collected, X and Y are detected as
'unreachable with finalizers', and put in a specific 'finalizers' list.
- the cycle is broken, C is deallocated. 
- This correctly triggers X.__del__. X is removed from the 'finalizers'
list.
- when X is resurrected, it comes back to the normal gc tracking list.
- At the end, 'finalizers' contains 3 objects: X.__dict__, Y and Y.__dict__.
- Y is then considered as garbage.

I attach a script which reproduces the behaviour. Note that 2.4 and 2.5
are affected too.
In py3k, the 'resurrect' seems to be caused by a (caught) exception in
TextIOWrapper.close(): the exception contains a reference to the frame,
which references the self variable.
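For readers without the attachment, here is a rough reconstruction of the scenario described in this message. It is not the attached gc_bug.py: the class names C, X and Y follow the description above, while the resurrected list and the run_scenario helper are assumptions made for the example. On the versions discussed here the report is that Y ends up in gc.garbage; on 3.4 and later (see the closing messages) the list is expected to stay empty.

```python
import gc

resurrected = []                  # hypothetical place where X saves itself

class Y:
    def __del__(self):
        pass                      # any finalizer will do

class X:
    def __init__(self):
        self.y = Y()              # X contains a reference to Y

    def __del__(self):
        resurrected.append(self)  # 'resurrect' the object

class C:
    def __init__(self):
        self.me = self            # C participates in a reference cycle
        self.x = X()              # ... and contains a reference to X

def run_scenario():
    """Build the object graph, drop it, collect, and report gc.garbage."""
    gc.collect()
    del gc.garbage[:]             # start from a clean state
    del resurrected[:]
    C()                           # only the cycle keeps this instance alive
    gc.collect()
    return list(gc.garbage)

if __name__ == "__main__":
    print("gc.garbage:", run_scenario())
    print("resurrected:", resurrected)
```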
msg58221 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2007-12-05 17:25
Hoping to draw Tim into this... He's the only one I know who truly
understands these issues...
msg99522 - (view) Author: Jeremy Hylton (Jeremy.Hylton) (Python committer) Date: 2010-02-18 19:46
I'm trying to figure out the attached script.  If I run Python 3.0, the script doesn't run because of the undefined gc.DEBUG_OBJECTS.  If I just remove that, the script runs without error.  Does that mean the problem is fixed?  Or is running without an error an example of the problem?

If I add gc.DEBUG_SAVEALL, it fails--but that seems obvious because DEBUG_SAVEALL adds all objects with finalizers to gc.garbage.
msg99523 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2010-02-18 19:52
The attached script does not "fail". The weird thing is that after gc.collect(), gc.garbage is not empty.
This happens even when gc.set_debug() is not called.
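A small sketch of what gc.DEBUG_SAVEALL does, to separate its effect from the behaviour reported here. As documented, the flag makes the collector append every unreachable object it finds to gc.garbage instead of freeing it, whether or not the object has a finalizer. The Node class and the collect_with_saveall helper are made up for the example.

```python
import gc

class Node:
    """Plain class without a finalizer."""

def collect_with_saveall():
    """Collect one self-referencing Node with DEBUG_SAVEALL set."""
    gc.collect()
    del gc.garbage[:]
    old_flags = gc.get_debug()
    gc.set_debug(gc.DEBUG_SAVEALL)
    try:
        node = Node()
        node.me = node            # simple reference cycle
        del node
        gc.collect()
        saved = list(gc.garbage)  # unreachable objects were saved, not freed
    finally:
        gc.set_debug(old_flags)   # restore the previous debug flags
        del gc.garbage[:]
    return saved

if __name__ == "__main__":
    print(collect_with_saveall())
```

So a non-empty gc.garbage under DEBUG_SAVEALL says nothing about this issue; the interesting case is a non-empty gc.garbage without any debug flags.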
msg99555 - (view) Author: Jeremy Hylton (Jeremy.Hylton) (Python committer) Date: 2010-02-19 03:39
I spent some time to understand the example script today.  The specific issue is that a set of objects gets put into the list of unreachable objects with finalizers (both Immutable and Finalizer instances).  When Cycle's __dict__ is cleared, it also decrefs Immutable, which resurrects it and Finalizer.  The garbage collector is not prepared for an unreachable finalizer object to become reachable again.  More generally, it's hard to assume anything about the state of the finalizers after unreachable trash is collected.  I'll think more about what to do, but I don't see any easy solutions.
msg99557 - (view) Author: Jeremy Hylton (Jeremy.Hylton) (Python committer) Date: 2010-02-19 05:23
One last thought on this bug.  The problem is that after we try to delete garbage, we really can't know much about the state of the objects in the finalizers list.  If any of the objects that are cleared end up causing a finalizer to run, then any of the objects in the finalizers list may be reachable again.  One possibility is to do nothing with the objects in the finalizers list if there was any garbage to delete.  That means objects with finalizers would be harder to get to gc.collect()--for example, you'd need to call gc.collect() twice in a row.  The first time to clear garbage, the second time to handle unreachable objects with finalizers.  Or the GC could run a second time if garbage was cleared and finalizers was non-empty.

A more complicated possibility would be to track some object state about when a finalizer was run.  If any of the objects in finalizers had a finalizer that ran while garbage was cleared, we could skip the finalizers list.  I don't know how to implement this, since an arbitrary C type could run Python code in tp_dealloc without notifying GC.
msg99565 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2010-02-19 09:47
What if we simply run another collection on the finalizers list?
after move_finalizers(), set young=finalizers, and start again at update_refs().
This is the "garbage" list anyway, so it's empty most of the time and should not cause any slowdown.
msg99590 - (view) Author: Jeremy Hylton (Jeremy.Hylton) (Python committer) Date: 2010-02-19 20:06
Amaury-- I think that will work.  I put together a small patch that seems to pass all the tests, but it is too messy.  We need some care to make sure we don't spin forever if there's some degenerate case where we never escape GC.
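The idea of collecting again, with a bound so that it cannot spin forever, can be illustrated at the Python level. This is only a user-level analogue and not the gcmodule.c patch: the function name collect_until_stable and the max_passes parameter are assumptions made for the example.

```python
import gc

def collect_until_stable(max_passes=5):
    """Call gc.collect() until a pass finds nothing unreachable.

    Stops after max_passes passes even if objects keep turning up,
    so a degenerate case cannot loop forever. Returns the number of
    passes that were run.
    """
    passes = 0
    while passes < max_passes:
        passes += 1
        if gc.collect() == 0:     # nothing unreachable was found
            break
    return passes

if __name__ == "__main__":
    print("passes:", collect_until_stable())
```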
msg99593 - (view) Author: Jeremy Hylton (Jeremy.Hylton) (Python committer) Date: 2010-02-19 20:58
The code is still in no shape to submit.  It has lots of debugging prints in it, etc. but the basic structure might work.  Do you want to let me know if it makes sense?
msg116797 - (view) Author: Mark Lawrence (BreamoreBoy) * Date: 2010-09-18 15:24
Can we have an update on this please as it seems important.
msg200765 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2013-10-21 12:26
ping :)
msg200775 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2013-10-21 13:17
Is the error still current? io.StringIO is now completely implemented in _io/textio.c, and should not have any Python-level __del__.
msg200776 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2013-10-21 13:34
I don't know ... Is somebody able to test it?
msg201030 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-10-23 14:21
The GC behaves gracefully in 3.4: the gc_bug script shows no uncollectable object. I don't think this is worth fixing in 3.3.
msg201031 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2013-10-23 14:25
I agree with not fixing 3.3.
History
Date User Action Args
2013-10-23 14:40:02 pitrou set status: open -> closed
2013-10-23 14:25:52 christian.heimes set status: pending -> open
messages: + msg201031
2013-10-23 14:21:35 pitrou set status: open -> pending
versions: - Python 3.4
nosy: + pitrou
messages: + msg201030
resolution: out of date
2013-10-21 13:34:08 christian.heimes set messages: + msg200776
2013-10-21 13:17:44 amaury.forgeotdarc set messages: + msg200775
2013-10-21 12:26:52 christian.heimes set messages: + msg200765
versions: - Python 3.2
2012-11-26 15:17:43 christian.heimes set stage: patch review
type: behavior
versions: + Python 3.2, Python 3.3, Python 3.4, - Python 3.0
2010-09-18 15:24:52 BreamoreBoy set nosy: + BreamoreBoy
messages: + msg116797
2010-02-19 20:58:56 Jeremy.Hylton set files: + gcmodule.c.patch
keywords: + patch
messages: + msg99593
2010-02-19 20:06:34 Jeremy.Hylton set messages: + msg99590
2010-02-19 09:47:50 amaury.forgeotdarc set messages: + msg99565
2010-02-19 05:23:06 Jeremy.Hylton set messages: + msg99557
2010-02-19 03:39:09 Jeremy.Hylton set messages: + msg99555
2010-02-18 19:52:34 amaury.forgeotdarc set messages: + msg99523
2010-02-18 19:46:09 Jeremy.Hylton set nosy: + Jeremy.Hylton
messages: + msg99522
2010-02-18 09:56:28 ezio.melotti set nosy: + ezio.melotti, brian.curtin
2008-01-06 22:29:44 admin set keywords: - py3k
versions: Python 3.0
2007-12-05 17:25:27 gvanrossum set assignee: tim.peters
messages: + msg58221
nosy: + tim.peters
2007-12-05 16:51:42 amaury.forgeotdarc set files: + gc_bug.py
nosy: + gvanrossum
messages: + msg58218
components: + Interpreter Core
2007-12-05 00:34:04 amaury.forgeotdarc set nosy: + amaury.forgeotdarc
messages: + msg58205
2007-12-02 14:41:56 christian.heimes create