classification
Title: iterators broken for weak dicts
Type: behavior Stage: patch review
Components: Library (Lib) Versions: Python 2.7, Python 2.6
process
Status: closed Resolution: duplicate
Dependencies: Superseder: weak dict iterators are fragile because of unpredictable GC runs
View: 7105
Assigned To: Nosy List: BreamoreBoy, ajaksu2, dcjim, elachuni, mark.dickinson, pitrou, tseaver, vdupras
Priority: normal Keywords: patch

Created on 2003-11-10 11:32 by dcjim, last changed 2010-07-23 09:56 by mark.dickinson. This issue is now closed.

Files
File name Uploaded Description Edit
weakrefbug.py dcjim, 2003-11-10 11:32 Script that demonstrates the bug
weakref_dict_iter.diff vdupras, 2008-02-24 14:25
Messages (14)
msg18959 - (view) Author: Jim Fulton (dcjim) (Python triager) Date: 2003-11-10 11:32
You can't use iterators on wekref dicts because items
might be removed from the dictionaries while iterating
due to GC.

I've attached a script that illustrates the bug with
Python 2.3.2. It doesn't matter whether you use weak
key or weak value dicts.

If this can't be fixed, then the iteration methods should
either be removed or made to (lamely) create intermediate
lists to work around the problem.

msg62908 - (view) Author: Virgil Dupras (vdupras) (Python triager) Date: 2008-02-24 14:25
I made a patch to fix the problem. The cleaning up of they weakref keys or 
values will be held until all references to iterators created by the 
weakdict are dead.

I also couldn't resist removing code duplication of code in items(), 
keys() and values().

At first, I couldn't understand why this whole remove(), _remove() and 
selfref() mechanism was in place. I had removed them and replaced them 
with methods, and the tests still passed. Then I realized it was to make 
sure keys and values didn't prevent the weak dicts from being freed. I 
added tests for this.
msg82020 - (view) Author: Daniel Diniz (ajaksu2) (Python triager) Date: 2009-02-14 12:09
Patch has tests, may need updating.
msg82037 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009-02-14 12:46
Interesting patch. I think the intermediate assertEquals in
test_weak_*_dict_flushed_dead_items_when_iters_go_out are just testing
an implementation detail, only the final one should remain.

Also, it is likely the "code duplication" you are talking about was
there for performance reasons, so I'd suggest putting it back in.
msg82062 - (view) Author: Virgil Dupras (vdupras) (Python triager) Date: 2009-02-14 14:11
About duplicated code and performance:

When I look at the duplicated code, I don't see anything that remotely 
looks like a performance tweak. Just to make sure, I made a bench:

#!/usr/bin/env python
import sys
sys.path.insert(0, 'Lib')
import timeit
import weakref

class Foo(object): pass
def setup():
    L = [Foo() for i in range(1000)]
    global d
    d = weakref.WeakValueDictionary(enumerate(L))
    del L[:500] # have some dead weakrefs

def do():
    d.values()

print timeit.timeit(do, setup, number=100000)

Results without the patch:
 ./python.exe weakref_bench.py
0.804216861725

Results with the patch:
$ ./python.exe weakref_bench.py
0.813000202179

I think the small difference in performance is more attributable to the 
extra processing the weakref dict does than the deduplication of the 
code itself.

About the test_weak_*_dict_flushed_dead_items_when_iters_go_out:

If a weakref dict keeps its weak reference alive, it's not an 
implementation detail, it's a bug. The whole point of using such dicts 
is to not keep keys or values alive when they go out.
msg82066 - (view) Author: Virgil Dupras (vdupras) (Python triager) Date: 2009-02-14 14:25
Oh, that's me again not correctly reading my own tests. It's the 
*_are_not_held_* tests that test that no reference is kept.
I agree about the *_flushed_dead_items_* being an implementation detail.
msg82067 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009-02-14 14:28
> Results without the patch:
>  ./python.exe weakref_bench.py
> 0.804216861725
> 
> Results with the patch:
> $ ./python.exe weakref_bench.py
> 0.813000202179

Thanks for the numbers, I see my worries were unfounded.

> About the test_weak_*_dict_flushed_dead_items_when_iters_go_out:
> 
> If a weakref dict keeps its weak reference alive, it's not an 
> implementation detail, it's a bug. The whole point of using such dicts 
> is to not keep keys or values alive when they go out.

I was talking about doing `self.assertEqual(len(d), self.COUNT)` before
deleting the iterators.
msg104842 - (view) Author: Tres Seaver (tseaver) * Date: 2010-05-03 15:08
I can confirm that the patch applies with minimal fuzz to the
release26-maint branches and the trunk, and that the added tests fail
without the updated implementation in both cases.

Furthermore, Jim's original demo script emits it error with my stock 2.6.5
Python, but is silent with the patched trunk / 2.6 branch.
msg105159 - (view) Author: Anthony Lenton (elachuni) * Date: 2010-05-06 19:33
Probably old news, but this also affects 2.5.4.
msg110338 - (view) Author: Mark Lawrence (BreamoreBoy) * Date: 2010-07-14 22:47
If this is to go forward the patch will need porting to 2.7, 3.1 and 3.2
msg111254 - (view) Author: Virgil Dupras (vdupras) (Python triager) Date: 2010-07-23 09:37
It looks like this issue has been fixed in issue7105 already. Can we close this ticket?
msg111256 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-07-23 09:49
It's not yet fixed in 2.7 or 2.6.  Updating versions.
msg111257 - (view) Author: Virgil Dupras (vdupras) (Python triager) Date: 2010-07-23 09:53
We might as well backport Antoine's patch rather than take this one (even if mine for 2.x already). It would be weird to have 2 wildly different patches to solve the same problem.

Maybe close this ticket and flag issue7105 for backporting?
msg111258 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-07-23 09:56
Agreed.
History
Date User Action Args
2010-07-23 09:56:04mark.dickinsonsetstatus: open -> closed
resolution: duplicate
superseder: weak dict iterators are fragile because of unpredictable GC runs
messages: + msg111258
2010-07-23 09:53:30vduprassetmessages: + msg111257
2010-07-23 09:49:15mark.dickinsonsetnosy: + mark.dickinson

messages: + msg111256
versions: + Python 2.6, - Python 3.1, Python 3.2
2010-07-23 09:37:48vduprassetmessages: + msg111254
2010-07-14 22:47:08BreamoreBoysetnosy: + BreamoreBoy

messages: + msg110338
versions: + Python 3.1, Python 2.7, Python 3.2, - Python 2.6, Python 2.5
2010-05-06 19:33:37elachunisetnosy: + elachuni

messages: + msg105159
versions: + Python 2.5
2010-05-03 15:08:36tseaversetnosy: + tseaver
messages: + msg104842
2009-02-14 14:28:46pitrousetmessages: + msg82067
2009-02-14 14:25:07vduprassetmessages: + msg82066
2009-02-14 14:11:27vduprassetmessages: + msg82062
2009-02-14 12:46:19pitrousetnosy: + pitrou
messages: + msg82037
2009-02-14 12:09:13ajaksu2setnosy: + ajaksu2
messages: + msg82020
stage: patch review
2008-02-24 14:25:37vduprassetfiles: + weakref_dict_iter.diff
versions: + Python 2.6
nosy: + vdupras
messages: + msg62908
keywords: + patch
type: behavior
2008-02-22 20:22:21akuchlingsettitle: interators broken for weak dicts -> iterators broken for weak dicts
2003-11-10 11:32:12dcjimcreate