This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: WeakKeyDictionary/Mapping doesn't call __missing__
Type: enhancement Stage: patch review
Components: Library (Lib) Versions: Python 3.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: dfortunov, njs, pitrou, rhettinger
Priority: normal Keywords: patch

Created on 2017-08-22 07:31 by Antony.Lee, last changed 2022-04-11 14:58 by admin.

Messages (14)
msg300671 - (view) Author: Antony Lee (Antony.Lee) * Date: 2017-08-22 07:31
The following example, which raises a KeyError, shows that in a WeakKeyDictionary subclass that defines __missing__, that method doesn't get called.

    from weakref import WeakKeyDictionary

    class WeakKeyDictionaryWithMissing(WeakKeyDictionary):
        __missing__ = lambda: print("hello")

    class C: pass

    d = WeakKeyDictionaryWithMissing()
    d[C()]

This behavior is technically OK, as object.__missing__ is only documented in the datamodel to be called for dict subclasses, and WeakKeyDictionary is actually not a subclass of dict, but of MutableMapping.

Still, it would seem preferable if either WeakKeyDictionary did use __missing__, or perhaps, more reasonably, Mapping.__getitem__ did so.  (Or, at least, if the WeakKeyDictionary class clearly stated that it does not inherit from dict.  Note that the docs start with "Mapping class that references keys weakly. Entries in the *dictionary* etc." (emphasis mine))
msg300739 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2017-08-23 03:31
For WeakKeyDictionary, I suspect that adding a __missing__() call would make the API more tricky and would likely cause more problems than it solves.

Without a compelling use case, my recommendation would be to leave it alone.  (FWIW, all of the weak reference data structures have a long history of bugs, it seems that this is difficult to get right).
msg300789 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2017-08-24 16:15
If someone wants to submit a PR, it would help judge the complexity and fragility of adding support for this.  I cannot think of any problem upfront myself, but you're right that weak containers have been very difficult to make robust against various conditions (and it's probably not over yet).
msg300794 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2017-08-24 18:01
We should still ask about use cases though.  Without a motivating use case, why churn the code, complexify the API (possibly making it more tricky to use), and risk introducing bugs?

AFAICT, the OP's sole motivation was "still, it would seem preferable ..." which is no more compelling than the usual "it might be nice if ...".  

The weak reference containers have been around for a long time and I don't think anyone has ever reported that they actually needed this functionality.
msg300800 - (view) Author: Antony Lee (Antony.Lee) * Date: 2017-08-24 19:53
The use case is to generate a mapping of weakly-held objects to unique ids, with something like

id_map = WeakKeyDictionaryWithMissing(lambda *, _counter=itertools.count(): next(_counter))

Of course, as always when using defaultdict, it is easy enough to instead implement this by manually checking if the key is present and store a new id in this case -- but this is as well an argument for not having defaultdict in the first place.
msg300810 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2017-08-24 22:16
> Of course, as always when using defaultdict, it is easy enough to instead implement this by manually checking if the key is present and store a new id in this case 

Or, better, use setdefault(), since manually checking suffers from a race condition in the weak dictionary case.
msg300816 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2017-08-25 00:31
[Anthony Lee]
> The use case is to generate a mapping of weakly-held objects 
> to unique ids, with something like
> 
> id_map = WeakKeyDictionaryWithMissing(lambda *, _counter=itertools.count(): > next(_counter))

Where are you keeping hard references to the keys?  ISTM, you only have a weak reference, so the object has no hard references.  Entries in the dictionary are discarded when there is no longer a strong reference to the key.

Why did you decide to use a dictionary?  AFAICT, nothing external to the dictionary knows about the keys so there is no way to do lookups.

Overall, it doesn't seem like a WeakKeyDictionary with a __missing__() method is the right approach for this problem.   It makes me question where it makes any sense at all to auto-generate missing keys for a WeakKeyDictionary.
msg300817 - (view) Author: Antony Lee (Antony.Lee) * Date: 2017-08-25 01:31
The original example did not cover the use case and was only there to show the current behavior.  The real use case is more something like

obj = (get obj as argument to function, caller has a hard reference to it)
uid = d[obj]
msg300818 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2017-08-25 03:04
Just for comparison, what is your best solution without the proposed API change?    It is helpful to see user code before-and-after to know whether there is an improvement that makes the change worth it.
msg300821 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2017-08-25 03:52
One way to do it:
-----------------

diff --git a/Lib/weakref.py b/Lib/weakref.py
index 1802f32a20..18f26ea8b2 100644
--- a/Lib/weakref.py
+++ b/Lib/weakref.py
@@ -136,6 +136,8 @@ class WeakValueDictionary(collections.abc.MutableMapping):
             self._commit_removals()
         o = self.data[key]()
         if o is None:
+            if hasattr(self, '__missing__'):
+                return self.__missing__(key)
             raise KeyError(key)
         else:
             return o

Another way to do it:
---------------------

diff --git a/Lib/weakref.py b/Lib/weakref.py
index 1802f32a20..9951b0fb06 100644
--- a/Lib/weakref.py
+++ b/Lib/weakref.py
@@ -131,12 +131,15 @@ class WeakValueDictionary(collections.abc.MutableMapping):
             key = l.pop()
             _remove_dead_weakref(d, key)

+    def __missing__(self, key):
+            raise KeyError(key)
+
     def __getitem__(self, key):
         if self._pending_removals:
             self._commit_removals()
         o = self.data[key]()
         if o is None:
-            raise KeyError(key)
+            return self.__missing__(key)
         else:
             return o
msg300826 - (view) Author: Antony Lee (Antony.Lee) * Date: 2017-08-25 05:35
For my use case, it was easy enough to wrap the `uid = d[obj]` in a try... catch... plus locking.

Using setdefault would cause the counter to be incremented every time.  In truth, here I did not care about having consecutive uids, so that would have worked just as well.

In real truth, it later turned out that I didn't really need numeric uids anyways; I could just use weakrefs to the object themselves instead.
msg300831 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2017-08-25 10:13
Antony, if you write (line numbers added for clarity):

1. try:
2.    d[obj]
3. except KeyError:
4.    d[obj] = some new value...
5. # expect d[obj] to exist at this point

it is possible that d[obj] still exists at line 2 but not anymore at line 5 (because there would have been a garbage collection run in-between), at least if the original key is not `obj` but some other object equal to `obj`.  Using setdefault() would ensure that doesn't happen.
msg300848 - (view) Author: Antony Lee (Antony.Lee) * Date: 2017-08-25 17:19
Thanks for the clarification!

"at least if the original key is not `obj` but some other object equal to `obj`" does not apply in my case so I'm fine, but the example shows that this is tricky to get right...
msg370697 - (view) Author: Nathaniel Smith (njs) * (Python committer) Date: 2020-06-04 07:34
Just found this while searching to see if we had an existing way to combine WeakValueDictionary + defaultdict. The use case I've run into a few times is where you need a keyed collection of mutexes, like e.g. asyncio.Lock objects. You want to make sure that if there are multiple tasks trying to access the same key/resource, they all acquire the same lock, but you don't want to use a regular defaultdict, because that will end up growing endlessly as different keys/resources are encountered.

It'd be very handy to do something like:

lock_dict = WeakValueDictionary(default=asyncio.Lock)

async with lock_dict[key]:
    ...

and have everything magically work.

The alternative is to open-code the default handling at the call site, either:

# Wasteful: creates a new Lock object on every usage,
# regardless of whether it's actually needed.
async with lock_dict.setdefault(key, asyncio.Lock()):
    ...

Or else the verbose:

lock = lock_dict.get(key)
if lock is None:
    lock = lock_dict[key] = asyncio.Lock()
async with lock:
    ...
History
Date User Action Args
2022-04-11 14:58:51adminsetgithub: 75437
2020-06-07 17:16:01Antony.Leesetnosy: - Antony.Lee
2020-06-06 23:45:54njssetpull_requests: - pull_request19900
2020-06-06 23:40:56dfortunovsetkeywords: + patch
nosy: + dfortunov

pull_requests: + pull_request19900
stage: patch review
2020-06-04 07:34:47njssetnosy: + njs
messages: + msg370697
2017-08-25 17:19:19Antony.Leesetmessages: + msg300848
2017-08-25 10:13:02pitrousetmessages: + msg300831
2017-08-25 05:35:20Antony.Leesetmessages: + msg300826
2017-08-25 03:52:28rhettingersetmessages: + msg300821
2017-08-25 03:04:22rhettingersetmessages: + msg300818
2017-08-25 01:31:42Antony.Leesetmessages: + msg300817
2017-08-25 00:31:23rhettingersetmessages: + msg300816
2017-08-24 22:16:03pitrousetmessages: + msg300810
2017-08-24 19:53:45Antony.Leesetmessages: + msg300800
2017-08-24 18:01:02rhettingersetmessages: + msg300794
2017-08-24 16:15:04pitrousetnosy: + pitrou
messages: + msg300789
2017-08-23 03:31:28rhettingersetversions: - Python 3.6
nosy: + rhettinger

messages: + msg300739

type: enhancement
2017-08-22 07:31:12Antony.Leecreate