classification
Title: Document that lru_cache uses hard references
Type: resource usage Stage: patch review
Components: Library (Lib) Versions: Python 3.11, Python 3.10, Python 3.9
process
Status: open Resolution: duplicate
Dependencies: Superseder: functools.lru_cache keeps objects alive forever
View: 19859
Assigned To: rhettinger Nosy List: Wouter De Borger2, cryvate, kj, nanjekyejoannah, pablogsal, python-dev, rhettinger, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2021-06-04 11:45 by Wouter De Borger2, last changed 2021-06-08 15:42 by cryvate.

Pull Requests
URL Status Linked Edit
PR 26528 open python-dev, 2021-06-04 12:22
Messages (10)
msg395075 - (view) Author: Wouter De Borger (Wouter De Borger2) Date: 2021-06-04 11:45
# Problem

The functools.lru_cache decorator keeps all arguments to the function alive in memory (including self), causing hard-to-find memory leaks.

# Expected  

I had assumed that lru_cache would keep weak references, so that when an object is garbage collected, all its cache entries expire as unreachable. This is not the case.
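
The behavior described above can be observed with a short sketch. The class `Payload` and the function `describe` are hypothetical names used only for illustration; a weak reference serves as a probe to check whether the argument is still alive.

```python
import functools
import gc
import weakref


class Payload:
    """Hypothetical stand-in for any hashable, weak-referenceable argument."""


@functools.lru_cache(maxsize=None)
def describe(obj):
    # The return value does not reference obj; only the cache key does.
    return f"payload at {id(obj):#x}"


payload = Payload()
describe(payload)
probe = weakref.ref(payload)  # lets us observe whether the object is still alive

del payload                   # drop the only reference held by our own code
gc.collect()
alive_while_cached = probe() is not None   # True: the cache key keeps it alive

describe.cache_clear()        # drop every cache entry
gc.collect()
alive_after_clear = probe() is not None    # False: nothing references it now
```

With an unbounded cache (`maxsize=None`), the argument stays reachable until `cache_clear()` is called or the decorated function itself goes away.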

# Solutions 

1. I think it is worth at least mentioning this behavior in the documentation.
2. I also think it would be good if the LRU cache actually uses weak references. 

I will try to make a PR for this.
msg395087 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-06-04 14:46
Using a weak dictionary is not a correct solution, as the cache must take strong ownership of the arguments and return value to do its job properly. Moreover, many types in Python don't support weak references, so this would be a backwards-incompatible change that limits the cache considerably.
msg395088 - (view) Author: Ken Jin (kj) * (Python triager) Date: 2021-06-04 14:55
@Wouter
Hmm, I thought most use cases of lru_cache benefit from strong references for predictable hit rates? I'm not an expert in this area, so I nosied-in someone else who is.

However, I noticed that the current doc doesn't mention the strong reference behavior anywhere. So I think your suggestion to amend the docs is an improvement, thanks!
msg395125 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2021-06-04 21:31
Also note that many important objects in Python are not weak referenceable, tuples for example.
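
A small check, written as a sketch, of which common argument types can be weakly referenced. The helper `supports_weakref` and the class `Plain` are hypothetical names introduced for this example.

```python
import weakref


class Plain:
    """Hypothetical user-defined class; its instances support weak references."""


def supports_weakref(obj):
    """Return True if a weak reference to obj can be created."""
    try:
        weakref.ref(obj)
    except TypeError:
        return False
    return True


results = {
    "tuple": supports_weakref((1, 2)),
    "int": supports_weakref(42),
    "str": supports_weakref("key"),
    "instance": supports_weakref(Plain()),
}
```

In CPython, the built-in tuple, int, and str entries come out False, while the instance of the user-defined class comes out True. A cache restricted to weak references could therefore not hold the most common kinds of arguments.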
msg395144 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2021-06-05 00:06
I'm thinking of a more minimal and targeted edit than what is in the PR.   

Per the dev guide, we usually word the docs in an affirmative and specific manner (here is what the tool does and an example of how to use it).  Recounting a specific debugging case or misassumption usually isn't worthwhile unless it is a common misconception.

For strong versus weak references, we've had no previous reports even though lru_cache() has been around for a long time.  Likely, that is because the standard library uses strong references everywhere unless specifically documented to the contrary.  Otherwise, we would have to add a strong reference note to every stateful object in the language.

Another reason that it likely hasn't mattered to other users is that an LRU cache automatically purges old entries.  If an object is no longer used, it cycles out as new items are added to the cache.  Arguably, a key feature of an LRU algorithm is that you don't have to think about the lifetime of objects.
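
The purging described here can be sketched with a deliberately tiny cache. `Request` and `handle` are hypothetical names; `maxsize=2` is chosen only to make the eviction easy to see.

```python
import functools
import gc
import weakref


class Request:
    """Hypothetical argument type used only for illustration."""


@functools.lru_cache(maxsize=2)
def handle(request):
    return "ok"


first = Request()
probe = weakref.ref(first)   # observe the lifetime of the first argument
handle(first)
del first                    # from here on, only the cache references it

handle(Request())            # cache now holds two entries
still_cached = probe() is not None   # True: the first entry has not been evicted

handle(Request())            # a third entry pushes out the least recently used one
gc.collect()
evicted = probe() is None    # True: the first argument has been released
```

Once the cache is full, each new entry displaces the least recently used one, so an argument that is never looked up again is eventually released.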

I'll think about it for a while and will propose an alternate edit that focuses on how the cache works with methods.  The essential point is that the instance is included in the cache key (which is usually what people want).  Discussing weak vs strong references is likely just a distractor.
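
A minimal sketch of the point about methods: because `self` is one of the call arguments, it becomes part of the cache key. The class `Sensor` and its `reading` method are hypothetical and exist only for this illustration.

```python
import functools


class Sensor:
    """Hypothetical class used only for illustration."""

    def __init__(self, offset):
        self.offset = offset

    @functools.lru_cache(maxsize=None)
    def reading(self, raw):
        # The cache key is built from (self, raw), so each instance
        # gets its own entries even when raw is the same.
        return raw + self.offset


a = Sensor(offset=1)
b = Sensor(offset=100)

values = (a.reading(10), b.reading(10), a.reading(10))
info = Sensor.reading.cache_info()   # statistics for the single shared cache
```

The two instances produce separate entries for the same `raw` value, which is why the results stay correct per instance. It is also why each instance remains referenced for as long as its entries are in the cache.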
msg395149 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-06-05 02:13
Agreed! I will leave the PR to you :)
msg395152 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2021-06-05 04:34
It may be useful to link back to @cached_property() for folks wanting method caching tied to the lifespan of an instance rather than actual LRU logic.
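
A sketch of the cached_property alternative, assuming the cached value takes no arguments besides the instance. The class `Report` and its attributes are hypothetical names.

```python
import functools
import gc
import weakref


class Report:
    """Hypothetical class used only for illustration."""

    def __init__(self, rows):
        self.rows = rows
        self.computations = 0

    @functools.cached_property
    def total(self):
        # Runs once per instance; the result is stored on the instance itself.
        self.computations += 1
        return sum(self.rows)


report = Report([1, 2, 3])
totals = (report.total, report.total)   # second access reuses the stored value
computations = report.computations

probe = weakref.ref(report)
del report
gc.collect()
collected = probe() is None   # True: no class-level cache keeps the instance alive
```

Because the cached value lives in the instance's own attribute dictionary, it is released together with the instance.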
msg395157 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2021-06-05 09:22
This is a full duplicate of issue19859. Both ideas of using weak references and changing documentation were rejected.
msg395158 - (view) Author: Joannah Nanjekye (nanjekyejoannah) * (Python committer) Date: 2021-06-05 09:31
I saw that thread, but the idea was rejected there by @rhettinger, who this time seems to be suggesting the documentation changes himself.

Maybe he has changed his mind, in which case he can explain the circumstances of his decisions if he wants.
msg395336 - (view) Author: Henk-Jaap Wagenaar (cryvate) * Date: 2021-06-08 15:42
Reading this bug thread last week made me realize we had made the following error in our code:


import functools


class SomethingView():
    @functools.lru_cache()
    def get_object(self):
        return self._object


Now, as this class was instantiated for every (particular kind of) request to a webserver and this method was called (a few times), the lru_cache just kept filling up and up. We had been having a memory leak we couldn't track down, and this was it.

I think this is an easy mistake to make. It was rooted not so much in hard references (although without them it would not have leaked memory) as in the fact that the cache lives on the class and not on the object.
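
The class-level cache can be made visible with a sketch like the following. The `__init__` method and the loop of 200 simulated requests are assumptions added for illustration and are not part of the original code.

```python
import functools


class SomethingView:
    def __init__(self, obj):
        # Assumed constructor; the original snippet does not show one.
        self._object = obj

    @functools.lru_cache()   # one cache for the whole class, default maxsize=128
    def get_object(self):
        return self._object


for request_number in range(200):      # simulate 200 requests, one view each
    view = SomethingView(object())
    view.get_object()                  # miss: adds an entry keyed on this view
    view.get_object()                  # hit: same view, same key

info = SomethingView.get_object.cache_info()
```

Every instance adds an entry to the same cache, so with the default settings the class retains up to `maxsize` view instances, along with everything they reference, after the requests that created them have finished.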
History
Date User Action Args
2021-06-08 15:42:07 cryvate set nosy: + cryvate; messages: + msg395336
2021-06-05 09:31:58 nanjekyejoannah set messages: + msg395158
2021-06-05 09:22:10 serhiy.storchaka set nosy: + serhiy.storchaka; messages: + msg395157; resolution: duplicate; superseder: functools.lru_cache keeps objects alive forever
2021-06-05 04:34:04 rhettinger set messages: + msg395152
2021-06-05 02:13:28 pablogsal set messages: + msg395149
2021-06-05 00:06:37 rhettinger set messages: + msg395144
2021-06-04 21:31:51 rhettinger set messages: + msg395125; title: lru_cache memory leak -> Document that lru_cache uses hard references
2021-06-04 21:27:57 rhettinger set assignee: rhettinger
2021-06-04 21:19:57 terry.reedy set versions: - Python 3.6, Python 3.7, Python 3.8
2021-06-04 14:55:49 kj set nosy: + rhettinger, kj; messages: + msg395088
2021-06-04 14:46:56 pablogsal set messages: + msg395087
2021-06-04 12:39:59 nanjekyejoannah set nosy: + pablogsal, nanjekyejoannah
2021-06-04 12:22:20 python-dev set keywords: + patch; nosy: + python-dev; pull_requests: + pull_request25122; stage: patch review
2021-06-04 11:53:01 Wouter De Borger2 set type: resource usage
2021-06-04 11:45:58 Wouter De Borger2 create