Title: optimize lru_cache for functions with no arguments
Type: performance Stage: resolved
Components: Library (Lib) Versions: Python 3.10
Status: closed Resolution: rejected
Assigned To: Nosy List: ammar2, eltoder, python-dev, rhettinger, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2021-01-12 04:31 by eltoder, last changed 2022-04-11 14:59 by admin.

msg384883 - (view) Author: Eugene Toder (eltoder) * Date: 2021-01-12 04:31
It's convenient to use @lru_cache on functions with no arguments to delay doing some work until the first time it is needed. Since @lru_cache is implemented in C, it is already faster than manually caching in a closure variable. However, it can be made even faster and more memory efficient by not using the dict at all and caching just the one result that the function returns.
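
For readers unfamiliar with the pattern, here is a minimal sketch of the two approaches being compared: @lru_cache on a zero-argument function versus caching by hand in a closure variable. The names load_config, load_config_manual, and lazy are hypothetical and chosen only for illustration; this is not the code from the patch.

```python
import functools

calls = []  # records every time the expensive work actually runs


@functools.lru_cache()
def load_config():
    """Zero-argument function whose result is computed once, on first use."""
    calls.append("lru_cache")
    return {"debug": False}


def lazy(func):
    """Hand-written alternative: keep the single result in a closure variable."""
    sentinel = object()  # distinguishes "not computed yet" from a None result
    result = sentinel

    def wrapper():
        nonlocal result
        if result is sentinel:
            result = func()
        return result

    return wrapper


@lazy
def load_config_manual():
    calls.append("manual")
    return {"debug": False}


# Each function does its work only on the first call.
load_config()
load_config()
load_config_manual()
load_config_manual()
```

In both cases the second call returns the stored object without running the function body again.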

Here are my timing results. Before my changes:

$ ./python -m timeit -s "import functools; f = functools.lru_cache()(lambda: 1)" "f()"
5000000 loops, best of 5: 42.2 nsec per loop
$ ./python -m timeit -s "import functools; f = functools.lru_cache(None)(lambda: 1)" "f()"
5000000 loops, best of 5: 38.9 nsec per loop

After my changes:

$ ./python -m timeit -s "import functools; f = functools.lru_cache()(lambda: 1)" "f()"
10000000 loops, best of 5: 22.6 nsec per loop

So we get an improvement of about 80% compared to the default maxsize and about 70% compared to maxsize=None.
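
As a rough check of those percentages, the arithmetic can be sketched from the timings quoted above. The helper names speedup_pct and time_saved_pct are hypothetical; the figures read as throughput gains (before / after - 1), which is an assumption about how they were computed.

```python
# Timings in nanoseconds per call, copied from the timeit output above.
before_default = 42.2    # functools.lru_cache()
before_unbounded = 38.9  # functools.lru_cache(None)
after = 22.6             # patched functools.lru_cache()


def speedup_pct(before, after):
    """Throughput gain: how many percent more calls fit in the same time."""
    return (before / after - 1) * 100


def time_saved_pct(before, after):
    """Latency reduction: how many percent less time one call takes."""
    return (1 - after / before) * 100


print(f"vs default maxsize: {speedup_pct(before_default, after):.0f}% faster, "
      f"{time_saved_pct(before_default, after):.0f}% less time per call")
print(f"vs maxsize=None:    {speedup_pct(before_unbounded, after):.0f}% faster, "
      f"{time_saved_pct(before_unbounded, after):.0f}% less time per call")
```

Read as throughput gains, the measurements give roughly 87% and 72%; read as time saved per call, roughly 46% and 42%.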
msg384891 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2021-01-12 05:40
Just use the new @cache decorator.  It's cleaner looking in code and already sets maxsize to None, making it perfect for your application.

With respect to the proposed optimization, I'm sorry but further optimization of this already fast special case isn't worth the added complexity.  It is almost certain that these few nanoseconds won't ever matter in a real application.  The @cache decorator is already faster than calling an empty function, "def f(): return None".
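
A short sketch of the suggested spelling, for reference. functools.cache is available from Python 3.9; the function names get_settings and get_settings_lru are hypothetical.

```python
import functools


@functools.cache  # behaves like functools.lru_cache(maxsize=None)
def get_settings():
    """Computed on the first call, then served from the cache."""
    return {"retries": 3}


@functools.lru_cache(maxsize=None)
def get_settings_lru():
    """The older, longer spelling of the same unbounded cache."""
    return {"retries": 3}


get_settings()
get_settings()
# Both wrappers expose cache_info(); maxsize is None for an unbounded cache.
print(get_settings.cache_info())
print(get_settings_lru.cache_info())
```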

msg384893 - (view) Author: Eugene Toder (eltoder) * Date: 2021-01-12 06:12
As you can see in my original post, the difference between @cache (aka @lru_cache(None)) and just @lru_cache() is negligible in this case. The optimization in this PR makes a much bigger difference. At the expense of some lines of code, that's true.

Also, function calls in Python are quite slow, so being faster than a function call is not a high bar.
msg384894 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2021-01-12 06:20
Some other thoughts:

* A zero argument function that returns a constant is unlikely to ever be used in a tight loop. That would be pointless.

* The @cache decorator is already 30% faster than calling an empty function. It's very cheap.

* We really don't want the cache logic to get into the business of trying to deduce the arity of the function being cached.  That is a can of worms that we would regret opening.
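
One possible illustration of why arity deduction gets complicated (an assumption about the kind of problem meant, not a summary of the reviewer's reasoning): a function that can be called with no arguments is not necessarily a function that takes no arguments. The helper looks_zero_arg is hypothetical.

```python
import inspect


def looks_zero_arg(func):
    """Naive check (hypothetical helper): True if the signature lists no parameters."""
    return len(inspect.signature(func).parameters) == 0


def no_params():
    return 1


def with_default(scale=1):
    return scale


def var_args(*args):
    return len(args)


for func in (no_params, with_default, var_args):
    # All three can be called as func(), but only the first has an empty signature.
    print(func.__name__, "callable with no args:", func() is not None,
          "| empty signature:", looks_zero_arg(func))
```

A single-result fast path would be wrong for with_default and var_args, since with_default(2) and var_args(1, 2) must not return the value cached for the no-argument call.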
msg384895 - (view) Author: Ammar Askar (ammar2) * (Python committer) Date: 2021-01-12 06:29
Additional discussion on the same topic on discourse:
msg384896 - (view) Author: Eugene Toder (eltoder) * Date: 2021-01-12 06:39
Ammar, thank you for the link.
msg384897 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2021-01-12 07:33
FYI: The @cache decorator was added as a result of that discussion and the related one on python-ideas.
msg384952 - (view) Author: Eugene Toder (eltoder) * Date: 2021-01-12 15:28
@cache does not address the problem or any of the concerns brought up in the thread. Thread-safe @once is a nice idea, but more work of course.
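
For context, a thread-safe @once could look roughly like the sketch below. This is a hypothetical illustration using a lock with a double check, not an existing functools API and not the design discussed in the linked thread.

```python
import functools
import threading


def once(func):
    """Hypothetical decorator: run a zero-argument function at most once, even across threads."""
    lock = threading.Lock()
    sentinel = object()  # marks "not computed yet"
    result = sentinel

    @functools.wraps(func)
    def wrapper():
        nonlocal result
        if result is sentinel:          # fast path once the value exists
            with lock:
                if result is sentinel:  # re-check while holding the lock
                    result = func()
        return result

    return wrapper


call_count = 0


@once
def connect():
    global call_count
    call_count += 1
    return object()


# Call connect() from several threads; the body should run only once.
threads = [threading.Thread(target=connect) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("body executed", call_count, "time(s)")
```

Unlike a plain cache, the lock guarantees that concurrent first calls do not each execute the function body.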
