This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: Generators with lru_cache can be non-intuitive
Type: behavior Stage: resolved
Components: Versions:
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: rhettinger Nosy List: exhuma, rhettinger, serhiy.storchaka
Priority: normal Keywords:

Created on 2018-06-11 07:16 by exhuma, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Messages (3)
msg319279 - (view) Author: Michel Albert (exhuma) * Date: 2018-06-11 07:16
Consider the following code:

    # filename: foo.py

    from functools import lru_cache


    @lru_cache(10)
    def bar():
        yield 10
        yield 20
        yield 30


    # This loop will work as expected
    for row in bar():
        print(row)

    # This loop will not loop over anything.
    # The cache will return an already consumed generator.
    for row in bar():
        print(row)


This behaviour is natural, but it is almost invisible to the caller of "bar".

The main issue is one of "surprise". When inspecting the output of "bar" it is clear that the output is a generator:

    >>> import foo
    >>> foo.bar()
    <generator object bar at 0x7fbfecb66a40>

**Very** careful inspection will reveal that each call will return the same generator instance.
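A quick identity check makes this visible; the following is a small self-contained sketch reusing the `bar` definition from `foo.py` above:

```python
from functools import lru_cache

@lru_cache(10)
def bar():
    yield 10
    yield 20
    yield 30

first = bar()
second = bar()
print(first is second)  # True: the cache hands back the identical generator
print(list(first))      # [10, 20, 30] consumes the shared generator
print(list(second))     # []: the same object is already exhausted
```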

So to an observant user the following is an expected behaviour:

    >>> result = foo.bar()
    >>> for row in result:
    ...    print(row)
    ...
    10
    20
    30
    >>> for row in result:
    ...     print(row)
    ...
    >>>

However, the following is not:

    >>> import foo
    >>> result = foo.bar()
    >>> for row in result:
    ...     print(row)
    ...
    10
    20
    30
    >>> result = foo.bar()
    >>> for row in result:
    ...     print(row)
    ...
    >>>


Would it make sense to emit a warning (or even raise an exception) in `lru_cache` if the return value of the cached function is a generator?

I can think of situations where it makes sense to combine the two. For example, the situation I am currently in:

I have a piece of code which loops several times over the same SNMP table. Having a generator makes the application far more responsive. And having the cache makes it even faster on subsequent calls. But the gain I get from the cache is bigger than the gain from the generator. So I would be okay with converting the result to a list before storing it in the cache.
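A possible workaround along those lines, sketched here as an assumption rather than anything from this issue's resolution (the `fetch_rows` and `cached_rows` names are hypothetical): cache a wrapper that materialises the generator into a list, so the cache stores a reusable value.

```python
from functools import lru_cache

def fetch_rows():
    # Stand-in for the expensive generator (e.g. walking an SNMP table).
    yield 10
    yield 20
    yield 30

@lru_cache(maxsize=10)
def cached_rows():
    # Materialise the generator so the cache stores a reusable list
    # instead of a one-shot generator object.
    return list(fetch_rows())

print(cached_rows())  # first call runs the generator fully
print(cached_rows())  # subsequent calls reuse the same cached list
```

This trades the generator's laziness for repeatability: the first call pays the full cost, and every later call is a cache hit returning the same list.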

What is your opinion on this issue? Would it make sense to add a warning?
msg319281 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-06-11 08:02
No, this will break cases when you need to cache generators.

There are many ways of using lru_cache improperly, and we can't distinguish incorrect uses from intentional correct uses.
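As a hypothetical illustration of such an intentional use (not taken from this issue): a per-namespace unique-ID generator relies on every caller sharing the same cached generator, so exhaustion-style sharing is the point rather than a bug.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def id_stream(namespace):
    # Intentionally cached generator: every caller asking for the same
    # namespace shares the single cached stream, so IDs never repeat.
    n = 0
    while True:
        yield f"{namespace}-{n}"
        n += 1

print(next(id_stream("user")))   # user-0
print(next(id_stream("user")))   # user-1 (same shared generator advances)
print(next(id_stream("order")))  # order-0 (a different cached stream)
```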
msg319363 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2018-06-12 05:25
Serhiy is correct.  In general, there is no way to detect when someone is caching something that should not be cached (i.e. impure functions).
History
Date                 User              Action  Args
2022-04-11 14:59:01  admin             set     github: 78008
2018-06-12 05:25:46  rhettinger        set     status: open -> closed
                                               resolution: not a bug
                                               messages: + msg319363
                                               stage: resolved
2018-06-11 08:02:25  serhiy.storchaka  set     assignee: rhettinger
                                               messages: + msg319281
                                               nosy: + rhettinger, serhiy.storchaka
2018-06-11 07:16:13  exhuma            create