classification
Title: lru_cache enhancement: lru_timestamp helper function
Type: enhancement Stage:
Components: Library (Lib) Versions: Python 3.4
process
Status: closed Resolution: rejected
Dependencies: Superseder:
Assigned To: rhettinger Nosy List: benhoyt, peter@psantoro.net, rhettinger
Priority: low Keywords:

Created on 2013-07-28 12:32 by peter@psantoro.net, last changed 2014-09-03 16:05 by benhoyt. This issue is now closed.

Files
File name Uploaded Description Edit
lru.py peter@psantoro.net, 2013-07-28 12:32 lru_timestamp implementation and test script
lru.py peter@psantoro.net, 2013-07-28 14:33 updated lru_timestamp implementation and test script
lru.py peter@psantoro.net, 2013-07-29 20:16 updated lru_timestamp implementation and test script
Messages (7)
msg193820 - Author: Peter Santoro (peter@psantoro.net) * Date: 2013-07-28 12:32
The attached proposed lru_timestamp function provides developers with more control over how often lru_cache entries are refreshed.  Doc string follows:

def lru_timestamp(refresh_interval=60):
    """ Return a timestamp string for @lru_cache decorated functions.

    The returned timestamp is used as the value of an extra parameter
    to @lru_cache decorated functions, allowing for more control over
    how often cache entries are refreshed. The lru_timestamp function
    should be called with the same refresh_interval value for a given
    lru_cache decorated function. 

    Positional arguments:
    refresh_interval -- 1-1440 minutes (default 60) as int or float

    """

Rationale:

Some functions have input parameters that rarely change, yet return different results over time.  It would be nice to have a ready-made solution to force lru_cache entries to be refreshed at specified time intervals.

A common example is using a stable userid to read user information from a database.  By itself, the lru_cache decorator can be used to cache the user information and prevent unnecessary i/o.  However, if a given user's information is updated in the database, but the associated lru_cache entry has not yet been discarded, the application will be using stale data.  The lru_timestamp function is a simple, ready-made helper function that gives the developer more control over the age of lru_cache entries in such situations.

Sample usage:

@functools.lru_cache()
def user_info(userid, timestamp):
    # expensive database i/o, but value changes over time
    # the timestamp parameter is normally not used, it is
    # for the benefit of the @lru_cache decorator
    pass

# read user info from database, if not in cache or
# older than 120 minutes
info = user_info('johndoe', functools.lru_timestamp(120))
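
The mechanism behind the sample usage can be shown without a real clock. The following is a minimal sketch, not taken from the attached lru.py: the strings 'bucket-1' and 'bucket-2' are hypothetical stand-ins for two successive values returned by lru_timestamp, and the `calls` list is only there to make the "database reads" visible.

```python
import functools

calls = []  # records every simulated database read


@functools.lru_cache(maxsize=None)
def user_info(userid, timestamp):
    # The timestamp argument is unused in the body; it only becomes
    # part of the cache key, so a new value forces a fresh lookup.
    calls.append((userid, timestamp))
    return {'userid': userid, 'loaded_in': timestamp}


user_info('johndoe', 'bucket-1')  # miss: simulated read
user_info('johndoe', 'bucket-1')  # hit: same key, no read
user_info('johndoe', 'bucket-2')  # miss: timestamp changed, read again

print(user_info.cache_info())
```

Note that the entry keyed on 'bucket-1' is not removed when the timestamp changes; it stays in the cache until normal LRU eviction pushes it out, so a bounded maxsize is probably advisable with this pattern.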
msg193827 - Author: Peter Santoro (peter@psantoro.net) * Date: 2013-07-28 14:33
I updated my proposed lru_timestamp function with the following changes:

1) restricted refresh_interval to int type
2) updated doc string

Updated doc string follows:

def lru_timestamp(refresh_interval=60):
    """ Return a timestamp string for @lru_cache decorated functions.

    The returned timestamp is used as the value of an extra parameter
    to @lru_cache decorated functions, allowing for more control over
    how often cache entries are refreshed. The lru_timestamp function
    should be called with the same refresh_interval value for a given
    @lru_cache decorated function.  The returned timestamp is for the
    benefit of the @lru_cache decorator and is normally not used by
    the decorated function.

    Positional arguments:
    refresh_interval -- in minutes (default 60), values less than 1
                        are coerced to 1, values more than 1440 are
                        coerced to 1440

    """
msg193896 - Author: Peter Santoro (peter@psantoro.net) * Date: 2013-07-29 20:16
I updated my proposed lru_timestamp function with the following change:

1) raise TypeError instead of ValueError
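
For readers without access to the attachment, here is a minimal sketch of a helper that matches the behaviour described in the messages above (int-only interval, TypeError otherwise, values coerced into 1-1440 minutes). It is an assumption-laden illustration, not the code in lru.py: the bucketing by integer division, the returned string format, and the `now` parameter (added only so the function can be exercised deterministically) are all hypothetical.

```python
import time


def lru_timestamp(refresh_interval=60, now=None):
    """Return a string that changes once per refresh_interval minutes."""
    if not isinstance(refresh_interval, int):
        raise TypeError('refresh_interval must be an int (minutes)')
    # Coerce out-of-range values instead of rejecting them.
    minutes = min(max(refresh_interval, 1), 1440)
    if now is None:
        now = time.time()  # seconds since the epoch
    # All calls that fall in the same interval share one bucket number.
    bucket = int(now // (minutes * 60))
    return '{0}:{1}'.format(minutes, bucket)


# Two calls inside the same hour produce the same cache-key component.
assert lru_timestamp(60, now=0) == lru_timestamp(60, now=3599)
```

With this scheme an entry is refreshed at fixed interval boundaries rather than exactly refresh_interval minutes after it was cached, so an entry created just before a boundary is refreshed almost immediately.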
msg197355 - Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2013-09-09 07:56
This is a pretty interesting idea.

Ideally, it would be great if it could be published as a recipe somewhere so that people could experiment with the API and try out variations.  If there were good uptake by users, it would help justify a proposal to be included in the standard library.
msg199363 - Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2013-10-10 05:48
Please publish this outside the standard library so it can mature and get user feedback.  I think it would be premature to add it right now.   The subject of cache entry invalidation or expiration is broad.  I'm not sure this is the best way to do it.
msg209969 - Author: Peter Santoro (peter@psantoro.net) * Date: 2014-02-02 11:15
As requested, I published this for review on http://code.activestate.com/recipes/578817-lru_timestamp-cache-entry-aging-for-functoolslru_c/
msg226312 - Author: Ben Hoyt (benhoyt) * Date: 2014-09-03 16:05
I really like this idea (and am needing this functionality), but I don't think this API (or implementation) is very nice:

1) It means you have to change your function signature to use the timeout feature.

2) Specifying the interval in minutes seems odd (most similar timeouts in Python are specified in seconds).

I would love to see an optional timeout=seconds keyword arg to the lru_cache() decorator, or some other way to support this.
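
Something close to a timeout keyword can be approximated today without touching the decorated function's signature. The sketch below is one possible approach, not an existing functools API: `timed_lru_cache` and its `clock` parameter are hypothetical names, and the time bucket is passed to an inner lru_cache-wrapped function as a hidden first argument.

```python
import functools
import time


def timed_lru_cache(timeout, maxsize=128, typed=False, clock=time.monotonic):
    """Like lru_cache, but keys also depend on a timeout-second bucket."""
    def decorator(func):
        @functools.lru_cache(maxsize=maxsize, typed=typed)
        def cached(_bucket, *args, **kwargs):
            # _bucket is only part of the cache key; func never sees it.
            return func(*args, **kwargs)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            bucket = int(clock() // timeout)
            return cached(bucket, *args, **kwargs)

        # Expose the usual introspection helpers of the inner cache.
        wrapper.cache_info = cached.cache_info
        wrapper.cache_clear = cached.cache_clear
        return wrapper
    return decorator


@timed_lru_cache(timeout=120 * 60)  # seconds, as suggested in point 2
def user_info(userid):
    return {'userid': userid}
```

As with the timestamp-argument approach, entries go stale at bucket boundaries rather than exactly timeout seconds after insertion, and old buckets linger until evicted.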

Raymond, what do you think would be the simplest way to hook this in?

One way I think would be nice (and also support other neat things) is to allow you to specify the dict-like object that's used for the cache (defaults to "dict", of course). So the lru_cache signature would change to:

def lru_cache(maxsize=128, typed=False, cache=None):
    ...

From looking at the source, cache would need to support these methods: get, clear, __setitem__, __contains__, __len__, __delitem__

Would this just work? Or could there be a race condition if __contains__ (key in cache) returned True but then cache.get(key) returned False a bit later?

In any case, this seems nice and general to me, and would mean you could implement a simple ExpiringDict() and then pass that as cache=ExpiringDict(expiry_time).
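
To make the idea concrete, here is a rough sketch of what such an ExpiringDict might look like, covering the methods listed above. It is purely illustrative: lru_cache has no cache parameter, and the class name, the `clock` argument, and the lazy purge-on-access strategy are all assumptions.

```python
import time


class ExpiringDict:
    """Mapping-like cache whose entries expire expiry_time seconds after set."""

    def __init__(self, expiry_time, clock=time.monotonic):
        self._expiry = expiry_time
        self._clock = clock
        self._data = {}  # key -> (value, deadline)

    def _live(self, key):
        # Return the stored (value, deadline) pair, or None if the key
        # is missing or expired; expired entries are purged lazily here.
        item = self._data.get(key)
        if item is None:
            return None
        if self._clock() >= item[1]:
            del self._data[key]
            return None
        return item

    def get(self, key, default=None):
        item = self._live(key)
        return default if item is None else item[0]

    def __setitem__(self, key, value):
        self._data[key] = (value, self._clock() + self._expiry)

    def __contains__(self, key):
        return self._live(key) is not None

    def __delitem__(self, key):
        del self._data[key]

    def __len__(self):
        # May count expired entries that have not been accessed yet.
        return len(self._data)

    def clear(self):
        self._data.clear()
```

On the race-condition question: an entry can expire between a `key in cache` check and a later `cache.get(key)`, so a caller would be safer doing a single `get` with a unique sentinel default and treating the sentinel as a miss.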

Thoughts?
History
Date                 User                Action  Args
2014-09-03 16:05:42  benhoyt             set     nosy: + benhoyt; messages: + msg226312
2014-02-02 11:15:36  peter@psantoro.net  set     messages: + msg209969
2013-10-10 05:48:15  rhettinger          set     status: open -> closed; resolution: rejected; messages: + msg199363
2013-09-09 07:56:20  rhettinger          set     priority: normal -> low; messages: + msg197355
2013-07-29 20:16:35  peter@psantoro.net  set     files: + lru.py; messages: + msg193896
2013-07-28 18:38:59  rhettinger          set     assignee: rhettinger; nosy: + rhettinger; versions: + Python 3.4, - Python 3.3
2013-07-28 14:33:04  peter@psantoro.net  set     files: + lru.py; messages: + msg193827
2013-07-28 12:32:51  peter@psantoro.net  create