msg111790 - (view) |
Author: Nick Coghlan (ncoghlan) * |
Date: 2010-07-28 11:27 |
The standard library has several cache implementations (e.g. in re, fnmatch and ElementTree) with different cache size limiting strategies. These should be standardised and possibly even exposed for general use.
Refer to python-dev discussion:
http://mail.python.org/pipermail/python-dev/2010-July/102473.html
|
msg113273 - (view) |
Author: Nick Coghlan (ncoghlan) * |
Date: 2010-08-08 15:44 |
With Raymond adding functools.lru_cache and functools.lfu_cache, it should be possible to use those for the various caches in the standard library.
My only point of concern is that the standard lib caches tend to allow dynamic modification of the max cache size, while the new cache decorators appear to be fixed at the size specified when defining the decorator.
|
msg113300 - (view) |
Author: Raymond Hettinger (rhettinger) * |
Date: 2010-08-08 19:22 |
I will take a look at the various caches and see if some of their features can be consolidated. It is okay if some need to have their own strategies or situation specific approaches, but you're right that common features should be looked at to see if they make sense in the functools caches.
Am not sure that dynamically changing the maxsize is a generally useful feature (most are set to the largest size that makes sense and not revisited latter), but I will take a look to whether that complicates the code (if it's simple and fast, it may be worth doing).
|
msg113374 - (view) |
Author: Nick Coghlan (ncoghlan) * |
Date: 2010-08-09 04:08 |
Yeah, I'm not sure the dynamic resizing makes sense in general. I was just pointing it out as something supported by the existing caches that could complicate a consolidation effort.
|
msg113378 - (view) |
Author: Raymond Hettinger (rhettinger) * |
Date: 2010-08-09 04:29 |
Applied the lru_cache() to fnmatch and re.
See r83874 r83871.
I did find a simple way to dynamically resize the maxcache, but did not apply it yet. Will look at more applications to see if it is really needed.
Nick, thanks for the great ideas. These changes simplified the code where they were applied and resulted in a smarter caching strategy.
|
msg113382 - (view) |
Author: Nick Coghlan (ncoghlan) * |
Date: 2010-08-09 04:52 |
On Mon, Aug 9, 2010 at 2:29 PM, Raymond Hettinger
<report@bugs.python.org> wrote:
> I did find a simple way to dynamically resize the maxcache, but did not apply it yet. Will look at more applications to see if it is really needed.
>
> Nick, thanks for the great ideas. These changes simplified the code where they were applied and resulted in a smarter caching strategy.
The reason I mentioned the dynamic sizing specifically was that the
discussions where we realised we had all these different caches
floating around had to do with tuning the caching strategy to a
particular application based on speed vs memory trade-offs. While we
can pick a number that is a reasonable default, a server deployment
may want to turn the dial towards faster response times with higher
memory consumption, while an embedded device may want to push the dial
the other way. The smarter caching strategies you added are likely to
help far more than the blunt hammer approach of increasing the cache
size, but the speed/memory trade-off in choosing that size is still a
question that has no universally correct answer.
|
msg113383 - (view) |
Author: Raymond Hettinger (rhettinger) * |
Date: 2010-08-09 05:11 |
I was thinking about the problem of developers wanting a different cache size than that provided in standard lib modules.
ISTM that now we've offered caching abilities in functools, a developer can easily add another layer of cache around any API they are interested in. For example, if someone is using thousands of recurring fnmatch patterns, they can write something like:
@functools.lfu_cache(maxsize=10000) # custom fat cache
def fnmatchcase(*args):
return fnmatch.fnmatchcase(*args)
IMO, this beats adding caching controls to lots of APIs that should be focused only on their problem domain. IOW, it is probably not a good idea to add purge() and cache_resize() functions to multiple modules throughout the standard lib.
ISTM, we should just provide basic caching with reasonable space consumption (i.e. not huge) that gives improvements to common use cases (like I've done with the fnmatch and re module) and let programmers with unusual cases add their own caching options rather that be tied into our choice of lru vs lfu or whatnot.
|
msg113385 - (view) |
Author: Nick Coghlan (ncoghlan) * |
Date: 2010-08-09 05:51 |
On Mon, Aug 9, 2010 at 3:11 PM, Raymond Hettinger
<report@bugs.python.org> wrote:
> ISTM, we should just provide basic caching with reasonable space consumption (i.e. not huge) that gives improvements to common use cases (like I've done with the fnmatch and re module) and let programmers with unusual cases add their own caching options rather that be tied into our choice of lru vs lfu or whatnot.
A very good point! Perhaps we should note that somewhere? I'm not sure
where though, unless we just mention it in the docs for the relevant
modules..
Going the other way (using a smaller, or no, cache), perhaps in
addition to the new hit/miss attributes, the cache decorators should
expose the original function to allow the cache to be bypassed?
|
msg113570 - (view) |
Author: Raymond Hettinger (rhettinger) * |
Date: 2010-08-10 23:31 |
Great minds think alike. I was just about to propose that functools.wraps add a standard attribute to point at the underlying function (on the theory that objects should be introspectable). This would allow a standard way to get to the underlying unwrapped functions.
|
msg113580 - (view) |
Author: Éric Araujo (eric.araujo) * |
Date: 2010-08-11 02:38 |
After discussion with RDM on IRC, I’m opening a new report to track this feature request separately. (It’s also a dependency of this bug.)
|
msg113791 - (view) |
Author: Nick Coghlan (ncoghlan) * |
Date: 2010-08-13 16:15 |
Have we had any luck getting this to play nicely with the buildbots yet? (I lost track of where the last checkin got to). The necessary Modules/Setup change to adjust when _collections is built should have propagated through by now.
|
msg114647 - (view) |
Author: Éric Araujo (eric.araujo) * |
Date: 2010-08-22 08:51 |
Raymond, out of curiosity, can you tell why you removed lfu_cache?
|
|
Date |
User |
Action |
Args |
2022-04-11 14:57:04 | admin | set | github: 53642 |
2010-08-22 08:51:32 | eric.araujo | set | messages:
+ msg114647 |
2010-08-22 08:06:14 | rhettinger | set | status: open -> closed resolution: fixed |
2010-08-13 16:15:11 | ncoghlan | set | messages:
+ msg113791 |
2010-08-11 02:38:21 | eric.araujo | set | assignee: r.david.murray -> rhettinger dependencies:
+ Add attribute pointing to wrapped function in functools.update_wrapper messages:
+ msg113580 |
2010-08-10 23:31:36 | rhettinger | set | status: closed -> open priority: normal -> low
assignee: rhettinger -> r.david.murray
nosy:
+ r.david.murray messages:
+ msg113570 resolution: fixed -> (no value) |
2010-08-09 05:51:02 | ncoghlan | set | messages:
+ msg113385 |
2010-08-09 05:11:46 | rhettinger | set | messages:
+ msg113383 |
2010-08-09 04:52:45 | ncoghlan | set | messages:
+ msg113382 |
2010-08-09 04:29:10 | rhettinger | set | status: open -> closed resolution: fixed messages:
+ msg113378
|
2010-08-09 04:10:06 | eric.araujo | set | nosy:
+ eric.araujo
|
2010-08-09 04:08:58 | ncoghlan | set | messages:
+ msg113374 |
2010-08-08 19:24:07 | rhettinger | set | assignee: rhettinger |
2010-08-08 19:22:59 | rhettinger | set | messages:
+ msg113300 versions:
+ Python 3.2, - Python 3.3 |
2010-08-08 15:44:02 | ncoghlan | set | messages:
+ msg113273 |
2010-08-04 09:48:02 | pitrou | set | nosy:
+ rhettinger
|
2010-08-04 09:29:01 | flox | set | nosy:
+ flox
|
2010-07-28 11:27:10 | ncoghlan | create | |