Title: Please document fnmatch LRU cache size (256) and suggest alternatives
Components: Library (Lib) Versions: Python 3.9
Nosy List: andrei.avk, joshtriplett, lukasz.langa, rhettinger
Created on 2020-12-31 22:14 by joshtriplett, last changed 2022-04-11 14:59 by admin.

PR 27084 merged andrei.avk, 2021-07-11 00:00
msg384141 - (view) Author: Josh Triplett (joshtriplett) Date: 2020-12-31 22:14
fnmatch translates shell patterns to regexes, using an LRU cache of 256 elements. The documentation doesn't mention the cache size; it says only "They cache the compiled regular expressions for speed." Without this knowledge, it's possible to get pathologically bad performance by exceeding the cache size.

Please consider adding documentation of the cache size to the module documentation for fnmatch, along with a suggestion to use fnmatch.translate directly if you have more patterns than that.
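
A minimal sketch of that suggestion: translate and compile each pattern once, then keep the compiled regexes yourself so that matching never depends on fnmatch's internal cache. The helper names `compile_patterns` and `matching_patterns` and the pattern set are hypothetical, chosen only for illustration. Matching here is case-sensitive, as with `fnmatch.fnmatchcase`.

```python
import fnmatch
import re


def compile_patterns(patterns):
    """Translate each shell pattern to a regex and compile it exactly once."""
    return {pat: re.compile(fnmatch.translate(pat)) for pat in patterns}


def matching_patterns(name, compiled):
    """Return the patterns that match name, using only the precompiled regexes."""
    return [pat for pat, rx in compiled.items() if rx.match(name)]


# Hypothetical pattern set, larger than the 256-entry cache discussed above.
patterns = [f"report_{i}_*.txt" for i in range(1000)]
compiled = compile_patterns(patterns)

print(matching_patterns("report_42_final.txt", compiled))
```

The dictionary holds every compiled regex for as long as the caller keeps it, so memory grows with the number of patterns, but no pattern is ever translated twice.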
msg384147 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2021-01-01 01:56
In addition to documenting the cache size, consider a substantial increase to the limit. Compiled regex patterns tend to be very small. We can afford to have a lot of them (tens of thousands seems reasonable to me).
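
To illustrate why the limit matters, here is a rough sketch that wraps `fnmatch.translate` and `re.compile` in a `functools.lru_cache` of configurable size and cycles through one more pattern than a 256-entry cache can hold. The wrapper `make_cached_compiler` is a hypothetical stand-in and not fnmatch's actual internals, and the larger size of 1024 is an arbitrary choice for the comparison.

```python
import fnmatch
import functools
import re


def make_cached_compiler(maxsize):
    """Build a hypothetical stand-in for a cached pattern-compile step."""
    @functools.lru_cache(maxsize=maxsize)
    def compile_pattern(pat):
        return re.compile(fnmatch.translate(pat))
    return compile_pattern


def stats_after_cycling(maxsize, n_patterns, rounds=3):
    """Cycle through n_patterns distinct patterns; return (hits, misses)."""
    compile_pattern = make_cached_compiler(maxsize)
    patterns = [f"*.ext{i}" for i in range(n_patterns)]
    for _ in range(rounds):
        for pat in patterns:
            compile_pattern(pat)
    info = compile_pattern.cache_info()
    return info.hits, info.misses


# One pattern too many for a 256-entry LRU cache: every lookup evicts the
# entry that will be needed next, so nothing is ever served from the cache.
print(stats_after_cycling(maxsize=256, n_patterns=257))   # (0, 771)

# With room for all patterns, only the first round misses.
print(stats_after_cycling(maxsize=1024, n_patterns=257))  # (514, 257)
```

Cyclic access is the worst case for an LRU policy, which is why exceeding the limit by a single pattern is enough to drop the hit rate to zero in this sketch.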

Regarding the suggested alternative, ISTM that calling translate() directly doesn't help much.  For a cache miss, the overhead of the lru_cache() is very small relative to the work done by translate().
msg397252 - (view) Author: Andrei Kulakov (andrei.avk) * (Python triager) Date: 2021-07-11 00:01
I've put up the PR: GH-27084.
msg397537 - (view) Author: Łukasz Langa (lukasz.langa) * (Python committer) Date: 2021-07-15 10:53
New changeset b39eea06d148887dd91a3612febafbddda760593 by andrei kulakov in branch 'main':
bpo-42799: fnmatch module: bump up size of lru_cache for patterns (GH-27084)
msg397539 - (view) Author: Łukasz Langa (lukasz.langa) * (Python committer) Date: 2021-07-15 10:54
Thanks! ✨ 🍰 ✨
