classification
Title: Please document fnmatch LRU cache size (256) and suggest alternatives
Type: behavior
Stage: resolved
Components: Library (Lib)
Versions: Python 3.9
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: andrei.avk, joshtriplett, lukasz.langa, rhettinger
Priority: normal
Keywords: patch

Created on 2020-12-31 22:14 by joshtriplett, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Pull Requests
PR 27084 (merged), linked by andrei.avk on 2021-07-11 00:00
Messages (5)
msg384141 - Author: Josh Triplett (joshtriplett) Date: 2020-12-31 22:14
fnmatch translates shell patterns to regexes and keeps the compiled results in an LRU cache of 256 entries. The documentation doesn't mention the cache size; it says only "They cache the compiled regular expressions for speed." Without knowing the limit, it's possible to get pathologically bad performance by using more patterns than the cache can hold.
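
A minimal sketch of the thrashing case described above. The helper names `make_cached_compiler` and `hit_count` are hypothetical; the wrapper only stands in for a size-limited LRU cache in front of pattern compilation and is not fnmatch's actual internal code. It cycles through one more pattern than the cache holds and reads the hit count from `cache_info()`.

```python
import fnmatch
import functools
import re


def make_cached_compiler(maxsize):
    """Build a pattern compiler behind an LRU cache of the given size.

    Hypothetical stand-in for a bounded cache; not fnmatch's own code.
    """
    @functools.lru_cache(maxsize=maxsize)
    def compile_pattern(pat):
        return re.compile(fnmatch.translate(pat))
    return compile_pattern


def hit_count(maxsize, n_patterns, rounds=3):
    """Cycle through n_patterns distinct patterns and count cache hits."""
    compiler = make_cached_compiler(maxsize)
    patterns = [f"*.ext{i}" for i in range(n_patterns)]
    for _ in range(rounds):
        for pat in patterns:
            compiler(pat)
    return compiler.cache_info().hits


# One pattern too many: each lookup evicts the entry needed next.
thrashing = hit_count(maxsize=256, n_patterns=257)
# Exactly as many patterns as slots: every round after the first hits.
fitting = hit_count(maxsize=256, n_patterns=256)
print(thrashing, fitting)
```

With a strict LRU policy, looping over 257 patterns in a fixed order never finds an entry still in a 256-slot cache, so every lookup pays the full translate-and-compile cost.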

Please consider adding documentation of the cache size to the module documentation for fnmatch, along with a suggestion to use fnmatch.translate directly if you have more patterns than that.
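
A hedged sketch of what that suggestion could look like in practice: translate and compile each pattern once, then keep the compiled objects yourself so that no size-limited cache is involved. The helper names `compile_patterns` and `matching_patterns` are hypothetical. Note that this matches case-sensitively, like `fnmatch.fnmatchcase`, because it skips the `os.path.normcase` step that `fnmatch.fnmatch` applies. Calling `translate()` again on every match would not avoid the repeated work; the gain comes only from reusing the compiled regexes.

```python
import fnmatch
import re


def compile_patterns(patterns):
    """Translate each shell pattern to a regex and compile it once."""
    return [re.compile(fnmatch.translate(pat)) for pat in patterns]


def matching_patterns(name, compiled):
    """Return the indices of the compiled patterns that match name."""
    return [i for i, rx in enumerate(compiled) if rx.match(name)]


# Far more patterns than a 256-entry cache could hold.
patterns = [f"file_{i}_*.txt" for i in range(1000)]
compiled = compile_patterns(patterns)

hits = matching_patterns("file_42_report.txt", compiled)
print(hits)
```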
msg384147 - Author: Raymond Hettinger (rhettinger) (Python committer) Date: 2021-01-01 01:56
In addition to documenting the cache size, consider a substantial increase to the limit. Compiled regex patterns tend to be very small. We can afford to have a lot of them (tens of thousands seems reasonable to me).

Regarding the suggested alternative, it seems to me that calling translate() directly doesn't help much. For a cache miss, the overhead of the lru_cache() is very small relative to the work done by translate().
msg397252 - Author: Andrei Kulakov (andrei.avk) (Python triager) Date: 2021-07-11 00:01
I've put up the PR here: https://github.com/python/cpython/pull/27084/files
msg397537 - Author: Łukasz Langa (lukasz.langa) (Python committer) Date: 2021-07-15 10:53
New changeset b39eea06d148887dd91a3612febafbddda760593 by andrei kulakov in branch 'main':
bpo-42799: fnmatch module: bump up size of lru_cache for patterns (GH-27084)
https://github.com/python/cpython/commit/b39eea06d148887dd91a3612febafbddda760593
msg397539 - Author: Łukasz Langa (lukasz.langa) (Python committer) Date: 2021-07-15 10:54
Thanks! ✨ 🍰 ✨
History
Date User Action Args
2022-04-11 14:59:39 admin set github: 86965
2021-07-15 10:54:22 lukasz.langa set status: open -> closed; type: behavior; messages: + msg397539; resolution: fixed; stage: patch review -> resolved
2021-07-15 10:53:29 lukasz.langa set nosy: + lukasz.langa; messages: + msg397537
2021-07-11 00:01:43 andrei.avk set messages: + msg397252
2021-07-11 00:00:25 andrei.avk set keywords: + patch; nosy: + andrei.avk; pull_requests: + pull_request25632; stage: patch review
2021-01-01 01:56:16 rhettinger set nosy: + rhettinger; messages: + msg384147
2020-12-31 22:14:20 joshtriplett create