Please document fnmatch LRU cache size (256) and suggest alternatives #86965

Closed
joshtriplett mannequin opened this issue Dec 31, 2020 · 5 comments
Labels
3.9 (only security fixes), stdlib (Python modules in the Lib dir), type-bug (An unexpected behavior, bug, or error)

Comments

@joshtriplett

joshtriplett mannequin commented Dec 31, 2020

BPO 42799
Nosy @rhettinger, @ambv, @akulakov
PRs
  • bpo-42799: fnmatch module: bump up size of lru_cache for patterns #27084

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = None
    closed_at = <Date 2021-07-15.10:54:22.305>
    created_at = <Date 2020-12-31.22:14:20.662>
    labels = ['type-bug', 'library', '3.9']
    title = 'Please document fnmatch LRU cache size (256) and suggest alternatives'
    updated_at = <Date 2021-07-15.10:54:22.304>
    user = 'https://bugs.python.org/joshtriplett'

    bugs.python.org fields:

    activity = <Date 2021-07-15.10:54:22.304>
    actor = 'lukasz.langa'
    assignee = 'none'
    closed = True
    closed_date = <Date 2021-07-15.10:54:22.305>
    closer = 'lukasz.langa'
    components = ['Library (Lib)']
    creation = <Date 2020-12-31.22:14:20.662>
    creator = 'joshtriplett'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 42799
    keywords = ['patch']
    message_count = 5.0
    messages = ['384141', '384147', '397252', '397537', '397539']
    nosy_count = 4.0
    nosy_names = ['rhettinger', 'joshtriplett', 'lukasz.langa', 'andrei.avk']
    pr_nums = ['27084']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue42799'
    versions = ['Python 3.9']

    @joshtriplett

    joshtriplett mannequin commented Dec 31, 2020

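    fnmatch translates shell patterns to regexes and keeps the compiled results in an LRU cache of 256 entries. The documentation doesn't mention the cache size; it says only, "They cache the compiled regular expressions for speed." Without knowing the limit, it's easy to get pathologically bad performance: once a program cycles through more than 256 distinct patterns, entries are evicted before they are reused, so the same patterns get translated and compiled over and over.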
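
The failure mode described above can be reproduced in miniature with `functools.lru_cache` alone. This is a rough sketch, not fnmatch's internal code: `make_cached_compiler`, `count_hits_and_misses`, the cache sizes of 4 and 8, and the five patterns are all stand-ins chosen to keep the numbers small.

```python
import fnmatch
import functools
import re


def make_cached_compiler(maxsize):
    """Return a pattern compiler wrapped in an LRU cache of the given size."""

    @functools.lru_cache(maxsize=maxsize)
    def compile_glob(pattern):
        # Translate the shell pattern to a regex and compile it (the costly step).
        return re.compile(fnmatch.translate(pattern))

    return compile_glob


def count_hits_and_misses(maxsize, patterns, rounds):
    """Cycle through the patterns `rounds` times and report cache statistics."""
    compile_glob = make_cached_compiler(maxsize)
    for _ in range(rounds):
        for pattern in patterns:
            compile_glob(pattern)
    info = compile_glob.cache_info()
    return info.hits, info.misses


patterns = [f"*.ext{i}" for i in range(5)]

# Cache smaller than the working set: each lookup evicts an entry that
# is needed again before it can be reused.
too_small = count_hits_and_misses(maxsize=4, patterns=patterns, rounds=3)

# Cache at least as large as the working set: only the first round misses.
big_enough = count_hits_and_misses(maxsize=8, patterns=patterns, rounds=3)

print(too_small, big_enough)  # (0, 15) (10, 5)
```

With five patterns cycled through a four-entry cache, every one of the 15 lookups is a miss. Being one entry short is enough to lose all benefit of the cache.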

    Please consider adding documentation of the cache size to the module documentation for fnmatch, along with a suggestion to use fnmatch.translate directly if you have more patterns than that.
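
One way to read the suggestion above is to translate and compile each pattern once and hold on to the results yourself, so no shared cache limit applies. The sketch below assumes that reading; `compile_patterns`, `matching_patterns`, and the `report_*` patterns are hypothetical names made up for illustration.

```python
import fnmatch
import re


def compile_patterns(patterns):
    """Translate and compile each shell pattern exactly once."""
    return {p: re.compile(fnmatch.translate(p)) for p in patterns}


def matching_patterns(name, compiled):
    """Return the patterns that match `name`, using the precompiled regexes."""
    # fnmatch.translate() anchors the end of the regex itself, so match()
    # (which anchors the start) tests the whole name.
    return [p for p, rx in compiled.items() if rx.match(name)]


# Far more patterns than a 256-entry cache could hold.
patterns = [f"report_{i}_*.csv" for i in range(1000)]
compiled = compile_patterns(patterns)

print(matching_patterns("report_42_final.csv", compiled))  # ['report_42_*.csv']
```

Note that this matches case-sensitively, like `fnmatch.fnmatchcase()`. `fnmatch.fnmatch()` additionally normalizes case on platforms where filenames are case-insensitive, which this sketch does not attempt.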

    @joshtriplett joshtriplett mannequin added the 3.9 (only security fixes) and stdlib (Python modules in the Lib dir) labels Dec 31, 2020
    @rhettinger

    In addition to documenting the cache size, consider a substantial increase to the limit. Compiled regex patterns tend to be very small. We can afford to have a lot of them (tens of thousands seems reasonable to me).

    Regarding the suggested alternative, it seems to me that calling translate() directly doesn't help much. On a cache miss, the overhead of lru_cache() is very small relative to the work done by translate().

    @akulakov

    I've put up the PR here: https://github.com/python/cpython/pull/27084/files

    @ambv

    ambv commented Jul 15, 2021

    New changeset b39eea0 by andrei kulakov in branch 'main':
    bpo-42799: fnmatch module: bump up size of lru_cache for patterns (GH-27084)

    @ambv

    ambv commented Jul 15, 2021

    Thanks! ✨ 🍰 ✨

    @ambv ambv closed this as completed Jul 15, 2021
    @ambv ambv added the type-bug (An unexpected behavior, bug, or error) label Jul 15, 2021
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022