re.match blocking and taking 100% CPU #60634

Closed
SebastienEstienne mannequin opened this issue Nov 7, 2012 · 3 comments

Comments

@SebastienEstienne
Mannequin

SebastienEstienne mannequin commented Nov 7, 2012

BPO 16430
Nosy @mdickinson, @ezio-melotti, @serhiy-storchaka
Superseder
  • bpo-1662581: the re module can perform poorly: O(2**n) versus O(n**2)
Files
  • re_bug.py: Example of the bug
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = <Date 2012-11-07.17:34:32.818>
    created_at = <Date 2012-11-07.17:00:35.683>
    labels = ['expert-regex', 'invalid']
    title = 're.match blocking and taking 100% CPU'
    updated_at = <Date 2012-11-07.17:35:54.298>
    user = 'https://bugs.python.org/SebastienEstienne'

    bugs.python.org fields:

    activity = <Date 2012-11-07.17:35:54.298>
    actor = 'mark.dickinson'
    assignee = 'none'
    closed = True
    closed_date = <Date 2012-11-07.17:34:32.818>
    closer = 'serhiy.storchaka'
    components = ['Regular Expressions']
    creation = <Date 2012-11-07.17:00:35.683>
    creator = 'Sebastien.Estienne'
    dependencies = []
    files = ['27921']
    hgrepos = []
    issue_num = 16430
    keywords = []
    message_count = 3.0
    messages = ['175109', '175111', '175112']
    nosy_count = 5.0
    nosy_names = ['mark.dickinson', 'ezio.melotti', 'mrabarnett', 'serhiy.storchaka', 'Sebastien.Estienne']
    pr_nums = []
    priority = 'normal'
    resolution = 'not a bug'
    stage = 'resolved'
    status = 'closed'
    superseder = '1662581'
    type = None
    url = 'https://bugs.python.org/issue16430'
    versions = ['Python 2.6', 'Python 2.7']

    @SebastienEstienne
    Mannequin Author

    SebastienEstienne mannequin commented Nov 7, 2012

    Hello,

    re.match blocks and takes 100% CPU forever.

    re_bug.py is an example of the bug.

    Thanks

    @SebastienEstienne SebastienEstienne mannequin added the topic-regex label Nov 7, 2012
    @serhiy-storchaka
    Member

    This is not a Python bug.

    Your regexp is wrong. Remove the "$" at the end or add ".*" before the "$".

    It would also be better to use '(?P<date>\S+)\s' and '"(?P<method_uri>[^"]*)"' instead of '(?P<date>.*?)\s' and '"(?P<method_uri>.*?)"'.
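
A minimal sketch of the suggestions above. The attached re_bug.py is not reproduced in this thread, so the log line, the `host` and `status` groups, and the overall pattern layout are hypothetical; only the `date` and `method_uri` groups come from the comment.

```python
import re

# Hypothetical access-log line (an assumption, not the input from re_bug.py).
LINE = 'host 2012-11-07T17:00:35 "GET /index.html HTTP/1.1" 200 extra trailing fields'

# Tightened groups: \S+ cannot cross whitespace and [^"]* cannot cross a quote,
# so each field has only one position where it can end.
TIGHT = re.compile(
    r'(?P<host>\S+)\s'              # hypothetical field
    r'(?P<date>\S+)\s'              # suggested replacement for (?P<date>.*?)\s
    r'"(?P<method_uri>[^"]*)"\s'    # suggested replacement for "(?P<method_uri>.*?)"
    r'(?P<status>\d+)'              # hypothetical field
)

# Without a trailing "$", re.match only needs to match a prefix of the line.
m = TIGHT.match(LINE)
print(m.group('date'), '|', m.group('method_uri'))

# With "$" right after the last group, the trailing fields make the match fail.
ANCHORED = re.compile(TIGHT.pattern + r'$')
print(ANCHORED.match(LINE))

# Adding ".*" before "$" lets the pattern consume the rest of the line.
ANCHORED_REST = re.compile(TIGHT.pattern + r'.*$')
print(ANCHORED_REST.match(LINE) is not None)
```

With the tightened groups, even the failing `ANCHORED` case returns quickly, because there are very few alternative ways to split the line to try before giving up.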

    @mdickinson
    Member

    This is a known issue: there are a good few duplicates in the tracker. Issue bpo-1662581 is one, for example.

    In this particular case, you can probably fix things by tightening up your regex. Part of the problem is that '.*' is going to match any sequence of characters, including spaces. Judicious use of '\S' to match non-whitespace characters might help. There's not much point to the '?' in '.*?', either.
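
To illustrate the point about '.*' matching spaces, here is a small sketch on a made-up input (the string and the group names `a` and `b` are assumptions, not taken from re_bug.py). Because '.*?' may cross spaces, a group can stretch over several fields, which gives the engine many candidate splits to try; '\S' rules those out.

```python
import re

# Hypothetical input: three space-separated fields followed by a literal '-'.
TEXT = 'x y z -'

# '.*?' is allowed to cross spaces, so the second group can grow to 'y z'
# in order to make the overall match succeed.
loose = re.match(r'(?P<a>.*?)\s(?P<b>.*?)\s-$', TEXT)
print(loose.group('a'), '|', loose.group('b'))

# '\S+' cannot cross spaces, so there is only one way to split the fields.
# The input has one field too many, and the match fails without exploring
# alternative splits.
tight = re.match(r'(?P<a>\S+)\s(?P<b>\S+)\s-$', TEXT)
print(tight)
```

On longer lines with more such groups, the number of splits the loose pattern may have to try before failing grows quickly, which is one way a match can appear to hang.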

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022