r"""
Brief summary:
The expression seems to consume O(c^n) CPU cycles, with c around 2:
each character added after the opening asterisk roughly doubles the run time.

Regex: (\*([^*]+)*\*)
It seems to behave the same way without the outer parentheses.

Evaluated by stuffing digits into "* is there anybody out?".
Resulting times (in seconds):

    time    stuffing
    0.17    (none)    (* is there anybody out?)
    0.34    1
    0.69    12
    1.36    123
    2.73    1234
    5.44    12345
    11.1    123456    (* is there anybody123456 out?)

(These are inaccurate single-run, unrepeated measurements on a warmed-up CPU,
but the trend seems clear. Used Python 3.8.10, x64, Linux build. The same
happens in PyCharm 3.7.something when searching for the regex in a text file
that contains that very line :) )
"""

import re
import time


def test_match(expr, text, multiline: bool = True):
    """Time a single re.match() call and report whether it matched."""
    # re.MULTILINE only changes ^ and $, which this pattern does not use.
    flags = re.MULTILINE if multiline else 0
    # perf_counter() is a monotonic clock meant for measuring intervals.
    start = time.perf_counter()
    match = re.match(expr, text, flags=flags)
    print("Took ", time.perf_counter() - start, "seconds,")
    print("matched" if match else "did not match")


def test():
    # Raw string, so that \* reaches the regex engine as an escaped asterisk.
    expr = r"(\*([^*]+)*\*)"
    text_l = "* is there anybody"
    text_r = " out?"
    stuff = ""
    for i in range(7):
        # Append one more digit per round: "", "1", "12", ..., "123456".
        if i > 0:
            stuff += str(i)
        text = text_l + stuff + text_r
        print("Testing", text)
        test_match(expr, text)


if __name__ == "__main__":
    test()
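

# A possible workaround, sketched under assumptions rather than taken from the
# measurements above. The blow-up most likely comes from the nested quantifier
# ([^*]+)*: once the closing asterisk turns out to be missing, the engine can
# split the same run of n non-asterisk characters into groups in 2^(n-1) ways
# and tries them all before giving up. A single [^*]* accepts the same strings
# but leaves only one way to consume the run.
# LINEAR_EXPR and match_linear are hypothetical names used for illustration.
# Note that group 2 differs slightly from the original pattern: for the input
# "**" it is "" here, while the original leaves it as None.
import re

LINEAR_EXPR = r"(\*([^*]*)\*)"


def match_linear(text):
    """Match text against the rewritten pattern, which has no nested quantifier."""
    return re.match(LINEAR_EXPR, text)


if __name__ == "__main__":
    # No closing asterisk and far more stuffing than the slow runs used.
    print(match_linear("* is there anybody" + "1234567890" * 1000 + " out?"))
    # A line that does have a closing asterisk.
    print(match_linear("*is there anybody out?*"))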