
Author yetingli
Recipients orsenthil, serhiy.storchaka, vstinner, yetingli
Date 2021-03-14.08:50:53
Message-id <1615711854.11.0.172523061339.issue43075@roundup.psfhosted.org>
Content
Sorry for the delay. I analyzed the performance of the current version '(?:^|,)[ \t]*([^ \t]+)[ \t]+' and the fixed version '(?:^|,)[ \t]*([^ \t,]+)[ \t]+'. I ran each of them ten times against the following HTTP header:

header = '' + ',' * (10 ** 5)

The current version takes about 139.178s to 140.946s per run, while the fixed version takes about 0.006s.

You can reproduce these measurements with the code below.

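    from time import perf_counter

    # Headers, handler, host and req are not defined here; they are assumed
    # to be set up as in the benchmark code this snippet was adapted from.
    header = ',' * (10 ** 5)  # the malicious header shown above
    for _ in range(10):
        begin = perf_counter()
        handler.http_error_auth_reqed("WWW-Authenticate", host, req, Headers(header))
        duration = perf_counter() - begin
        print(f"took {duration} seconds!")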

CVE-2020-8492 was a backtracking problem caused by ambiguity during matching. This issue is different: the malicious string contains no match at all, so the regex engine keeps restarting the match at each successive position of the string, which makes the total running time grow roughly quadratically with the length of the header.
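
The difference can also be observed on the two regex fragments alone, without going through urllib. The following is only a minimal sketch: the names CURRENT_RX, FIXED_RX and time_search are made up for illustration, it calls re.search directly instead of the handler, and it uses much shorter inputs than 10 ** 5 commas so that it finishes quickly.

```python
import re
from time import perf_counter

# The two pattern fragments quoted in this message (not the full urllib regex).
CURRENT_RX = re.compile(r'(?:^|,)[ \t]*([^ \t]+)[ \t]+')
FIXED_RX = re.compile(r'(?:^|,)[ \t]*([^ \t,]+)[ \t]+')


def time_search(rx, text):
    """Return (match, seconds) for a single rx.search(text) call."""
    begin = perf_counter()
    match = rx.search(text)
    return match, perf_counter() - begin


if __name__ == "__main__":
    # Input sizes are an assumption chosen to keep the run short.
    for n in (1000, 2000, 4000):
        header = ',' * n
        _, current_time = time_search(CURRENT_RX, header)
        _, fixed_time = time_search(FIXED_RX, header)
        print(f"n={n}: current {current_time:.4f}s, fixed {fixed_time:.6f}s")
```

If the explanation above is right, doubling n should roughly quadruple the time of the current pattern, while the fixed pattern should stay close to linear, because excluding ',' from the scheme group lets each attempt fail after a single character.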

Because both vulnerabilities are in the same location, I based this on your code. Thanks for the code ;-)!
History
Date                 User      Action  Args
2021-03-14 08:50:54  yetingli  set     recipients: + yetingli, orsenthil, vstinner, serhiy.storchaka
2021-03-14 08:50:54  yetingli  set     messageid: <1615711854.11.0.172523061339.issue43075@roundup.psfhosted.org>
2021-03-14 08:50:54  yetingli  link    issue43075 messages
2021-03-14 08:50:53  yetingli  create