
Author mattchaput
Recipients mattchaput
Date 2011-08-31.17:29:33
Message-id <1314811774.34.0.509713448629.issue12870@psf.upfronthosting.co.za>
In-reply-to
Content
Several times in the recent past I've wished for the following methods on the compiled regular expression object. They would let me speed up search and parsing code by limiting the number of regex matches I need to try.

literal_prefix(): Returns any literal string at the start of the pattern (before any "special" parts). E.g., for the pattern "ab(c|d)ef" the method would return "ab". For the pattern "abc|def" the method would return "". When matching a regex against keys in a btree, this would let me limit the search to just the range of keys with the prefix.
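To illustrate the btree use case, here is a minimal sketch that uses a sorted list and `bisect` as a stand-in for a btree. Since `literal_prefix()` does not exist in `re`, the prefix is passed in by hand; the function name `match_keys_with_prefix` and the sample keys are made up for the example.

```python
import re
from bisect import bisect_left

def match_keys_with_prefix(sorted_keys, pattern, prefix):
    """Return keys matching `pattern`, testing only keys that start with `prefix`.

    `prefix` is assumed to be what the proposed literal_prefix() would
    return for `pattern`; here the caller supplies it manually.
    """
    regex = re.compile(pattern)
    matches = []
    # Jump straight to the first key that could carry the prefix.
    i = bisect_left(sorted_keys, prefix)
    while i < len(sorted_keys) and sorted_keys[i].startswith(prefix):
        if regex.match(sorted_keys[i]):
            matches.append(sorted_keys[i])
        i += 1
    return matches

# Hypothetical sample data standing in for btree keys.
keys = sorted(["aardvark", "abcef", "abdef", "abxef", "acdef", "zebra"])

# "ab(c|d)ef" has the literal prefix "ab", so only the "ab..." keys are tested.
with_prefix = match_keys_with_prefix(keys, "ab(c|d)ef", "ab")

# "abc|def" has no literal prefix, so "" forces a scan of every key.
without_prefix = match_keys_with_prefix(keys, "abc|def", "")
```

With a non-empty prefix the regex is tried only on the contiguous run of keys sharing that prefix; with an empty prefix the scan degrades to testing every key, which is the cost the method is meant to avoid.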

first_chars(): Returns a string/list/set/whatever of the possible first characters that could appear at the start of a matching string. E.g. for the pattern "ab(c|d)ef" the method would return "a". For the pattern "[a-d]ef" the method would return "abcd". When parsing a string with regexes, this would let me only have to test the regexes that could match at the current character.
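The parsing use case could look roughly like the sketch below: a dispatch table keyed on the current character, so only regexes that can start with that character are tried. Because `first_chars()` is not available in `re`, the first-character sets are written out by hand; the rule names, `build_dispatch`, and `tokenize` are hypothetical.

```python
import re
from collections import defaultdict

# (name, pattern, first characters). The third field is what the proposed
# first_chars() would return; it is filled in manually for this sketch.
RULES = [
    ("NUMBER", r"[0-9]+", "0123456789"),
    ("SHORT", r"[a-d]ef", "abcd"),
    ("LONG", r"ab(c|d)ef", "a"),
]

def build_dispatch(rules):
    """Map each possible first character to the regexes that can start with it."""
    table = defaultdict(list)
    for name, pattern, first in rules:
        compiled = re.compile(pattern)
        for ch in first:
            table[ch].append((name, compiled))
    return table

def tokenize(text, table):
    """Scan `text`, trying only the regexes registered for the current character."""
    tokens = []
    pos = 0
    while pos < len(text):
        for name, regex in table.get(text[pos], ()):
            m = regex.match(text, pos)
            if m:
                tokens.append((name, m.group()))
                pos = m.end()
                break
        else:
            # No candidate regex matched here; skip this character.
            pos += 1
    return tokens

table = build_dispatch(RULES)
tokens = tokenize("abcef 42 def", table)
```

At the character "4" only the NUMBER regex is attempted, and at a space none are, which is the saving the request describes.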

As long as you're making a new regex package, I thought I'd put in a request for these :)
History
Date User Action Args
2011-08-31 17:29:34 mattchaput set recipients: + mattchaput
2011-08-31 17:29:34 mattchaput set messageid: <1314811774.34.0.509713448629.issue12870@psf.upfronthosting.co.za>
2011-08-31 17:29:33 mattchaput link issue12870 messages
2011-08-31 17:29:33 mattchaput create