Author gvanrossum
Recipients gvanrossum
Date 2012-10-11.21:30:41
Message-id <1349991042.0.0.875123830426.issue16203@psf.upfronthosting.co.za>
Content
I've noticed a subtle bug in some of our internal code.  Someone wants to ensure that a certain string (e.g. a URL path) matches a certain pattern in its entirety.  They use re.match() with a regex ending in $.  Fine.  Now someone else comes along and modifies the pattern.  Somehow the $ gets lost, or the pattern develops a set of top-level choices that don't all end in $.  And now things that have a valid *prefix* suddenly (and unintentionally) start matching.
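
To make the failure mode concrete, here is a small sketch. The path and both patterns are invented for illustration and are not taken from the internal code mentioned above.

```python
import re

# Hypothetical URL-path patterns, invented for this illustration.
original = r"/users/\d+$"                # anchored: must match to the end
modified = r"/users/\d+$|/groups/\d+"    # new alternative added, its "$" forgotten

path = "/groups/42/delete"

# The original pattern rejects the path outright.
rejected = re.match(original, path)

# The modified pattern accepts it, because "/groups/42" is a valid prefix.
accepted = re.match(modified, path)
prefix = accepted.group()
```

Nothing in the call site changed between the two cases; only the pattern did, which is why this kind of regression is easy to miss in review.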

I think this is a common enough issue and propose a new API: a fullmatch() function and method that work just like the existing match() function and method but also check that the whole input string matches.  This can be implemented slightly awkwardly as follows in user code:

import re

def fullmatch(regex, string, flags=0):
    # Like re.match(), but succeed only if the match reaches the end of the string.
    m = re.match(regex, string, flags)
    if m is not None and m.end() == len(string):
        return m
    return None
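
A quick sketch of how a helper like this behaves, with one caveat. re.match() stops at the first alternative that succeeds, so comparing m.end() with the string length can reject a string that the pattern could have matched in full through a later alternative. Wrapping the pattern in a non-capturing group followed by \Z is one possible way around that. The name fullmatch_anchored is hypothetical, and that variant assumes regex is a plain pattern string, not a compiled pattern.

```python
import re

def fullmatch(regex, string, flags=0):
    # The end()-comparison approach from the message above.
    m = re.match(regex, string, flags)
    if m is not None and m.end() == len(string):
        return m
    return None

def fullmatch_anchored(regex, string, flags=0):
    # Hypothetical variant: let the regex engine enforce the end anchor, so
    # it can backtrack into other alternatives. Assumes regex is a str.
    return re.match(r"(?:%s)\Z" % regex, string, flags)

# A valid prefix no longer counts as a match.
assert fullmatch(r"/groups/\d+", "/groups/42/delete") is None
assert fullmatch(r"/groups/\d+", "/groups/42") is not None

# "a" succeeds first, so the end() check rejects "ab" ...
assert fullmatch(r"a|ab", "ab") is None
# ... while the anchored variant backtracks and tries "ab".
assert fullmatch_anchored(r"a|ab", "ab") is not None
```

The difference only shows up with alternation or non-greedy quantifiers; for simple patterns the two helpers agree.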

(The corresponding method will have to be somewhat more complex because the underlying match() method takes optional pos and endpos arguments.)
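
As a rough sketch of that extra bookkeeping, the check has to compare the end of the match against the effective end of the searched region, not len(string). This is written as a free function with a hypothetical name, pattern_fullmatch, and it assumes that an out-of-range endpos is clamped to the string bounds, the way match() is understood to treat it.

```python
import re

def pattern_fullmatch(pattern, string, pos=0, endpos=None):
    # Hypothetical helper mirroring pattern.match(string, pos, endpos).
    if endpos is None:
        endpos = len(string)
    # Assumption: match() clamps endpos into the range [0, len(string)].
    effective_end = max(0, min(endpos, len(string)))
    m = pattern.match(string, pos, endpos)
    if m is not None and m.end() == effective_end:
        return m
    return None

digits = re.compile(r"\d+")

# Matches "123" exactly within the slice [2:5].
inside = pattern_fullmatch(digits, "ab123cd", 2, 5)

# Without endpos the region runs to the end of the string, so "cd" is left over.
leftover = pattern_fullmatch(digits, "ab123cd", 2)

# An oversized endpos is treated as the end of the string.
clamped = pattern_fullmatch(digits, "ab123", 2, 100)
```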