Author Nikker
Recipients BMintern, Nikker, effbot, ezio.melotti, gerardjp, mchaput, mrabarnett, nneonneo, terry.reedy, timehorse
Date 2012-03-15.22:02:48
SpamBayes Score 3.91262e-07
Marked as misclassified No
Message-id <1331848969.61.0.13393632844.issue1519638@psf.upfronthosting.co.za>
In-reply-to
Content
I'm having the same problem as the original author of this issue.  The workaround does not apply when the captured group is one branch of an alternation (an "or" grouping) rather than simply being optional.

I'm trying to remove parenthesized groups of text at the end of a string, but if the content of a pair of parentheses is a number, I want to keep that number.  My regular expression is the pattern used in the calls below.

These work:
>>> re.sub(r'(?:\((?:(\d+)|.*?)\)\s*)+$','\\1','avatar (2009)')
'avatar 2009'
>>> re.sub(r'(?:\((?:(\d+)|.*?)\)\s*)+$','\\1','avatar (2009) (special edition)')
'avatar 2009'

This doesn't:
>>> re.sub(r'(?:\((?:(\d+)|.*?)\)\s*)+$','\\1','avatar (special edition)')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.6/re.py", line 151, in sub
    return _compile(pattern, 0).sub(repl, string, count)
  File "/usr/lib/python2.6/re.py", line 278, in filter
    return sre_parse.expand_template(template, match)
  File "/usr/lib/python2.6/sre_parse.py", line 793, in expand_template
    raise error, "unmatched group"
sre_constants.error: unmatched group


Is there some way I can apply this workaround to this situation?
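
One possible approach, offered as a sketch rather than a confirmed fix for this issue: re.sub also accepts a function as the replacement, so the case where group 1 did not take part in the match can be handled explicitly instead of going through the '\\1' template.  The names TRAILING_PARENS and strip_trailing_parens are made up for this illustration; the pattern is the one from the calls above.

```python
import re

# Same pattern as in the calls above (the constant name is hypothetical).
TRAILING_PARENS = re.compile(r'(?:\((?:(\d+)|.*?)\)\s*)+$')

def strip_trailing_parens(title):
    """Drop trailing parenthesized text, keeping a captured number if any."""
    # m.group(1) is None when no numeric group was captured, so fall back
    # to an empty string instead of expanding a '\\1' template.
    return TRAILING_PARENS.sub(lambda m: m.group(1) or '', title)

print(repr(strip_trailing_parens('avatar (2009)')))
print(repr(strip_trailing_parens('avatar (special edition)')))
```

With this sketch the all-text case should no longer raise "unmatched group"; it leaves the trailing space before the removed parentheses in place, so a further .rstrip() may be wanted.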
History
Date User Action Args
2012-03-15 22:02:50 Nikker set recipients: + Nikker, effbot, terry.reedy, mchaput, nneonneo, timehorse, BMintern, ezio.melotti, mrabarnett, gerardjp
2012-03-15 22:02:49 Nikker set messageid: <1331848969.61.0.13393632844.issue1519638@psf.upfronthosting.co.za>
2012-03-15 22:02:49 Nikker link issue1519638 messages
2012-03-15 22:02:48 Nikker create