Title: replace groups doesn't work in this special case
Components: Regular Expressions Versions: Python 2.4
Status: closed Resolution: not a bug
Assigned To: niemeyer Nosy List: georg.brandl, niemeyer, tomek74
Created on 2006-11-06 11:49 by tomek74, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Messages (5)
msg30467 - (view) Author: Thomas K. (tomek74) Date: 2006-11-06 11:49
If you have a regular expression like this:
matching this string:
1 1a
and replacing with this:
you get what is expected:
yx yx

If you replace with this:
you get nothing replaced, because the group \2 
doesn't exist for the pattern "1".
But it does exist for the pattern "1a"!

We have multiple possibilities here:
1.) The string "1" gives no result, because \2 
doesn't exist. The string "1a" gives a result, so the 
output should be: 1a
2.) The string "1" gives a result, because \2 is 
handled like an empty string. The string "1a" gives a 
result, so the output should be: 1 1a

I think the case where the string "1" has no result, 
but affects the string "1a", which would normally have a 
result, is bad.

What are your thoughts on it?
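
Option 2 above can be sketched with a replacement function instead of a template string. This is only an illustration of the idea, not something the re module does for you here: the helper name `wrap` and the angle brackets are made up for the example.

```python
import re

pattern = re.compile(r"([0-9])([a-z])?")

def wrap(match):
    # group(2) is None when the optional group did not take part in the
    # match, so fall back to an empty string (option 2 above).
    return "<%s%s>" % (match.group(1), match.group(2) or "")

# Both "1" and "1a" are replaced; the unmatched group in "1" no longer
# affects the result for "1a".
result = pattern.subn(wrap, "1 1a")
print(result)
```

Because the function decides what to do with a missing group, the same approach would also cover option 1 by returning `match.group(0)` unchanged when `match.group(2)` is None.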

Test code:
import re

# common variables

rawstr = r"""([0-9])([a-z])?"""
embedded_rawstr = r"""([0-9])([a-z])?"""
matchstr = """1 1a"""

# method 1: using a compile object
compile_obj = re.compile(rawstr)
match_obj = compile_obj.search(matchstr)

# method 2: using search function (w/ external flags)
match_obj = re.search(rawstr, matchstr)

# method 3: using search function (w/ embedded flags)
match_obj = re.search(embedded_rawstr, matchstr)

# Retrieve group(s) from match_obj
all_groups = match_obj.groups()

# Retrieve group(s) by index
group_1 = match_obj.group(1)
group_2 = match_obj.group(2)

# Replace string
newstr = compile_obj.subn('\1\2', 0)
msg30468 - (view) Author: Gustavo Niemeyer (niemeyer) * (Python committer) Date: 2006-11-06 12:17
Hello Thomas,

I don't understand exactly what you mean here.

This doesn't work:

  >>> re.compile("([0-9])([a-z])?").subn(r"<\1\2>", "1 1a")
  Traceback (most recent call last):
  sre_constants.error: unmatched group

And this works fine:

  >>> re.compile("([0-9])([a-z]?)").subn(r"<\1\2>", "1 1a")
  ('<1> <1a>', 2)

The example code you provided doesn't run here, because
'subn()' is being given bad data (check the docs). It's also
being passed '\1\2', which is really '\x01\x02', and won't
do what you want.
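
To make the two points above concrete, here is a small sketch reusing the names `compile_obj` and `matchstr` from the test code. The angle brackets in the replacement are only there to make the substitutions visible.

```python
import re

# Without the r prefix, '\1\2' is parsed as two octal escapes, i.e. two
# control characters, not backreferences.
assert '\1\2' == '\x01\x02'
# A raw string keeps the backslashes, so re sees \1 and \2.
assert len(r'\1\2') == 4

compile_obj = re.compile(r"([0-9])([a-z]?)")
matchstr = "1 1a"

# subn(repl, string): the second argument is the text to search,
# not a number like 0.
newstr, count = compile_obj.subn(r"<\1\2>", matchstr)
print(newstr, count)
```

With the "?" inside the parentheses, group 2 always takes part in the match (possibly as an empty string), so the template never refers to an unmatched group.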
msg30469 - (view) Author: Thomas K. (tomek74) Date: 2006-11-07 10:36
I verified your code. It works for me, too.
msg30470 - (view) Author: Thomas K. (tomek74) Date: 2006-11-08 16:56
I have tried it again with my original regexp and the
search string. In this case I have to put the “?” after the “)”.

-> RegEx:
([1-9][a-z][a-z][0-9])([ \-\r\n\t]*([0-9])(([0-9])(([0-9])([
IGNORECASE is switched on.

-> ReplaceString:

-> Searchstring 1):


-> Searchstring 2):
6ES5894-0MA03; 6ES5864-0MA03; 6ES5894-0MA63-0UG5; 6ES58860MA03

NO Result!

-> The problem is that I get no results with searchstring 2.

msg30471 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2006-11-18 19:22
I do get a match with your regex and search string 2.
