Here are two patches which implement two alternative solutions. They are based on regex code.

The dict copying patch matches the current regex behavior, but other code needs to be modified to avoid a small slowdown. An artificial example:

$ ./python -m timeit -s 'import re; n = 100; m = re.match("".join("(?P<g%d>.)" % g for g in range(n)), "x" * n); t = ",".join(r"\g<g%d>" % g for g in range(n))' -- 'm.expand(t)'

Without patch: 7.48 msec per loop
With re_groupindex_copy.patch but without modifying _expand: 9.61 msec per loop
With re_groupindex_copy.patch and with modifying _expand: 7.41 msec per loop

While the stdlib code can be modified, this patch can cause a small slowdown in some third-party code.
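
To illustrate the kind of third-party code that could be affected, here is a minimal sketch. It assumes the slowdown comes from reading the attribute repeatedly, with every read producing a new copy. PatternLike is a hypothetical stand-in, not the real pattern object, and the hoisting shown is only one possible workaround.

```python
class PatternLike:
    """Hypothetical stand-in for a pattern whose groupindex is copied on access."""

    def __init__(self, mapping):
        self._groupindex = dict(mapping)

    @property
    def groupindex(self):
        # Each attribute access returns a fresh copy of the internal dict.
        return dict(self._groupindex)


pat = PatternLike({"g%d" % i: i + 1 for i in range(100)})
names = ["g%d" % i for i in range(100)]

# Reads the attribute inside the loop: one dict copy per iteration.
per_iteration = [pat.groupindex[name] for name in names]

# Reads the attribute once and reuses the result: a single copy.
index = pat.groupindex
hoisted = [index[name] for name in names]
```

Both loops produce the same result; only the number of copies differs.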

The dict proxying patch has no performance effect, but it is slightly less compatible: some code accepts a dict but not a dict-like object.
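
The compatibility difference between the two approaches can be sketched as follows. This assumes the proxy behaves like types.MappingProxyType, which may not be exactly what the patch uses; the _internal mapping is a hypothetical stand-in for the pattern's internal group table.

```python
import json
from types import MappingProxyType

# Hypothetical stand-in for the internal group-name -> index mapping.
_internal = {"year": 1, "month": 2}

# Copying: the caller gets a real dict and can mutate it without
# affecting the internal mapping.
copied = dict(_internal)
copied["day"] = 3
copy_is_dict = isinstance(copied, dict)
internal_untouched = "day" not in _internal

# Proxying: the caller gets a read-only view that is not a dict.
proxied = MappingProxyType(_internal)
proxy_is_dict = isinstance(proxied, dict)

try:
    proxied["day"] = 3
except TypeError:
    proxy_rejects_writes = True
else:
    proxy_rejects_writes = False

# json.dumps is one example of code that accepts a dict
# but not an arbitrary mapping.
copy_as_json = json.dumps(copied)
try:
    json.dumps(proxied)
except TypeError:
    proxy_serializable = False
else:
    proxy_serializable = True
```

Code that only reads keys works with either object; code that checks for dict specifically, as json.dumps does, works only with the copy.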
