
Author ezio.melotti
Recipients ezio.melotti, verdy_p
Date 2009-10-14.21:36:24
Message-id <1255556185.57.0.645582598879.issue7132@psf.upfronthosting.co.za>
In-reply-to
Content
I'm skeptical about what you are proposing for the following reasons:
1) it doesn't exist in any other implementation that I know of;
2) if implemented as default behavior:
   * it won't be backward-compatible;
   * it will increase the complexity;
3) it will be a proprietary extension and it will reduce the
compatibility with other implementations;
4) I can't think of any real-world situation where this would be really
useful.

Using a flag like re.R to change the behavior might solve issue 2),
but I'll explain why I don't think this is useful.

Let's take a simpler IPv4 address as an example: you may want to use
'^(\d{1,3})(?:\.(\d{1,3})){3}$' to capture the digits (without checking
if they are in range(256)).
This currently only returns:
>>> re.match(r'^(\d{1,3})(?:\.(\d{1,3})){3}$', '192.168.0.1').groups()
('192', '1')

If I understood correctly what you are proposing, you would like it to
return (['192'], ['168', '0', '1']) instead. This will also require an
additional step to join the two lists to get the list with the 4 values.
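
To make that extra step concrete, here is a small sketch. The per-group
lists are not something the re module produces, so the `proposed` tuple
below is written by hand as a hypothetical stand-in for the proposal as I
understand it; only the `current` result comes from real re behavior.

```python
import re

PATTERN = r'^(\d{1,3})(?:\.(\d{1,3})){3}$'

# Current behavior: a repeated group keeps only its last capture.
current = re.match(PATTERN, '192.168.0.1').groups()

# Hypothetical result under the proposal (hand-written, NOT returned by re).
proposed = (['192'], ['168', '0', '1'])

# The additional step: flatten the per-group lists into one list.
octets = [value for group in proposed for value in group]
print(current, octets)
```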

In these situations where some part is repeating, it's usually easier to
use re.findall() or re.split() (or just a plain str.split for simple
cases like this):
>>> addr = '192.168.0.1'
>>> re.findall(r'(?:^|\.)(\d{1,3})', addr)
['192', '168', '0', '1']
>>> re.split(r'\.', addr) # no need to use re.split here, addr.split('.') is enough
['192', '168', '0', '1']

In both examples a single step is enough to get what you want
without changing the existing behavior.

'^(\d{1,3})(?:\.(\d{1,3})){3}$' can still be used to check if the string
has the right "format", before using the other methods to extract the data.
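
A minimal sketch of that two-step approach, assuming a plain str.split is
good enough for the extraction. The names `IPV4_FORMAT` and `split_ipv4`
are made up for this example, and like the pattern above it does not check
that the values are in range(256).

```python
import re

# Same pattern as above, compiled once and written as a raw string.
IPV4_FORMAT = re.compile(r'^(\d{1,3})(?:\.(\d{1,3})){3}$')

def split_ipv4(addr):
    """Return the four digit groups of addr, or None if the format is wrong."""
    # Step 1: use the full pattern only to validate the overall format.
    if IPV4_FORMAT.match(addr) is None:
        return None
    # Step 2: extract the data with a simpler tool.
    return addr.split('.')

print(split_ipv4('192.168.0.1'))
print(split_ipv4('192.168.0'))
```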

So I'm -1 on the whole idea and -0.8 on an additional flag.
Maybe you should discuss this on the python-ideas mailing list.