Author ezio.melotti
Recipients MizardX, ezio.melotti, moreati, mrabarnett, serhiy.storchaka, timehorse
Date 2014-08-01.12:51:00
Message-id <1406897460.14.0.659410636273.issue9529@psf.upfronthosting.co.za>
In-reply-to
Content
See also #19536.

I still think that if we do something about these issues, we should try to be compatible with the regex module.

If we are going to add support for both iterability and __getitem__, they should be consistent, so that list(m) == [m[0], m[1], ..., m[N]].
This means that m[0] should be equal to m.group(0), rather than m.group(1).
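
As a rough sketch of that consistency requirement, indexing, length and iteration can all be defined in terms of m.group(). MatchView is a hypothetical wrapper name used only for illustration; it is not part of re or regex, and this is not a proposed implementation:

```python
import re


class MatchView:
    """Hypothetical wrapper around a match object (illustration only)."""

    def __init__(self, match):
        self._match = match

    def __len__(self):
        # Group 0 (the whole match) plus every capturing group.
        return self._match.re.groups + 1

    def __getitem__(self, index):
        # Delegate to group(), so m[0] is the whole match;
        # group() raises IndexError for a group that does not exist.
        return self._match.group(index)

    def __iter__(self):
        # Yield exactly m[0], m[1], ..., m[N].
        return (self._match.group(i) for i in range(len(self)))


m = MatchView(re.match(r'([^:]+): (.*)', 'foo: bar'))
```

With this definition list(m) and [m[i] for i in range(len(m))] are the same list by construction.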

Currently the Match object of the regex module supports __getitem__ (with m[0] == m.group(0)) but is not iterable:
>>> m = regex.match('([^:]+): (.*)', 'foo: bar')
>>> m[0], m[1], m[2]
('foo: bar', 'foo', 'bar')
>>> len(m)
3
>>> list(m)
TypeError: '_regex.Match' object is not iterable

I can see different possible solutions:
1) do what regex does, have m[X] == m.group(X) and live with m[0] == m.group(0) (this means that unpacking will be "_, key, value = m");
2) have m[0] == m.group(1), which makes unpacking easier, but is inconsistent with both m.group() and with what regex does; *
3) disregard regex compatibility and implement what we think is best.


* since regex already has a few incompatibilities with re, a global flag/function could be added to regex to make it behave like the re module (where possible).  If necessary, the re module could also include and ignore a similar flag/function.  This would make interoperability between the two easier.
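
To make the unpacking difference between options 1 and 2 concrete, here is a small sketch using only the stdlib re module. The lists option1 and option2 are stand-ins I am assuming for what iterating over a match would yield under each option:

```python
import re

m = re.match(r'([^:]+): (.*)', 'foo: bar')

# Option 1: the sequence starts at group 0, so unpacking needs a placeholder.
option1 = [m.group(i) for i in range(m.re.groups + 1)]
_, key, value = option1

# Option 2: the sequence starts at group 1, which matches what m.groups() returns.
option2 = list(m.groups())
key2, value2 = option2
```

Both forms give the same key and value; the only difference is the extra "_" needed when group 0 is included.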
History
Date User Action Args
2014-08-01 12:51:00 ezio.melotti set recipients: + ezio.melotti, timehorse, mrabarnett, moreati, MizardX, serhiy.storchaka
2014-08-01 12:51:00 ezio.melotti set messageid: <1406897460.14.0.659410636273.issue9529@psf.upfronthosting.co.za>
2014-08-01 12:51:00 ezio.melotti link issue9529 messages
2014-08-01 12:51:00 ezio.melotti create