Message322771
I am experiencing an issue with the following regex when using re.finditer:
(?=<(?P<tag>\w+)/?>(?:(?P<text>.+?)</(?P=tag)>)?)
(I know it's not the best method of dealing with HTML, and this is a simplified version)
For example:
[m.groupdict() for m in re.finditer(r"(?=<(?P<tag>\w+)/?>(?:(?P<text>.+?)</(?P=tag)>)?)", "<test><foo2/></test>")]
In Python 2.7, 3.5, and 3.6 it returns
[{'tag': 'test', 'text': '<foo2/>'}, {'tag': 'foo2', 'text': None}]
But starting with 3.7 it returns
[{'tag': 'test', 'text': '<foo2/>'}, {'tag': 'foo2', 'text': '<foo2/>'}]
The "text" group appears to be a copy of the previous "text" group.
Some other examples:
"<test>Hello</test><foo/>"
  returns:  [{'tag': 'test', 'text': 'Hello'}, {'tag': 'foo', 'text': 'Hello'}]
  expected: [{'tag': 'test', 'text': 'Hello'}, {'tag': 'foo', 'text': None}]
"<test>Hello</test><foo/><foo/>"
  returns:  [{'tag': 'test', 'text': 'Hello'}, {'tag': 'foo', 'text': 'Hello'}, {'tag': 'foo', 'text': None}]
  expected: [{'tag': 'test', 'text': 'Hello'}, {'tag': 'foo', 'text': None}, {'tag': 'foo', 'text': None}]
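For anyone who wants to check their own interpreter, here is a minimal sketch that runs the three inputs from this report and compares the result with the expected output. The names PATTERN, CASES, scan and scan_fresh are made up for illustration. scan_fresh is only a possible alternative: it calls match() at each offset instead of using finditer, on the assumption that a separate match() call does not carry group state over from an earlier match. I have not confirmed that it avoids the problem on 3.7.

```python
import re

# The regex from this report, compiled once.
PATTERN = re.compile(r"(?=<(?P<tag>\w+)/?>(?:(?P<text>.+?)</(?P=tag)>)?)")

# Inputs and expected results copied from the report.
CASES = {
    "<test><foo2/></test>": [
        {"tag": "test", "text": "<foo2/>"},
        {"tag": "foo2", "text": None},
    ],
    "<test>Hello</test><foo/>": [
        {"tag": "test", "text": "Hello"},
        {"tag": "foo", "text": None},
    ],
    "<test>Hello</test><foo/><foo/>": [
        {"tag": "test", "text": "Hello"},
        {"tag": "foo", "text": None},
        {"tag": "foo", "text": None},
    ],
}


def scan(string):
    """The call from the report: one groupdict per finditer match."""
    return [m.groupdict() for m in PATTERN.finditer(string)]


def scan_fresh(string):
    """Hypothetical alternative: try match() at every offset.

    Assumption (not verified on 3.7): each match() call starts with
    clean group state, so 'text' cannot leak from an earlier match.
    """
    results = []
    for pos in range(len(string)):
        m = PATTERN.match(string, pos)
        if m:
            results.append(m.groupdict())
    return results


if __name__ == "__main__":
    for string, expected in CASES.items():
        status = "ok" if scan(string) == expected else "MISMATCH"
        print(status, repr(string), scan(string))
```

If a line is reported as MISMATCH, comparing scan(string) with scan_fresh(string) for the same input should show whether the stale "text" value only appears when going through finditer.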
Date | User | Action | Args
2018-07-31 13:11:05 | beardypig | set | recipients: + beardypig, ezio.melotti, mrabarnett
2018-07-31 13:11:05 | beardypig | set | messageid: <1533042665.14.0.56676864532.issue34294@psf.upfronthosting.co.za>
2018-07-31 13:11:05 | beardypig | link | issue34294 messages
2018-07-31 13:11:05 | beardypig | create |