
Author William Budd
Recipients William Budd, ezio.melotti, mrabarnett
Date 2017-06-21.02:38:31
Message-id <1498012712.75.0.088701596169.issue30720@psf.upfronthosting.co.za>
Content
import re

pattern = re.compile('<div>(<p>.*?</p>)</div>', flags=re.DOTALL)

----------------------------------------------------------------

# This works as expected in the following case:

print(re.sub(pattern, '\\1',
             '<div><p>foo</p></div>\n'
             '<div><p>bar</p>123456789</div>\n'))

# which outputs:

<p>foo</p>
<div><p>bar</p>123456789</div>

----------------------------------------------------------------

# However, it does NOT work as I expect in this case:

print(re.sub(pattern, '\\1',
             '<div><p>foo</p>123456789</div>\n'
             '<div><p>bar</p></div>\n'))

# actual output:

<p>foo</p>123456789</div>
<div><p>bar</p>

# expected output:

<div><p>foo</p>123456789</div>
<p>bar</p>

----------------------------------------------------------------

It seems that matching/substitution only goes wrong when the candidate immediately before it turned out not to be a match. Here the first <div> has extra text after </p>, so it should not match, yet the substitution then consumes text from both <div> elements. Maybe some internal variable is not cleaned up properly in an edge(?) case triggered by the example above?
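To narrow this down, it may help to look at what the pattern actually matches in the second example instead of only at the substituted output. The sketch below is an illustrative probe, not part of the original report: the names `text`, `m`, `strict`, and `result_strict` are introduced here for illustration. It shows that with re.DOTALL the lazy `.*?` is allowed to keep growing past the first `</p>` until it finds a `</p>` that is directly followed by `</div>`, so a single match spans both lines. It also sketches one possible stricter pattern, under the assumption that the paragraph body should never contain `</p>`.

```python
import re

pattern = re.compile('<div>(<p>.*?</p>)</div>', flags=re.DOTALL)
text = ('<div><p>foo</p>123456789</div>\n'
        '<div><p>bar</p></div>\n')

# Inspect the single match instead of the substituted output.
m = pattern.search(text)
print(repr(m.group(0)))  # the whole matched span
print(repr(m.group(1)))  # what '\\1' is replaced with

# Hypothetical stricter variant: each character consumed inside <p>...</p>
# must not be the start of '</p>', so the group cannot run past the first
# closing tag.
strict = re.compile('<div>(<p>(?:(?!</p>).)*</p>)</div>', flags=re.DOTALL)
result_strict = strict.sub('\\1', text)
print(result_strict)
```

With the stricter variant the first line is left untouched and only the second `<div>` is unwrapped, which corresponds to the expected output listed above. Lazy quantifiers only prefer the shortest match from a given starting position; they do not forbid longer ones when the shorter attempts fail.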
History
Date                 User          Action  Args
2017-06-21 02:38:32  William Budd  set     recipients: + William Budd, ezio.melotti, mrabarnett
2017-06-21 02:38:32  William Budd  set     messageid: <1498012712.75.0.088701596169.issue30720@psf.upfronthosting.co.za>
2017-06-21 02:38:32  William Budd  link    issue30720 messages
2017-06-21 02:38:31  William Budd  create