Author WayneD
Recipients WayneD, ezio.melotti, mrabarnett
Date 2020-03-20.17:21:13
Message-id <1584724873.49.0.294512015302.issue40027@roundup.psfhosted.org>
In-reply-to
Content
re.sub() behaves inconsistently when a pattern that ends a '*'-quantified match is anchored at the end of the string: the substitution is now applied twice.  For example:

txt = re.sub(r'\s*\Z', "\n", txt)

This should work like txt.rstrip() + "\n", but beginning in 3.7 the re.sub() version matches twice and turns any non-empty trailing whitespace into "\n\n" instead of "\n". (If there is no trailing whitespace, it matches only once.)
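A minimal sketch of the behavior described above, assuming Python 3.7 or later for the doubled result. The helper names normalize_tail and tail_matches are made up for illustration and are not part of the attached test program:

```python
import re
import sys

def normalize_tail(txt):
    # The substitution from the report: replace trailing whitespace
    # (possibly empty) at the end of the string with one newline.
    return re.sub(r'\s*\Z', "\n", txt)

def tail_matches(txt):
    # List the (start, end) span of every match of the same pattern,
    # to show where the substitutions happen.
    return [m.span() for m in re.finditer(r'\s*\Z', txt)]

if __name__ == "__main__":
    print(sys.version_info[:2])
    print(repr(normalize_tail("foo  ")))  # 'foo\n\n' on 3.7 or later
    print(tail_matches("foo  "))          # the whitespace run, then an empty match at the end
    print(repr(normalize_tail("foo")))    # 'foo\n': a single empty match
```

The first match consumes the trailing whitespace, and a second, empty match is then found at the very end of the string; on 3.7 and later both are replaced.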

The bug is the same if '$' is used instead of '\Z', but it does not happen if the pattern ends in a literal character (e.g. a substitution of r'\s*x' does not substitute twice if x has preceding whitespace).
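The contrast between the anchored and the literal-character patterns can be checked by counting substitutions with re.subn(). This is only a sketch; count_subs is a hypothetical helper, and the counts of 2 assume Python 3.7 or later:

```python
import re
import sys

def count_subs(pattern, txt):
    # re.subn() returns (new_string, number_of_substitutions_made);
    # only the count is of interest here.
    return re.subn(pattern, "\n", txt)[1]

if __name__ == "__main__":
    print(count_subs(r'\s*x', "foo  x"))   # 1: ends in a literal character
    print(count_subs(r'\s*\Z', "foo  "))   # 2 on 3.7 or later
    print(count_subs(r'\s*$', "foo  "))    # 2 on 3.7 or later
    print(count_subs(r'\s*\Z', "foo"))     # 1: no trailing whitespace
```

With a literal character at the end of the pattern, no empty match is possible after the first one, so only one substitution is made.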

I tested 2.7.17, 3.6.9, 3.7.7, 3.8.2, and 3.9.0a4; of these, 3.7.7 is the first to fail, and every later version fails as well.

Attached is a test program.
History
Date                 User    Action  Args
2020-03-20 17:21:13  WayneD  set     recipients: + WayneD, ezio.melotti, mrabarnett
2020-03-20 17:21:13  WayneD  set     messageid: <1584724873.49.0.294512015302.issue40027@roundup.psfhosted.org>
2020-03-20 17:21:13  WayneD  link    issue40027 messages
2020-03-20 17:21:13  WayneD  create