
classification
Title: re.sub() library entry does not adequately document surprising change in behavior between versions
Type: behavior
Stage: resolved
Components: Documentation
Versions: Python 3.8, Python 3.7
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: docs@python
Nosy List: berker.peksag, brett.cannon, docs@python, josh.r, miss-islington, mollison, serhiy.storchaka
Priority: normal
Keywords: patch

Created on 2019-04-16 22:42 by mollison, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 12879 merged mollison, 2019-04-19 01:22
PR 12898 merged miss-islington, 2019-04-21 22:15
Messages (9)
msg340370 - (view) Author: (mollison) * Date: 2019-04-16 22:42
This is regarding the change to re.sub() between 3.6 and 3.7 that results in different behavior even for simple cases like the following:

re.sub('a*','b', 'a') returns 'b' in 3.6 and 'bb' in 3.7
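
A minimal sketch of the difference described above, for anyone who wants to reproduce it. The variable names are illustrative only, and the `a+` variant is just one possible way to get the same result on both versions, not something proposed in this issue:

```python
import re
import sys

# 'a*' matches the "a" and then an empty string at the end of the input.
# Python 3.7+ replaces both matches; 3.6 and earlier replaced only the first.
result = re.sub('a*', 'b', 'a')
expected = 'bb' if sys.version_info >= (3, 7) else 'b'
print(result, result == expected)

# 'a+' cannot match the empty string, so the result does not depend
# on how empty matches are handled.
portable = re.sub('a+', 'b', 'a')
print(portable)
```
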

This change is well documented here:
https://docs.python.org/3/whatsnew/3.7.html#changes-in-the-python-api

However, it is not well documented here:
https://docs.python.org/3.7/library/re.html

The latter document does actually contain the appropriate text: "Empty matches for the pattern are replaced when adjacent to a previous non-empty match."
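
To make the quoted sentence concrete, the sketch below lists the matches that `'a*'` finds in `'a'` and then runs the `'x*'` example from the re.sub() documentation. It assumes Python 3.7 or later for the substitution result; the variable names are illustrative:

```python
import re
import sys

# The pattern matches "a" at (0, 1), then an empty string at (1, 1),
# which is adjacent to the previous non-empty match.
spans = [m.span() for m in re.finditer('a*', 'a')]
print(spans)

# Example from the re.sub() documentation. On 3.7+ the empty match right
# after "x" is also replaced, giving two consecutive dashes.
docs_example = re.sub('x*', '-', 'abxd')
if sys.version_info >= (3, 7):
    print(docs_example == '-a-b--d-')
```
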

However, the formatting makes this text look as though it was always there, rather than being part of the "Changed in version 3.7" note.

That is how I interpreted it, leading to some lost productivity.

After so many years, people don't expect the regex engine to change like this, which makes it even easier to misread that text as having always been there rather than being new in 3.7.

Related:
https://bugs.python.org/issue32308
msg340374 - (view) Author: Josh Rosenberg (josh.r) * (Python triager) Date: 2019-04-17 00:49
I believe the second note under "Changed in 3.7" is intended to address this:

> Empty matches for the pattern are replaced when adjacent to a previous non-empty match.

Obviously not as detailed as the What's New entry, but it's there.
msg340376 - (view) Author: (mollison) * Date: 2019-04-17 01:25
You have not understood my message. I know that text is already there.

My point is that because it's in a separate paragraph, it looks like it's not part of Changed in 3.7.

There is nowhere else on the page where a change description is in a separate paragraph from the "Changed in version X.X:" text.

When I came across this I thought, this probably goes with the paragraph before it, right? In other words, it's probably another change from 3.7, just in a different paragraph?

But then I thought, no, they probably wouldn't change something like that (i.e. basic regex stuff) after so many years.

So it's not that obvious.
msg340423 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2019-04-17 17:48
@mollison: would you like to open a PR w/ how you would expect it to be formatted?
msg340447 - (view) Author: (mollison) * Date: 2019-04-17 21:01
@brett.cannon: Yes, I will submit a PR. I have the code ready to go already. I just submitted the Python contributor agreement. I think it will take the system a day or two to register that. Then, I will submit the PR.
msg340519 - (view) Author: (mollison) * Date: 2019-04-19 01:33
@brett.cannon: PR is at https://github.com/python/cpython/pull/12879
msg340622 - (view) Author: Berker Peksag (berker.peksag) * (Python committer) Date: 2019-04-21 22:14
New changeset 5ebfa840a1c9967da299356733da41b532688988 by Berker Peksag (mollison) in branch 'master':
bpo-36645: Fix ambiguous formatting in re.sub() documentation (GH-12879)
https://github.com/python/cpython/commit/5ebfa840a1c9967da299356733da41b532688988
msg340623 - (view) Author: miss-islington (miss-islington) Date: 2019-04-21 22:20
New changeset 71b88827f6ad368eafa17983bd979175d24da888 by Miss Islington (bot) in branch '3.7':
bpo-36645: Fix ambiguous formatting in re.sub() documentation (GH-12879)
https://github.com/python/cpython/commit/71b88827f6ad368eafa17983bd979175d24da888
msg340624 - (view) Author: Berker Peksag (berker.peksag) * (Python committer) Date: 2019-04-21 22:57
Thank you!
History
Date User Action Args
2022-04-11 14:59:14 admin set github: 80826
2019-04-21 22:57:32 berker.peksag set status: open -> closed
versions: + Python 3.8
type: behavior
messages: + msg340624
resolution: fixed
stage: patch review -> resolved
2019-04-21 22:20:47 miss-islington set nosy: + miss-islington
messages: + msg340623
2019-04-21 22:15:10 miss-islington set pull_requests: + pull_request12824
2019-04-21 22:14:50 berker.peksag set nosy: + berker.peksag
messages: + msg340622
2019-04-19 01:33:26 mollison set messages: + msg340519
2019-04-19 01:22:08 mollison set keywords: + patch
stage: patch review
pull_requests: + pull_request12803
2019-04-18 06:24:25 xtreak set nosy: + serhiy.storchaka
2019-04-17 21:01:20 mollison set messages: + msg340447
2019-04-17 17:48:46 brett.cannon set nosy: + brett.cannon
messages: + msg340423
2019-04-17 01:25:00 mollison set messages: + msg340376
2019-04-17 00:49:55 josh.r set nosy: + josh.r
messages: + msg340374
2019-04-16 22:42:59 mollison create