This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: case_insensitive kwarg in str.replace()
Type: enhancement
Stage: resolved
Components: Parser
Versions:

process
Status: closed
Resolution: rejected
Dependencies:
Superseder:
Assigned To:
Nosy List: eric.smith, lys.nikolaou, nimit.grover24, rhettinger, serhiy.storchaka
Priority: normal
Keywords:

Created on 2021-07-29 17:10 by nimit.grover24, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Messages (4)
msg398501 - Author: Nimboss (nimit.grover24) Date: 2021-07-29 17:10
Currently str.replace() takes 3 arguments: old, new and count.
This issue proposes adding another argument, case_insensitive (a bool defaulting to False), which determines whether case is ignored when replacing the given text.

Currently we have to use a regex or custom logic (see https://stackoverflow.com/questions/919056/case-insensitive-replace), so a case_insensitive kwarg would be a nice quality-of-life feature to have.
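
For reference, the regex workaround mentioned above can be sketched roughly as follows. The helper name replace_ignore_case is made up for illustration and is not part of the str API; it only assumes the standard re module.

```python
import re

def replace_ignore_case(text, old, new, count=0):
    """Hypothetical helper: replace `old` with `new`, ignoring case."""
    # re.escape keeps characters such as '.' or '*' in `old` literal.
    pattern = re.compile(re.escape(old), re.IGNORECASE)
    # A callable replacement avoids backslash processing in `new`.
    # count=0 means "replace all occurrences" for re, unlike str.replace's -1.
    return pattern.sub(lambda match: new, text, count=count)

print(replace_ignore_case('I want a hIPpo for my birthday', 'hippo', 'giraffe'))
# I want a giraffe for my birthday
```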
msg398520 - Author: Raymond Hettinger (rhettinger) (Python committer) Date: 2021-07-29 21:29
I don't think this should be done.  

If case doesn't matter at all, the input can be casefolded before the replacement:

  s.casefold().replace('hippo', 'giraffe')

If it can't be casefolded in advance because the case actually matters, then it doesn't make sense to mix a case-insensitive search step with a case-sensitive replacement.
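
A small sketch of the casefold-first approach, to make the trade-off concrete. The function name fold_and_replace is an assumption for illustration; note that the case of the entire string is lost, not just the case of the replaced word.

```python
def fold_and_replace(text, old, new):
    """Hypothetical helper: casefold the whole input, then do a plain replace."""
    # casefold() is applied to the search term too, so 'HIPPO' and 'hippo' agree.
    return text.casefold().replace(old.casefold(), new)

print(fold_and_replace('I WANT A HIPPO FOR MY BIRTHDAY!', 'hippo', 'giraffe'))
# i want a giraffe for my birthday!
```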

Presumably if case matters at all in the original string, then the new text would need to match the case of the old text. With the example in the selected StackOverflow answer, we get undesirable output for most of the case variants:

 strings = [
     'I WANT A HIPPO FOR MY BIRTHDAY!',  # lowercase giraffe doesn't fit
     'I want a hippo for my birthday.',  # only makes sense when the case matches
     'I Want A Hippo for My Birthday',   # lowercase giraffe doesn't fit
     'I want a hIPpo for my birthday',   # desired outcome unknown
 ]
 for s in strings:
     print(s.replace('hippo', 'giraffe', case_insensitive=True))

ISTM that every answer in the StackOverflow entry has only toy examples and wouldn't make sense for real text where case is retained everywhere except for the substitutions.
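
Since the proposed case_insensitive keyword does not exist, the mismatch can be reproduced today with a regex stand-in. This is only a sketch, assuming re.IGNORECASE as a substitute for the proposed flag; the function name is made up.

```python
import re

def sub_ignore_case(text):
    """Stand-in for the proposed s.replace('hippo', 'giraffe', case_insensitive=True)."""
    return re.sub('hippo', 'giraffe', text, flags=re.IGNORECASE)

strings = [
    'I WANT A HIPPO FOR MY BIRTHDAY!',
    'I want a hippo for my birthday.',
    'I Want A Hippo for My Birthday',
    'I want a hIPpo for my birthday',
]
for s in strings:
    print(sub_ignore_case(s))
# I WANT A giraffe FOR MY BIRTHDAY!
# I want a giraffe for my birthday.
# I Want A giraffe for My Birthday
# I want a giraffe for my birthday
```

Only the second line reads naturally; in the others the lowercase replacement clashes with the surrounding case.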
msg398601 - Author: Eric V. Smith (eric.smith) (Python committer) Date: 2021-07-30 18:46
I agree with Raymond that this should be rejected.
msg398605 - Author: Serhiy Storchaka (serhiy.storchaka) (Python committer) Date: 2021-07-30 20:05
I concur with Raymond and Eric.

Note that in the general case the problem is more complex than you may expect. First, some characters can match two characters (e.g. 'ß' matches 'SS'), and therefore the indexes of characters differ between the case variants. Second, you may want to take Unicode normalization into account, so that 'й' will match 'й' (the former is a single character, the latter is two characters 'и'+'\u0306').

The re module will not help with solving the first problem. You should use the third-party regex package. For the second problem you can use unicodedata.normalize().
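
Both complications can be observed with the standard library alone. The following is a minimal sketch; the function names are made up for illustration, and it does not attempt full Unicode caseless matching.

```python
import re
import unicodedata

def length_change_after_casefold(text):
    """Return (len before, len after) to show that casefolding can shift indexes."""
    return len(text), len(text.casefold())

def equal_after_nfc(a, b):
    """Compare two strings after normalizing both to NFC."""
    return unicodedata.normalize('NFC', a) == unicodedata.normalize('NFC', b)

# First problem: 'ß' casefolds to the two characters 'ss'.
print('ß'.casefold())                          # ss
print(length_change_after_casefold('Straße'))  # (6, 7)
# re.IGNORECASE compares character by character, so 'ß' does not match 'SS'.
print(re.search('ß', 'SS', re.IGNORECASE))     # None

# Second problem: precomposed vs. decomposed 'й'.
composed = '\u0439'            # single code point
decomposed = '\u0438\u0306'    # 'и' followed by a combining breve
print(composed == decomposed)                  # False
print(equal_after_nfc(composed, decomposed))   # True
```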

History
Date                 User              Action  Args
2022-04-11 14:59:48  admin             set     github: 88936
2021-07-30 20:05:07  serhiy.storchaka  set     status: open -> closed
                                               nosy: + serhiy.storchaka
                                               messages: + msg398605
                                               resolution: rejected
                                               stage: resolved
2021-07-30 18:46:22  eric.smith        set     nosy: + eric.smith
                                               messages: + msg398601
2021-07-29 21:29:38  rhettinger        set     nosy: + rhettinger
                                               messages: + msg398520
2021-07-29 17:39:21  pablogsal         set     nosy: - pablogsal
2021-07-29 17:14:45  nimit.grover24    set     title: case_insensitive kwarg to str.replace() -> case_insensitive kwarg in str.replace()
2021-07-29 17:10:44  nimit.grover24    create