Issue 44773: case_insensitive kwarg in str.replace()


This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/88936

classification

Title:	case_insensitive kwarg in str.replace()
Type:	enhancement	Stage:	resolved
Components:	Parser	Versions:

process

Status:	closed	Resolution:	rejected
Dependencies:		Superseder:
Assigned To:		Nosy List:	eric.smith, lys.nikolaou, nimit.grover24, rhettinger, serhiy.storchaka
Priority:	normal	Keywords:

Created on 2021-07-29 17:10 by nimit.grover24, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Messages (4)
msg398501 - (view)	Author: Nimboss (nimit.grover24)	Date: 2021-07-29 17:10
Currently str.replace() takes three arguments: old, new and count. This issue proposes adding a fourth argument, case_insensitive (a bool defaulting to False), which determines whether case is ignored when searching for the text to replace. At present a case-insensitive replace requires a regex or custom logic (see https://stackoverflow.com/questions/919056/case-insensitive-replace), so a case_insensitive kwarg would be a nice QoL feature to have.
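For reference, the regex workaround mentioned above can be sketched roughly as follows. The helper name replace_ignore_case is hypothetical and not part of the standard library; it only wraps re.sub() with re.IGNORECASE and is one possible way to get the behaviour the proposed kwarg would provide.

```python
import re


def replace_ignore_case(text, old, new, count=0):
    """Hypothetical helper: replace `old` with `new`, ignoring case.

    count=0 means "replace every occurrence", following re.sub().
    """
    # re.escape() makes `old` match literally instead of as a pattern.
    pattern = re.compile(re.escape(old), re.IGNORECASE)
    # A function replacement keeps `new` literal (no backslash processing).
    return pattern.sub(lambda match: new, text, count=count)


print(replace_ignore_case('I want a hIPpo for my birthday', 'hippo', 'giraffe'))
# I want a giraffe for my birthday
```

Note that the replacement text is inserted exactly as given, whatever the case of the matched text.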
msg398520 - (view)	Author: Raymond Hettinger (rhettinger) *	Date: 2021-07-29 21:29
I don't think this should be done.

If case doesn't matter at all, the input can be casefolded before the replacement: s.casefold().replace('hippo', 'giraffe').

If it can't be casefolded in advance because the case actually matters, then it doesn't make sense to mix a case-insensitive search step with a case-sensitive replacement. Presumably if case matters at all in the original string, then the new text would need to match the case of the old text.

With the example in the selected StackOverflow answer, we get undesirable output for most of the case variants:

    strings = [
        'I WANT A HIPPO FOR MY BIRTHDAY!',    # lowercase giraffe doesn't fit
        'I want a hippo for my birthday.',    # only makes sense when the case matches
        'I Want A Hippo for My Birthday',     # lowercase giraffe doesn't fit
        'I want a hIPpo for my birthday',     # desired outcome unknown
    ]
    for s in strings:
        print(s.replace('hippo', 'giraffe', case_insensitive=True))

ISTM that every answer in the StackOverflow entry has only toy examples and wouldn't make sense for real text where case is retained everywhere except for the substitutions.
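To make the two cases in this message concrete, here is a minimal sketch that runs today. The function names are hypothetical, and search_ignoring_case() is only a stand-in (built on re.sub()) for what the proposed kwarg would presumably do, since str.replace() has no such argument.

```python
import re

strings = [
    'I WANT A HIPPO FOR MY BIRTHDAY!',
    'I want a hippo for my birthday.',
    'I Want A Hippo for My Birthday',
    'I want a hIPpo for my birthday',
]


def fold_then_replace(s, old, new):
    # Case does not matter: casefold the whole string, then replace.
    return s.casefold().replace(old, new)


def search_ignoring_case(s, old, new):
    # Assumed stand-in for the proposal: case-insensitive search,
    # replacement text inserted exactly as written.
    return re.sub(re.escape(old), lambda match: new, s, flags=re.IGNORECASE)


for s in strings:
    print(fold_then_replace(s, 'hippo', 'giraffe'))
    print(search_ignoring_case(s, 'hippo', 'giraffe'))
```

For the first string this prints 'i want a giraffe for my birthday!' and then 'I WANT A giraffe FOR MY BIRTHDAY!'; the second line shows the lowercase replacement sitting in otherwise uppercase text.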
msg398601 - (view)	Author: Eric V. Smith (eric.smith) *	Date: 2021-07-30 18:46
I agree with Raymond that this should be rejected.
msg398605 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2021-07-30 20:05
I concur with Raymond and Eric. Note that in the general case the problem is more complex than you may expect. First, some characters can match two characters (e.g. 'ß' matches 'SS'), and therefore the indexes of characters differ between cases. Second, you may want to take Unicode normalization into account, so that 'й' will match 'й' (the former is a single character, the latter is two characters 'и'+'\u0306'). The re module will not help with solving the first problem; you should use the third-party regex package. For the second problem you can use unicodedata.normalize().
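Both points can be checked with the standard library alone. The sketch below is illustrative only: same_text() is a hypothetical helper, and normalizing to NFC before casefolding is a simplifying assumption rather than a complete caseless-matching algorithm.

```python
import re
import unicodedata


def same_text(a, b):
    """Hypothetical helper: compare after NFC normalization and casefolding."""
    def norm(s):
        return unicodedata.normalize('NFC', s).casefold()
    return norm(a) == norm(b)


# First problem: casefolding can change the length, so indexes shift.
print('Straße'.casefold())                        # strasse
print(len('Straße'), len('Straße'.casefold()))    # 6 7
# re.IGNORECASE does not match one character against two.
print(re.search('ß', 'STRASSE', re.IGNORECASE))   # None

# Second problem: composed and decomposed forms differ until normalized.
composed = '\u0439'           # CYRILLIC SMALL LETTER SHORT I
decomposed = '\u0438\u0306'   # CYRILLIC SMALL LETTER I + COMBINING BREVE
print(composed == decomposed)                                  # False
print(unicodedata.normalize('NFC', decomposed) == composed)    # True
print(same_text('Straße', 'STRASSE'), same_text(composed, decomposed))  # True True
```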

History
Date	User	Action	Args
2022-04-11 14:59:48	admin	set	github: 88936
2021-07-30 20:05:07	serhiy.storchaka	set	status: open -> closed nosy: + serhiy.storchaka messages: + msg398605 resolution: rejected stage: resolved
2021-07-30 18:46:22	eric.smith	set	nosy: + eric.smith messages: + msg398601
2021-07-29 21:29:38	rhettinger	set	nosy: + rhettinger messages: + msg398520
2021-07-29 17:39:21	pablogsal	set	nosy: - pablogsal
2021-07-29 17:14:45	nimit.grover24	set	title: case_insensitive kwarg to str.replace() -> case_insensitive kwarg in str.replace()
2021-07-29 17:10:44	nimit.grover24	create