Title: Don't accept a negative number for the count argument in str.replace(old, new[,count])
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.9
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: Nosy List: opensource-assist, rhettinger, steven.daprano, xtreak
Priority: normal Keywords:

Created on 2020-01-11 13:42 by opensource-assist, last changed 2020-01-12 16:46 by opensource-assist. This issue is now closed.

Messages (7)
msg359795 - (view) Author: Aurora (opensource-assist) * Date: 2020-01-11 13:42
It's meaningless for the count argument to have a negative value, since there's no such thing as negative count for something.
msg359798 - (view) Author: Karthikeyan Singaravelan (xtreak) * (Python committer) Date: 2020-01-11 15:04
A negative value is an implementation detail: count < 0 behaves like "replace all" [0]. See also issue5416.

msg359805 - (view) Author: Steven D'Aprano (steven.daprano) * (Python committer) Date: 2020-01-11 15:57
This behaviour goes all the way back to Python 1.5, if not older, before strings even had methods:

    [steve@ando ~]$ python1.5
    Python 1.5.2 (#1, Aug 27 2012, 09:09:18)  [GCC 4.1.2 20080704 
    (Red Hat 4.1.2-52)] on linux2
    Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam
    >>> import string
    >>> string.replace("abacadaeaf", "a", "Z", -1)
    'ZbZcZdZeZf'

Hiding the fact that str.replace treats negative values as "replace all" just causes confusion, as people wrongly jump to the conclusion that it is a bug.

It's not a bug; it is a useful feature that has been in the language for over 20 years. VB.Net has the same feature.

Let's just document it as intentional and be done with it.
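
For anyone wanting to check this on a current interpreter, here is a minimal sketch using the str method rather than the old string module. The variable names are only illustrative; the calls themselves are plain str.replace.

```python
# Illustrative check of how str.replace treats its count argument.
text = "abacadaeaf"

all_default = text.replace("a", "Z")         # count omitted: replace all
all_minus_one = text.replace("a", "Z", -1)   # explicit -1: replace all
all_minus_five = text.replace("a", "Z", -5)  # any negative value: replace all
first_two = text.replace("a", "Z", 2)        # only the first two occurrences
none_replaced = text.replace("a", "Z", 0)    # zero means no replacements

print(all_default, all_minus_one, all_minus_five)  # ZbZcZdZeZf three times
print(first_two)                                   # ZbZcadaeaf
print(none_replaced)                               # abacadaeaf
```

Note that only a negative count means "replace all"; zero still means "replace nothing".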
msg359807 - (view) Author: Steven D'Aprano (steven.daprano) * (Python committer) Date: 2020-01-11 16:47
Sorry, I disagree that this is a mere implementation detail. The term "implementation detail" normally implies behaviour which occurs *by accident* due to the specific implementation, rather than being intentionally chosen.

A good example is early versions of list.sort(), which was stable for small lists only because the implementation happened to use insertion sort for small lists. Insertion sort wasn't chosen because it was stable; had the implementation changed to another sort, the behaviour would have changed. (Later on, the implementation did change, and stability became a documented and guaranteed feature.)
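
To make "stable" concrete, here is a small sketch with made-up records: items that compare equal under the sort key keep their original relative order. This shows the guarantee as it is documented today, not the old insertion-sort accident.

```python
# Made-up sample data: (name, count) pairs.
records = [("pear", 2), ("apple", 1), ("fig", 2), ("kiwi", 1)]

# Sort by count only. Because the sort is stable, "apple" stays ahead of
# "kiwi" and "pear" stays ahead of "fig", exactly as in the input.
by_count = sorted(records, key=lambda record: record[1])

print(by_count)
# [('apple', 1), ('kiwi', 1), ('pear', 2), ('fig', 2)]
```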

This is not what is happening here. The behaviour for a negative count doesn't "just happen by accident" due to other, unrelated, choices. It happens because the code intentionally tests for a negative count and replaces it with the maximum value possible:

    if (maxcount < 0)
        maxcount = PY_SSIZE_T_MAX;

and it is documented in the C source code as a comment to argument clinic:

    count: Py_ssize_t = -1
        Maximum number of occurrences to replace.
        -1 (the default value) means replace all occurrences.
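
As a rough pure-Python model of that clamp (a sketch only: replace_sketch is a hypothetical name, it assumes a non-empty old string, and it is not how CPython actually performs the search), the negative count is normalised once and the loop then needs no special case:

```python
import sys


def replace_sketch(text, old, new, count=-1):
    """Hypothetical model of the count handling; not CPython's real code."""
    if not old:
        raise ValueError("this sketch only handles a non-empty 'old'")
    # Mirror of the C snippet: a negative count becomes "as many as possible".
    if count < 0:
        count = sys.maxsize
    parts = []
    start = 0
    while count > 0:
        index = text.find(old, start)
        if index == -1:
            break  # no more occurrences
        parts.append(text[start:index])
        parts.append(new)
        start = index + len(old)
        count -= 1
    parts.append(text[start:])  # whatever is left after the last replacement
    return "".join(parts)


print(replace_sketch("abacadaeaf", "a", "Z", -1))  # ZbZcZdZeZf
print(replace_sketch("abacadaeaf", "a", "Z", 2))   # ZbZcadaeaf
```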

Some more evidence that this is intentional behaviour: in, there are various tests that -1 behaves the same as sys.maxsize, e.g.:

        EQ("ReyKKjaviKK", "Reykjavik", "replace", "k", "KK", -1)
        EQ("ReyKKjaviKK", "Reykjavik", "replace", "k", "KK", sys.maxsize)

That's not an isolated test, there are many of them.
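
The same equivalence can be checked outside the test suite. In this sketch the helper name same_as_maxsize is an assumption for illustration and is not part of Lib/test/:

```python
import sys


def same_as_maxsize(text, old, new):
    """Return True if count=-1 and count=sys.maxsize give the same result."""
    return text.replace(old, new, -1) == text.replace(old, new, sys.maxsize)


print(same_as_maxsize("Reykjavik", "k", "KK"))  # True
print("Reykjavik".replace("k", "KK", -1))       # ReyKKjaviKK
```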
msg359808 - (view) Author: Steven D'Aprano (steven.daprano) * (Python committer) Date: 2020-01-11 16:48
Oops, I meant Lib/test/ not "".
msg359809 - (view) Author: Karthikeyan Singaravelan (xtreak) * (Python committer) Date: 2020-01-11 16:52
Thanks for the details. I looked into the tests for this behavior too and agree it's a tested behavior. issue5416 already had a similar discussion, and the documentation change was committed and later reverted at Raymond's suggestion. So I will leave it to him.
msg359856 - (view) Author: Aurora (opensource-assist) * Date: 2020-01-12 16:46
Understood. Just as an afterthought:
I still disagree a little with such an implementation, because it leans so far into terse coding that it goes against the principles of mathematics, which is the basis of computer science and programming.
Python could use another special keyword or something (e.g. the Ellipsis notation) for this and all similar cases.
You'll get into trouble if you want to explain such a thing to a mathematician, or if you want to write some pseudo-code based on it; in both cases they're not going to look at the underlying implementation.
A bad practice in C, followed by CPython, has spread to others.
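
For illustration, the Ellipsis idea could look roughly like the wrapper below. This is a sketch only: strict_replace is a hypothetical helper, not an existing or proposed Python API, and str.replace itself is unchanged.

```python
def strict_replace(text, old, new, count=...):
    """Hypothetical wrapper: Ellipsis means 'replace all'; negatives are rejected."""
    if count is ...:
        return text.replace(old, new)
    if count < 0:
        raise ValueError("count must be non-negative, or ... for 'replace all'")
    return text.replace(old, new, count)


print(strict_replace("abacadaeaf", "a", "Z"))     # ZbZcZdZeZf
print(strict_replace("abacadaeaf", "a", "Z", 1))  # Zbacadaeaf
```

With this wrapper, strict_replace("abacadaeaf", "a", "Z", -1) raises ValueError instead of replacing everything.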
Date                 User               Action  Args
2020-01-12 16:46:12  opensource-assist  set     messages: + msg359856
2020-01-12 01:35:11  rhettinger         set     status: open -> closed
                                                resolution: not a bug
                                                stage: resolved
2020-01-11 16:52:24  xtreak             set     messages: + msg359809
2020-01-11 16:48:52  steven.daprano     set     messages: + msg359808
2020-01-11 16:47:32  steven.daprano     set     messages: + msg359807
2020-01-11 15:57:54  steven.daprano     set     nosy: + steven.daprano
                                                messages: + msg359805
2020-01-11 15:04:41  xtreak             set     nosy: + rhettinger, xtreak
                                                messages: + msg359798
2020-01-11 13:42:38  opensource-assist  create