Title: PEP 616: Add str methods to remove prefix or suffix
msg363958 - (view) Author: Dennis Sweeney (Dennis Sweeney) * Date: 2020-03-11 19:11
Following discussion here ( ), there is a proposal to add new methods str.cutprefix and str.cutsuffix to alleviate the common misuse of str.lstrip and str.rstrip.
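
As a small illustration of the misuse mentioned above (the example strings are made up for this sketch, not taken from the discussion), str.lstrip and str.rstrip treat their argument as a set of characters rather than as a substring:

```python
# str.lstrip / str.rstrip take a *set of characters*, not a substring.
# The example strings below are illustrative only.
url = "www.wikipedia.org"
stripped = url.lstrip("www.")   # strips every leading 'w' or '.' character
print(stripped)                 # ikipedia.org

filename = "happy.py"
base = filename.rstrip(".py")   # strips every trailing '.', 'p' or 'y'
print(base)                     # ha
```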

I think sticking with the most basic possible behavior

    def cutprefix(self: str, prefix: str) -> str:
        if self.startswith(prefix):
            return self[len(prefix):]
        # return a copy to work for bytearrays
        return self[:]

    def cutsuffix(self: str, suffix: str) -> str:
        if self.startswith(suffix):
            # handles the "[:-0]" issue
            return self[:len(self)-len(suffix)]
        return self[:]

would be best (refusing to guess in the face of ambiguous multiple arguments). Someone can do, e.g.

    >>> 'foo.tar.gz'.cutsuffix('.gz').cutsuffix('.tar')

to cut off multiple suffixes. More complicated behavior for multiple arguments could be added later, but it would be easy to make a mistake in prematurely generalizing right now.
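
For anyone who wants to try the behavior today, here is a minimal runnable sketch of the same idea written as free functions instead of methods (an assumption made so that it runs on any Python 3). It assumes the suffix check is meant to use endswith, and it is not the final implementation:

```python
def cutprefix(s: str, prefix: str) -> str:
    """Return s without the leading prefix, or s unchanged if absent."""
    if s.startswith(prefix):
        return s[len(prefix):]
    return s[:]


def cutsuffix(s: str, suffix: str) -> str:
    """Return s without the trailing suffix, or s unchanged if absent."""
    if s.endswith(suffix):
        # Slicing by explicit length avoids s[:-0], which would be empty.
        return s[:len(s) - len(suffix)]
    return s[:]


name = "foo.tar.gz"
print(cutsuffix(cutsuffix(name, ".gz"), ".tar"))  # foo
print(cutsuffix(name, ".zip"))                    # foo.tar.gz
print(cutsuffix(name, ""))                        # foo.tar.gz
print(cutprefix(name, "foo."))                    # tar.gz
```

Chaining two single-argument calls keeps the order of removal explicit, which is the ambiguity a multi-argument version would have to resolve.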

In bikeshedding method names, I think that avoiding the word "strip" would be nice so users can have a consistent feeling that "'strip' means character sets; 'cut' means substrings".
msg364020 - (view) Author: Steven D'Aprano (steven.daprano) * (Python committer) Date: 2020-03-12 13:02
To be clear, are you only making a copy of the unchanged object if it is a mutable bytearray, not str or bytes?
msg364028 - (view) Author: Dennis Sweeney (Dennis Sweeney) * Date: 2020-03-12 16:37

    >>> x = "A"*10**6
    >>> x.cutprefix("B") is x
    >>> x.cutprefix("") is x

    >>> y = b"A"*10**6
    >>> y.cutprefix(b"B") is y
    >>> y.cutprefix(b"") is y

    >>> z = bytearray(b"A")*10**6
    >>> z.cutprefix(b"B") is z
    >>> z.cutprefix(b"") is z

I'm not sure whether this should be part of the spec or an implementation detail. The (str/bytes).replace method docs don't clarify this, but they have the same behavior:

    >>> x = "A"*10**6
    >>> x.replace("B", "C") is x
    >>> x.replace("", "") is x

    >>> y = b"A"*10**6
    >>> y.replace(b"B", b"C") is y
    >>> y.replace(b"", b"") is y

    >>> z = bytearray(b"A")*10**6
    >>> z.replace(b"B", b"C") is z
    >>> z.replace(b"", b"") is z
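
A hedged sketch of how portable code might treat this: compare with == and treat identity as informational only. The one identity fact the docs do state is that bytearray methods always produce a new object, even when nothing changes. The variable names mirror the sessions above, with shorter strings.

```python
# Whether an unchanged result is the *same object* is an implementation
# detail for str and bytes; only equality should be relied on.
x = "A" * 10
y = b"A" * 10
z = bytearray(b"A") * 10

for obj, old, new in [(x, "B", "C"), (y, b"B", b"C"), (z, b"B", b"C")]:
    result = obj.replace(old, new)
    # Prints: type name, equality (always True here), identity (may vary).
    print(type(obj).__name__, result == obj, result is obj)
```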
msg364277 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2020-03-16 02:52
Guido, do you support this API expansion?
msg364284 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2020-03-16 04:32
I stopped following the discussion at some point, but I think this is worth adding -- I have seen this done over and over again, and apparently lots of other people have felt the need too.

I think these names are fine, and about the best we can do (keeping in line with the "feel" of the rest of the string API).

I like the behavior of returning a copy of the string if there's no match (as opposed to failing, which was also brought up). If the original object is immutable, this should return the original object, but that should be considered a CPython optimization (IIRC all the string methods are pretty careful about that), not something required by the spec.

FWIW the pseudo code has a copy/paste error: In cutsuffix() it should use endswith() rather than startswith().
msg364313 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-03-16 13:04
The proposed change will affect many builtin types: bytes, bytearray, str, but also other types like collections.UserString. Would it make sense to summarize what has been said in the python-ideas thread into a PEP? It may be good to specify things like:

    >>> x = "A"*10**6
    >>> x.cutprefix("B") is x

The specification can be just "that's an implementation detail" or "CPython implementation specific" :-)

I don't expect such a PEP to be long or controversial, but it may help to write it down.
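
To make the collections.UserString point concrete, here is a rough sketch of what such a method could look like on a wrapper type. The CutString subclass is hypothetical and exists only for illustration; the stdlib class would presumably gain the method directly.

```python
from collections import UserString


class CutString(UserString):
    """Hypothetical subclass used only to sketch the proposed method."""

    def cutprefix(self, prefix):
        # Accept either a plain str or another UserString as the prefix.
        if isinstance(prefix, UserString):
            prefix = prefix.data
        if self.data.startswith(prefix):
            return self.__class__(self.data[len(prefix):])
        # No match: return an equal object of the same class.
        return self.__class__(self.data)


s = CutString("foobar")
print(s.cutprefix("foo"))             # bar
print(s.cutprefix(CutString("baz")))  # foobar
print(type(s.cutprefix("foo")).__name__)  # CutString
```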
msg364581 - (view) Author: Dennis Sweeney (Dennis Sweeney) * Date: 2020-03-19 01:14
If no one has started, I can draft such a PEP.
msg364582 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2020-03-19 01:25
Sounds good.
msg364643 - (view) Author: Dennis Sweeney (Dennis Sweeney) * Date: 2020-03-20 00:37
Here is a draft PEP -- I believe it needs a Core Developer sponsor now?
msg364657 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-03-20 08:27
The PEP is a good start. Can you try to convert it to a PR on ? It seems like the next available PEP number is 616. I would prefer to leave comments on a PR.
msg364664 - (view) Author: Walter Dörwald (doerwalter) * (Python committer) Date: 2020-03-20 10:42
IMHO the names don't fit Python's current naming scheme, so what about naming them "lchop" and "rchop"?
msg364671 - (view) Author: Dennis Sweeney (Dennis Sweeney) * Date: 2020-03-20 14:23
msg364681 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-03-20 16:53

Thank you. And good luck handling incoming discussions on the PEP ;-)
msg364701 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-03-20 18:28
Where should I leave comments on the PEP? Do you plan to post it on python-dev soon?
msg364703 - (view) Author: Dennis Sweeney (Dennis Sweeney) * Date: 2020-03-20 18:52
Just posted it.
msg365036 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-03-26 00:25
Dennis Sweeney wrote
