classification
Title: Avoid string copy when split char doesn't match
Type: enhancement Stage:
Components: Interpreter Core Versions: Python 2.6
process
Status: closed Resolution: accepted
Dependencies: Superseder:
Assigned To: Nosy List: gvanrossum, skip.montanaro
Priority: normal Keywords: patch

Created on 2007-12-02 07:51 by skip.montanaro, last changed 2007-12-08 15:34 by skip.montanaro. This issue is now closed.

Files
File name Uploaded Description
string-split.patch skip.montanaro, 2007-12-04 01:03
Messages (4)
msg58081 - Author: Skip Montanaro (skip.montanaro) Date: 2007-12-02 07:51
The topic of avoiding string copies in certain string methods came up on
the ChiPy list:

  http://mail.python.org/pipermail/chicago/2007-December/002975.html.

The attached patch modifies the split and rsplit implementations to
avoid making a copy of self when the split fails to find anything to
split on:

    >>> s = "abc def"
    >>> x = s.split(';')
    >>> x[0] is s
    True
    >>> y = s.rsplit('-')
    >>> y[0] is s
    True
    >>> t = "abcdef"
    >>> x = t.split()
    >>> x[0] is t
    True
    >>> y = t.rsplit()
    >>> y[0] is t
    True

All tests pass.  Given that this is just a small optimization, I
don't believe any changes to the docs or the existing tests are
necessary.
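
The interactive checks above can be wrapped in a small helper. This is a minimal sketch and not part of the patch: `split_reuses_original` is a hypothetical name, and whether the original object is reused is a CPython implementation detail rather than documented behavior, so the result may differ between interpreters and versions.

```python
def split_reuses_original(s, sep=None):
    """Return True if s.split(sep) yields a one-element list holding s itself.

    Hypothetical helper for illustration only. It checks object identity,
    not equality, so a True result means no copy of s was made.
    """
    parts = s.split(sep)
    return len(parts) == 1 and parts[0] is s


# The separator is absent, so the list has one element equal to the input.
print("abc def".split(";") == ["abc def"])

# The separator is present, so the list has two elements and the helper
# returns False whatever the interpreter does about copies.
print(split_reuses_original("abc def", " "))
```

Code should compare the pieces with `==`; the identity check is only useful for observing the optimization.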
msg58145 - Author: Guido van Rossum (gvanrossum) (Python committer) Date: 2007-12-03 19:23
I think this is not quite right; you shouldn't return self unless it is
an exact string instance.  If it is a subclass of PyString, you should
make a copy.
msg58169 - Author: Skip Montanaro (skip.montanaro) Date: 2007-12-04 01:03
I'm not sure why a string subclass shouldn't work, but I've attached a new 
version of the patch that calls PyString_CheckExact() to prevent using a 
string subclass.
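
One way to see what the exact-type check protects is the sketch below. It assumes Python 3's `str` rather than the Python 2.6 `PyString` discussed here, and `TaggedStr` is a hypothetical subclass invented for illustration. Returning self unconditionally would put the subclass instance, with its extra state, into the result list, whereas string methods otherwise hand back plain strings.

```python
class TaggedStr(str):
    """Hypothetical str subclass that carries extra state."""

    def __new__(cls, value, tag=None):
        obj = super().__new__(cls, value)
        obj.tag = tag
        return obj


original = TaggedStr("abc def", tag="user-input")
parts = original.split(";")  # the separator does not occur in the string

# The single piece compares equal to the original...
print(parts[0] == original)        # True
# ...but it is a plain str, not the subclass instance,
print(type(parts[0]) is str)       # True
# so the subclass's extra attribute does not leak into the result.
print(hasattr(parts[0], "tag"))    # False
```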
msg58295 - Author: Skip Montanaro (skip.montanaro) Date: 2007-12-08 15:34
In the absence of any more feedback, I checked this in as r59420.
History
Date User Action Args
2007-12-08 15:34:09 skip.montanaro set status: open -> closed, resolution: accepted, messages: + msg58295
2007-12-04 01:03:45 skip.montanaro set files: - string-split.patch
2007-12-04 01:03:38 skip.montanaro set files: + string-split.patch, messages: + msg58169
2007-12-03 19:23:33 gvanrossum set nosy: + gvanrossum, messages: + msg58145
2007-12-02 07:51:02 skip.montanaro create