classification
Title: split() string method has two splitting algorithms
Type: enhancement Stage:
Components: None Versions:
process
Status: closed Resolution: rejected
Dependencies: Superseder:
Assigned To: Nosy List: crackwitz, mcherm, rhettinger
Priority: normal Keywords:

Created on 2005-11-28 05:13 by crackwitz, last changed 2005-12-28 15:55 by mcherm. This issue is now closed.

Messages (3)
msg54685 - (view) Author: crackwitz (crackwitz) Date: 2005-11-28 05:13
The docs of Python 2.4.2 for .split([sep [,maxsplit]]) say:
"If sep is not specified or is None, a different
splitting algorithm is applied."
I would like to see that behavior exposed and
consistent, i.e. stripping (new key strip=...?)
independent of whether sep is None or not.
Making it consistent could break existing code because
people already built on split()'s special behavior.
You could make strip=None the default and keep the
sep-dependent switching only when strip is None.
I don't like this magic behavior though because there's
no reason for it to exist.

# this is now (Python v2.4.2)
' foo  bar '.split() # => ['foo', 'bar']
' foo  bar '.split(' ') # => ['', 'foo', '', 'bar', '']
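
The difference also shows up with mixed whitespace and with empty strings. The following is a small illustrative sketch, checked against current Python 3 rather than 2.4.2; the variable names are only for illustration:

```python
# Whitespace algorithm (sep omitted or None): runs of any whitespace
# act as one separator, and leading/trailing whitespace is dropped.
mixed_ws = 'a\t b\n'.split()        # ['a', 'b']

# Separator algorithm (explicit sep): only the exact separator counts,
# so the tab and newline stay attached to the fields.
mixed_sep = 'a\t b\n'.split(' ')    # ['a\t', 'b\n']

# The two algorithms also disagree on the empty string.
empty_ws = ''.split()               # []
empty_sep = ''.split(' ')           # ['']

print(mixed_ws, mixed_sep, empty_ws, empty_sep)
```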

# this is how I would like it to be
' foo  bar '.split(strip=True) # => ['foo', 'bar']
' foo  bar '.split(strip=False) # => ['', 'foo', '', 'bar', '']

# compatibility preserved (strip=None by default):
' foo  bar '.split(strip=None) # => ['foo', 'bar']
' foo  bar '.split(' ', strip=None) # => ['', 'foo', '', 'bar', '']

What do you think?
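
The proposed semantics can be approximated today with a helper function. This is only a rough sketch under stated assumptions: `split_with_strip` is a hypothetical name, not part of `str`; `maxsplit` is left out; and `strip=True` with an explicit separator is assumed to mean "drop empty fields", which the proposal does not spell out.

```python
import re


def split_with_strip(s, sep=None, strip=None):
    """Hypothetical helper emulating the proposed strip= keyword."""
    if strip is None:
        # Compatibility mode: let sep choose the algorithm, as str.split does.
        return s.split(sep)
    if strip:
        if sep is None:
            # Built-in whitespace algorithm already collapses and strips.
            return s.split()
        # Assumption: "stripping" with a separator means dropping empty fields.
        return [field for field in s.split(sep) if field]
    if sep is None:
        # No stripping: every single whitespace character is a separator.
        return re.split(r'\s', s)
    return s.split(sep)


print(split_with_strip(' foo  bar ', strip=True))
print(split_with_strip(' foo  bar ', strip=False))
print(split_with_strip(',a,,b,', ',', strip=True))
```

With `strip` left at `None`, the helper behaves exactly like the built-in method, which is the compatibility path described above.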
msg54686 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2005-11-30 04:10

While unfortunate, this function already suffers from excess
complexity.  Changing the behavior and adding backwards
compatibility features would make the situation worse.

Also, the two different behaviors were not accidental. 
Someone put them in for a reason.  There may be orthogonal
use cases.  Ideally, that need would have been met through
two different functions/methods.  But, if you change the
behavior, you're likely breaking an entire class of use cases.

So, I'm -1 on mucking with this prior to Py3.0.
msg54687 - (view) Author: Michael Chermside (mcherm) (Python triager) Date: 2005-12-28 15:55

I'm -1 on changing it at all. I do understand how the
unnecessary inconsistency can be grating, but it is clearly
documented, it is backward compatible, and it is useful.
Pragmatism wins out over theoretical consistency in my opinion.
History
Date                 User       Action  Args
2005-11-28 05:13:00  crackwitz  create