Author chris.jerdonek
Recipients chris.jerdonek, docs@python, jcea, ncoghlan, pitrou, r.david.murray
Date 2012-08-05.17:32:15
Message-id <1344187937.0.0.747301031876.issue15554@psf.upfronthosting.co.za>
In-reply-to
Content
> I wasn't really happy with the addition of that sentence about split in the first place.

I think the instinct to put that sentence in there is a good one.  It is a key, if perhaps subtle, difference.

> I don't understand what your splitlines examples are trying to say, they all look clear to me based on the fact that we are splitting *lines*.  

I perhaps included too many examples and so clouded my point. :)  I just needed one.  The examples were simply to show why the existing language is not correct.  The current language says, "if the string ends with line boundary characters the returned list does not have an empty last element."

However, the examples show strings that end with line boundary characters and yet *do* produce a list with an empty last element.

The point is that splitlines() does not count a terminal line break as an additional line, while split('\n') (for example) does.  But this is different from whether the returned list *has* an empty last element, which is what the current language says.

The returned list can have empty last elements because of line breaks at the end.  It's just that the one at the *very* end doesn't count towards that -- unlike the case for split():

>>> 'a'.splitlines()
['a']
>>> 'a\n'.splitlines()
['a']
>>> 'a\n\n'.splitlines()
['a', '']
>>> 'a\n\n\n'.splitlines()
['a', '', '']
>>> 'a\n\n\n'.split('\n')  # counts terminal line break as an extra line
['a', '', '', '']
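
To make the distinction concrete, here is a minimal sketch. It assumes the text uses only '\n' as a line break and ignores the other boundaries splitlines() recognizes, such as '\r\n'. The helper name splitlines_via_split is hypothetical. It is meant to show that, under that assumption, splitlines() matches split('\n') with just the one terminal empty string dropped:

```python
def splitlines_via_split(text):
    # Hypothetical helper: emulate str.splitlines() for text whose only
    # line break is a newline character (an assumption of this sketch).
    parts = text.split('\n')
    # Drop only the single empty string produced by a terminal line break
    # (or by an empty input); earlier empty strings are real empty lines.
    if parts and parts[-1] == '':
        parts.pop()
    return parts


for sample in ['', 'a', 'a\n', 'a\n\n', 'a\n\n\n', '\n', 'a\nb']:
    assert splitlines_via_split(sample) == sample.splitlines()
```

Only the last empty string is removed, so 'a\n\n' still yields an empty last element.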

I'm open to improving the language.  Maybe "does not count a terminal line break as an additional line" instead of the original "a terminal line break does not delimit an additional empty line"?

> There's another issue for creating a central description of universal-newline parsing, perhaps this entry could link to that discussion (and that discussion could perhaps mention splitlines).

I created that issue (issue 15543), and a patch is in the works along the lines you suggest. ;)

> The split behavior without a specified separator might actually be a bug (if so, it is not a fixable one), but in any case you are right that that clarification should be added if the existing sentence is kept.

Perhaps, but at least split() documents the behavior. :)

"runs of consecutive whitespace are regarded as a single separator, and the result will contain no empty strings at the start or end if the string has leading or trailing whitespace."

(from http://docs.python.org/dev/library/stdtypes.html#str.split )
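
For comparison, a short sketch of the documented split() behavior quoted above. The sample strings are chosen purely for illustration:

```python
text = '  a  b  '

# No separator: runs of whitespace collapse, and no empty strings
# appear at the start or end of the result.
no_sep = text.split()

# Explicit separator: every single space delimits a field, so empty
# strings appear wherever spaces are adjacent or at the edges.
with_sep = text.split(' ')

assert no_sep == ['a', 'b']
assert with_sep == ['', '', 'a', '', 'b', '', '']

# Trailing line breaks are whitespace too, so split() with no separator
# drops them entirely, while splitlines() keeps the empty line.
assert 'a\n\n'.split() == ['a']
assert 'a\n\n'.splitlines() == ['a', '']
```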
History
Date                 User            Action  Args
2012-08-05 17:32:17  chris.jerdonek  set     recipients: + chris.jerdonek, jcea, ncoghlan, pitrou, r.david.murray, docs@python
2012-08-05 17:32:17  chris.jerdonek  set     messageid: <1344187937.0.0.747301031876.issue15554@psf.upfronthosting.co.za>
2012-08-05 17:32:16  chris.jerdonek  link    issue15554 messages
2012-08-05 17:32:15  chris.jerdonek  create