This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python Developer's Guide.

classification
Title: string.split maxsplit documented incorrectly
Type: enhancement
Stage: resolved
Components: Documentation
Versions: Python 2.7
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: ezio.melotti
Nosy List: Fj, docs@python, ezio.melotti, python-dev
Priority: normal
Keywords:

Created on 2012-05-09 11:30 by Fj, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Messages (6)
msg160273 - Author: Fj (Fj) Date: 2012-05-09 11:30
string.split documentation says:

> The optional third argument maxsplit defaults to 0. If it is nonzero, at most maxsplit number of splits occur, and the remainder of the string is returned as the final element of the list (thus, the list will have at most maxsplit+1 elements).

It lies! If you give it maxsplit=0 it doesn't do any splits at all! It should say:

> The optional third argument maxsplit defaults to **-1**. If it is **nonnegative**, at most maxsplit number of splits occur, ...
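For illustration, here is a minimal sketch of the behaviour described above, written against str.split (which string.split delegates to). The sample string is made up, and the snippet is written so that it runs on Python 3 as well as 2.7.

```python
s = "a b c d"

# maxsplit=0: zero splits are performed, so the whole string is the only element
no_splits = s.split(None, 0)

# maxsplit=-1 (the actual default): no limit on the number of splits
all_splits = s.split(None, -1)

# maxsplit=2: at most 2 splits, hence at most 3 elements
two_splits = s.split(None, 2)

print(no_splits)   # ['a b c d']
print(all_splits)  # ['a', 'b', 'c', 'd']
print(two_splits)  # ['a', 'b', 'c d']
```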

Additionally, it could specify default values in the function signature explicitly, like re.split does:

    string.split(s, sep=None, maxsplit=-1)

instead of

    string.split(s, [sep, [maxsplit]])

It seems that the inconsistency stems from a time long forgotten (certainly before 2.5), when string.split used the implementation in the now-obsolete stropmodule.c, which does indeed use maxsplit=0 (and on which the re.split convention was based, regrettably).

Currently string.split just calls str.split, and that uses maxsplit=-1 to mean unlimited splits.
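The two conventions can be compared side by side. This is only a sketch with a made-up input string; it shows that 0 means "no limit" for re.split but "no splits" for str.split.

```python
import re

s = "a b c"

# re.split: maxsplit=0 is the default and means "no limit"
re_result = re.split(r" ", s, maxsplit=0)

# str.split: maxsplit=0 means "perform zero splits"
str_zero = s.split(" ", 0)

# str.split: maxsplit=-1 is the default and means "no limit"
str_default = s.split(" ", -1)

print(re_result)    # ['a', 'b', 'c']
print(str_zero)     # ['a b c']
print(str_default)  # ['a', 'b', 'c']
```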

From searching "maxsplit" in the bug tracker I understand that split functions have had a rather difficult history and some quirks preserved for the sake of backward compatibility, and not documented for the sake of brevity. In this case, however, the documentation does try to document the particular behaviour, but is wrong, which is really confusing.

Also, maybe an even better fix would be to change the str.split documentation to use the proper signature (`str.split(sep=None, maxsplit=-1)`), and simply say that string.split(s, sep=None, maxsplit=-1) calls s.split(sep, maxsplit) here? Because that's what it does, while having _two_ different, incomplete, partially wrong explanations of the same thing is confusing!
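A rough sketch of the delegation described above. The name `string_split` is a hypothetical stand-in chosen for this example (the string.split function itself does not exist on Python 3), and the signature is the one proposed in this message rather than the one in the documentation.

```python
def string_split(s, sep=None, maxsplit=-1):
    """Hypothetical stand-in for string.split: simply delegate to str.split."""
    return s.split(sep, maxsplit)


print(string_split("a b c"))          # ['a', 'b', 'c']
print(string_split("a,b,c", ",", 1))  # ['a', 'b,c']
```

With the defaults spelled out in the signature, a reader can see that omitting maxsplit and passing -1 are the same call.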
msg160278 - Author: Roundup Robot (python-dev) (Python triager) Date: 2012-05-09 11:59
New changeset d3ddbad31b3e by Ezio Melotti in branch '2.7':
#14763: fix documentation for string.split/rsplit.
http://hg.python.org/cpython/rev/d3ddbad31b3e
msg160279 - Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2012-05-09 12:03
I fixed the doc for string.split/rsplit.  I didn't change the signature because all the other functions use the old signature convention (the one with []).  These functions are anyway deprecated, so I don't think it's worth spending more time improving their docs (as long as they are not wrong).
msg160284 - Author: Fj (Fj) Date: 2012-05-09 12:18
Thank you.

> These functions are anyway deprecated

Well, yes, but it's the only place you can get information about the default value of maxsplit, short of looking in the source. Which is kind of wrong.

Maybe you can also fix str.split docstring to say "If maxsplit is not specified *or negative*, then there is no limit on the number of splits"?
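A small check of the "not specified or negative" wording, as observed in CPython. The input string and the value -42 are arbitrary choices for the example.

```python
s = "a,b,c,d"

# maxsplit omitted: no limit on the number of splits
unspecified = s.split(",")

# maxsplit=-1: the default value, also no limit
minus_one = s.split(",", -1)

# any other negative value behaves the same way
other_negative = s.split(",", -42)

print(unspecified == minus_one == other_negative)  # True
```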
msg160337 - Author: Roundup Robot (python-dev) (Python triager) Date: 2012-05-10 12:33
New changeset 0415ecd7b0e3 by Ezio Melotti in branch '2.7':
#14763: document default maxsplit value for str.split.
http://hg.python.org/cpython/rev/0415ecd7b0e3

New changeset 62659067f5b6 by Ezio Melotti in branch '3.2':
#14763: document default maxsplit value for str.split.
http://hg.python.org/cpython/rev/62659067f5b6

New changeset bcc964092437 by Ezio Melotti in branch 'default':
#14763: merge with 3.2.
http://hg.python.org/cpython/rev/bcc964092437
msg160338 - Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2012-05-10 12:36
I have now documented it in the documentation of str.split too.  I left the docstrings alone since they don't need to be as exhaustive as the official documentation, and there's normally no reason to use -1 directly.
History
Date                 User          Action  Args
2022-04-11 14:57:30  admin         set     github: 58968
2012-05-10 12:36:38  ezio.melotti  set     messages: + msg160338
2012-05-10 12:33:21  python-dev    set     messages: + msg160337
2012-05-09 12:18:52  Fj            set     messages: + msg160284
2012-05-09 12:03:27  ezio.melotti  set     status: open -> closed, type: enhancement, assignee: docs@python -> ezio.melotti, nosy: + ezio.melotti, messages: + msg160279, resolution: fixed, stage: resolved
2012-05-09 11:59:33  python-dev    set     nosy: + python-dev, messages: + msg160278
2012-05-09 11:30:37  Fj            create