classification
Title: sum() function docstring lists arguments incorrectly
Type: enhancement
Stage: resolved
Components: Documentation
Versions: Python 2.7
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: Mariatta
Nosy List: Mariatta, docs@python, r.david.murray, rhettinger, serhiy.storchaka, vsinitsyn, wolma
Priority: low
Keywords: easy, patch

Created on 2015-03-27 11:24 by vsinitsyn, last changed 2017-06-06 16:12 by Mariatta. This issue is now closed.

Files
File name Uploaded Description
sum_doc.diff rhettinger, 2015-03-27 19:36 Change sequence to iterable
Pull Requests
URL Status Linked
PR 1859 merged Mariatta, 2017-05-30 02:39
Messages (10)
msg239388 - Author: Valentine Sinitsyn (vsinitsyn) Date: 2015-03-27 11:24
The sum() function docstring describes the expected arguments as follows (Python 2.7.6):

sum(...)
    sum(sequence[, start]) -> value
...

This implies sum() should accept str, unicode, list, tuple, bytearray, buffer, and xrange. However, you clearly can't use this function to sum strings (which is also mentioned in the docstring):

>>> sum('abc')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'int' and 'str'

I'd suggest describing the first argument as an iterable, which is what sum() actually expects there.
msg239408 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2015-03-27 14:53
In Python 3 the docstring does say iterable.  It wouldn't be a bad thing to change it in 2.7, but it is not much of a priority.  Iterable vs. sequence makes no difference to the str question: a string is an iterable.  The docstring explicitly says strings are excepted, as you mentioned, so there's nothing to do about that.

I note that Python 3 also does not support iterables of bytes-like objects.  I'm not sure if this would actually be helpful to add to the docstring, though, since sum(b'abc') works and a docstring is probably not an appropriate place to go into detail as to why.
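
For illustration only, here is a minimal Python 3 sketch of the distinction above: summing over a single bytes object works because its elements are ints, while summing an iterable of bytes objects with a bytes start value is rejected. The variable names are made up for the example.

```python
# Python 3: iterating over a bytes object yields ints, so sum() can add them.
total = sum(b'abc')  # adds 97 + 98 + 99

# An iterable *of* bytes objects is a different case: sum() raises TypeError
# when the start value is a bytes object.
try:
    sum([b'a', b'b'], b'')
    rejected = False
except TypeError:
    rejected = True

# b''.join() is the supported way to concatenate bytes objects.
joined = b''.join([b'a', b'b'])
```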
msg239413 - Author: Valentine Sinitsyn (vsinitsyn) Date: 2015-03-27 16:02
Yes, strings aren't an issue. I only used them as an example.

I came across this issue during code review, while discussing whether it is okay to pass a generator expression to sum() (like sum(x*2 for x in xrange(5))) or whether it is better to convert it to a list first (sum([x*2 for x in xrange(5)])). Both variants work, so the docstring is a sort of specification here.
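
As a rough sketch of the two variants from that review, written for Python 3 (so range stands in for the xrange used above; the variable names are illustrative):

```python
# Generator expression: sum() consumes the values one by one,
# and no intermediate list is built.
from_generator = sum(x * 2 for x in range(5))

# List comprehension: the full list is materialized first, then summed.
from_list = sum([x * 2 for x in range(5)])

# Both calls add the same values, 0 + 2 + 4 + 6 + 8.
same_result = from_generator == from_list
```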

Surely, it's not a high-priority task anyway.
msg239468 - Author: Wolfgang Maier (wolma) * Date: 2015-03-28 22:58
>This implies sum() should accept str, unicode, list, tuple, bytearray, buffer, and xrange.

and in fact it *does* accept all these as input. It just refuses to add the elements of the sequence if these elements are of certain types. Of course, the elements of a string are strings themselves so this does not work:

>>> sum('abc', '')
Traceback (most recent call last):
  File "<pyshell#88>", line 1, in <module>
    sum('abc', '')
TypeError: sum() can't sum strings [use ''.join(seq) instead]


Compare with a bytes sequence in Python 3, where the elements are ints:

>>> sum(b'abc', 0)
294


But strings are also perfectly acceptable as input if you do not try to add their str elements, but something else:

>>> class X (int):
    def __add__(self, other):
        return X(ord(other) + self)

>>> sum('abc', X(0))
294

=> the docs are right and there is no issue here.
msg239508 - Author: Valentine Sinitsyn (vsinitsyn) Date: 2015-03-29 19:07
Seems like mentioning string was really a bad idea. They were only used as a (poor) example; forget them if they are confusing in any way.

In my understanding, any sequence in Python is iterable, but not all iterables are sequences (correct me if I'm wrong). The purpose of my suggestion, then, is to say explicitly that sum() accepts iterables. In its current form, the docstring reads as if it doesn't, which is why I considered it [subtly] wrong.
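
A small Python 3 sketch of that sequence/iterable relationship, using the ABCs from collections.abc. The doubled() generator is a hypothetical helper written only for this example.

```python
from collections.abc import Iterable, Sequence


def doubled(n):
    """Hypothetical generator, used only for illustration."""
    for i in range(n):
        yield i * 2


gen = doubled(5)

# A generator is iterable but is not a sequence (no indexing, no len()).
gen_is_iterable = isinstance(gen, Iterable)
gen_is_sequence = isinstance(gen, Sequence)

# A list is both a sequence and an iterable.
values = [0, 2, 4]
list_is_both = isinstance(values, Iterable) and isinstance(values, Sequence)

# sum() only needs iteration, so it accepts the generator directly.
total = sum(gen)
```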
msg291670 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-04-14 15:56
Raymond, could you open a pull request?
msg291767 - Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2017-04-16 22:02
[Serhiy]
> Raymond, could you open a pull request?

Perhaps you could do it for me.  I still haven't had time to wrestle with the github switchover, so I'm effectively crippled for a while.

[Valentine]
> Seems like mentioning string was really a bad idea .... that's
> why I considered the docstring [subtly] wrong.

Not really wrong in a way that confuses typical users.  That docstring has been successfully communicating the basic API for over a decade.

Over time, the docs have slowly converted the old "sequence" references to "iterable".  The docs were never really wrong; instead, we just got more precise about what we meant by sequence versus iterable (i.e. before the ABCs were introduced, the term "sequence" was used in a somewhat generic way to mean "a succession of data values").

Also note, it is an interesting paradox that the docstrings that are most helpful to most people most of the time are brief and a little loose with terminology.  In general, they reward those who are doing quick lookups for API reminders, but do not reward pedantic close readings.

We'll go ahead and change "sequence" to "iterable" for sum(), but I think that is only a minor win.  The change makes it more technically correct but less friendly to some users (i.e. people need to be taught what "iterable" means while they tend to get the notion of "sequence of values" without any training).

As far as the exclusion of strings goes, there was plenty of debate about whether to allow them or to more broadly disallow many data types where summing works quadratically.  The final decision was made by the BDFL and it seems to have been the right decision for just about everyone.  You can take issue with his decision, but that would be pointless.
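
To make the "summing works quadratically" point concrete, here is a hedged Python 3 sketch. The sample data and variable names are invented, and itertools.chain is offered as one common linear alternative, not as something proposed in this issue.

```python
from itertools import chain

rows = [[1, 2], [3], [4, 5, 6]]

# sum() accepts lists, but every + builds a brand-new list,
# so the total copying grows quadratically with the input size.
flat_with_sum = sum(rows, [])

# chain.from_iterable walks each element once, which is linear.
flat_with_chain = list(chain.from_iterable(rows))

# Strings are the case sum() refuses outright, pointing to str.join.
try:
    sum(['a', 'b'], '')
    strings_rejected = False
except TypeError:
    strings_rejected = True

joined = ''.join(['a', 'b'])
```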
msg291775 - Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2017-04-17 03:08
I believe this is just a 2.7 issue.
msg295273 - Author: Mariatta Wijaya (Mariatta) * (Python committer) Date: 2017-06-06 16:12
New changeset 536209ef92f16ea8823209a3c4b8763c0ec5d4bc by Mariatta in branch '2.7':
bpo-23787: Change sum() docstring from sequence to iterable (GH-1859)
https://github.com/python/cpython/commit/536209ef92f16ea8823209a3c4b8763c0ec5d4bc
msg295274 - Author: Mariatta Wijaya (Mariatta) * (Python committer) Date: 2017-06-06 16:12
Raymond's patch has been applied to the 2.7 branch.
Thanks :)
History
Date User Action Args
2017-06-06 16:12:54 Mariatta set status: open -> closed; resolution: fixed; messages: + msg295274; stage: patch review -> resolved
2017-06-06 16:12:05 Mariatta set messages: + msg295273
2017-05-30 02:39:33 Mariatta set stage: patch review
2017-05-30 02:39:14 Mariatta set pull_requests: + pull_request1942
2017-04-17 03:08:24 rhettinger set messages: + msg291775; versions: - Python 3.5, Python 3.6, Python 3.7
2017-04-16 23:39:06 Mariatta set assignee: docs@python -> Mariatta; nosy: + Mariatta; versions: + Python 3.5, Python 3.6, Python 3.7
2017-04-16 22:02:12 rhettinger set status: pending -> open; nosy: + rhettinger; messages: + msg291767
2017-04-14 15:56:11 serhiy.storchaka set status: open -> pending; priority: normal -> low; nosy: + serhiy.storchaka; messages: + msg291670; keywords: + easy
2015-03-29 19:07:21 vsinitsyn set messages: + msg239508
2015-03-28 22:58:28 wolma set nosy: + wolma; messages: + msg239468
2015-03-27 19:36:03 rhettinger set files: + sum_doc.diff; keywords: + patch
2015-03-27 16:02:47 vsinitsyn set messages: + msg239413
2015-03-27 14:53:25 r.david.murray set nosy: + r.david.murray; messages: + msg239408
2015-03-27 11:24:06 vsinitsyn create