classification
Title: Sum() doc and behavior mismatch
Type: Stage: commit review
Components: Documentation Versions: Python 3.2, Python 3.1, Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: rhettinger Nosy List: belopolsky, docs@python, ezio.melotti, georg.brandl, lukasz.langa, lvogt, rhettinger, terry.reedy
Priority: low Keywords: patch

Created on 2009-12-06 00:50 by terry.reedy, last changed 2010-10-31 21:30 by rhettinger. This issue is now closed.

Files
File name Uploaded Description Edit
functions.rst.patch lvogt, 2010-05-18 23:35 documentation patch
functions.rst.2,patch terry.reedy, 2010-07-25 22:34 Alternate doc patch
functions.rst.3.patch terry.reedy, 2010-07-25 23:38
functions.rst.patch5.txt lvogt, 2010-07-29 18:11
Messages (21)
msg96013 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2009-12-06 00:50
"sum(iterable[, start]) 
Sums start and the items of an iterable from left to right and returns
the total. start defaults to 0. The iterable‘s items are normally
numbers, and are not allowed to be strings."

The last sentence is not currently true (3.1, assume also others).
It is the start value that cannot be a string.

>>> sum([1,2],'')
TypeError: sum() can't sum strings [use ''.join(seq) instead]

>>>sum(['xyz', 'pdq'], Zero()) # R Hettinger's universal zero class
'xyzpdq'
works because start is not a string

>>> sum(['a','b'])
TypeError: unsupported operand type(s) for +: 'int' and 'str'
passes type(start) != str and only fails because + fails.

I am filing this as a doc issue as the easiest fix for the discrepancy
between doc and behavior. But the fix could be a behavior change, though
certain to be controversial. Given that, my  suggested revision is:
"The iterable‘s items are normally numbers. The start value is not
allowed to be a string."

I think this fits the followup sentence: "The fast, correct way to
concatenate a sequence of strings..."
msg96087 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2009-12-07 23:38
There are also a couple more things that could be improved in the
documentation of sum():
1) add a note about the "performance trap" mentioned by Alex [1]
2) remove the reduce() example because, even if it's true for that
particular example, it's not always true for the general case (in the
py3 doc it has been removed already).

See also [2] and the other messages in that thread.

[1]: http://mail.python.org/pipermail/python-dev/2003-October/039529.html
[2]: http://mail.python.org/pipermail/python-dev/2003-October/039586.html
msg106014 - (view) Author: Leonhard Vogt (lvogt) Date: 2010-05-18 23:35
I changed the documentation regarding string not allowed as start argument and performance

I included the list concatenation with itertools.chain from
http://groups.google.com/group/comp.lang.python/msg/33e764d0ac41826a

patch is based on revision 81300 in py3k branch.
msg111555 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2010-07-25 19:22
A nitpick:

"lol" is a very well-known acronym and "list of lists" is not the expansion that first comes to mind.
msg111561 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2010-07-25 22:01
Leonard, that you for the patch, in particular, the list of hints. I think the first is good but overly specific. It also has an extra '.itertools' in the path. Now:

'''
+      - To concatenate a list of lists ``lol`` use ``list(itertools.chain.from_iterable(lol))``
+        (see :func:`from_iterable <itertools.itertools.chain.from_iterable>`).
'''

How about instead"
'''
+      - To concatenate an iterable of iterables, such as a list of lists,
+        see :func:`from_iterable <itertools.chain.from_iterable>`
+        and call the appropriate constructor on the result, such as :class: `list`.
'''
?
msg111562 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2010-07-25 22:05
Note that using `~itertools.chain.from_iterable` is equivalent to `from_iterable <itertools.chain.from_iterable>` and saves a repetition.
msg111567 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2010-07-25 22:34
In order to have a chance at 2.6.6rc, due out in a week, I hand-edited the patch, incorporating Georg's suggestion, and uploaded. I think it is ready to commit.

Since this is not a critical doc fix, I am not marking it a blocker, but I still think it would be good to get it in.

Raymond, since you have not posted anything, I am assuming that your main interest was that the code not be changed, at least not without your notice. Since this is staying a doc issue, I am re-assigning.
msg111568 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2010-07-25 22:52
FWIW, I like the new patch better, but still have a few nitpicks:

- Starting a sentence with an argument name is a bit awkward because that makes a sentence that starts with a lower case letter.  

- There is an extra space in :class: `list`.

- It would make more boring prose, but I would prefer to see similar language used in each alternative.  I.e.

   - To concatenate ...
   - To concatenate ...
   - To add ...

- Why "an *iterable* of iterables" but "a *sequence* of strings"?

- I am not sure it is correct to say "to concatenate an iterable", I think it should be "to concatenate items from an iterable."
msg111576 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2010-07-25 23:38
patch3 incorporates all your suggestions.

I verified the itertools hint:
>>> list(itertools.chain.from_iterable([[10],[1,2],['a']]))
[10, 1, 2, 'a']

I think the hints are clear enough; anyone with more questions should follow the link to the respective function doc.
msg111579 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2010-07-25 23:44
I want to look at this more before it goes forward.
The docs are roughly correct as-is.  In a effort
to make them precise, it is easy make them more
confusing or at least a bit harder to understand
the basic function of sum().
msg111663 - (view) Author: Leonhard Vogt (lvogt) Date: 2010-07-26 19:48
Thank you.

I think the specific list of list example is better for the sum documentation because lists support the + operator. I don't think
that someone would consider using sum for chaining arbitrary iterables.
What about a concise "To concatenate lists use itertools.chain.from_iterable." and letting the "apply list constructor" part as an exercise to the reader?

I did the explicit reference to itertools.itertools.chain.from_iterable since the default or the explicit link to itertools.chain.from_iterable did not produce a correct link in the html docs.
I applied patch3 and ran make html; there is no html link to from_iterable produced.
msg111743 - (view) Author: Leonhard Vogt (lvogt) Date: 2010-07-27 22:35
another patch:
- moved string case to first position, i think it's the most important.
- reworded (shortened) list case.
- wrapped for <80 caracter lines.

still using itertools.itertools.chain.from_iterable as mentioned in previous message. I missed georgs use of ~ in the link, but i think its clearer to mention the full name of the function anyway.
msg111748 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2010-07-27 23:24
You could also use start=0 in the signature.
msg111835 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2010-07-28 18:35
I am puzzled that the fake name 'itertools.itertools.chain.from_iterable' works better than the real name 'itertools.chain.from_iterable'. Some bug in the tool chain?
msg111939 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2010-07-29 14:37
from_iterable() was marked up wrongly in the itertools docs; this is fixed now in r83230.
msg111982 - (view) Author: Leonhard Vogt (lvogt) Date: 2010-07-29 18:11
Thank you Georg, I updated the patch

Ezio, wouldn't  start=0  in the signature  imply that sum accepted a keyword argument? I read "Documenting Python" (4.3) but am not sure about the distinction of default values or keyword arbuments.

>>> sum([1,2,3], start=4)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: sum() takes no keyword arguments
msg111983 - (view) Author: Łukasz Langa (lukasz.langa) (Python committer) Date: 2010-07-29 18:13
The oldest interfaces tend to have quirks like that. Similar issue: #9379. Won't be changed as well. I'm thinking of a better docstring though.
msg112004 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2010-07-29 21:00
For Python functions, all arguments can be keyword arguments. For many builtins, none can be (and maybe there should be a note to that effect at the top of the built-in functions chapter). 'Param = default' merely means that the parameter has a default argument and it is therefore optional to provide one. It says nothing about whether the argument can be identified by the parameter name or only by position.
msg112007 - (view) Author: Łukasz Langa (lukasz.langa) (Python committer) Date: 2010-07-29 21:06
> For Python functions, all arguments can be keyword arguments. For many builtins, none can be (and maybe there should be a note to that effect at the top of the built-in functions chapter).

+1

Many Python users don't really grasp how builtins are different from regular functions. A note of this kind would be most helpful. Should I provide a patch for Doc/ ?
msg112009 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2010-07-29 21:12
Please do. This issue comes up often enough on python-list. Make it a separate doc issue, which will be auto-assigned to docs@python and add me as nosy.
msg120088 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2010-10-31 21:30
Fixed in r86066, r86068 and r86069.
History
Date User Action Args
2010-10-31 21:30:38rhettingersetstatus: open -> closed
resolution: fixed
messages: + msg120088

versions: - Python 2.6
2010-07-29 21:12:37terry.reedysetmessages: + msg112009
2010-07-29 21:06:04lukasz.langasetmessages: + msg112007
2010-07-29 21:00:34terry.reedysetmessages: + msg112004
2010-07-29 18:13:23lukasz.langasetnosy: + lukasz.langa
messages: + msg111983
2010-07-29 18:11:14lvogtsetfiles: + functions.rst.patch5.txt

messages: + msg111982
2010-07-29 18:00:37lvogtsetfiles: - functions.rst.patch4.txt
2010-07-29 14:37:45georg.brandlsetmessages: + msg111939
2010-07-28 18:35:56terry.reedysetmessages: + msg111835
2010-07-27 23:24:15ezio.melottisetmessages: + msg111748
2010-07-27 22:35:51lvogtsetfiles: + functions.rst.patch4.txt

messages: + msg111743
2010-07-26 19:48:50lvogtsetmessages: + msg111663
2010-07-25 23:44:24rhettingersetpriority: normal -> low
assignee: docs@python -> rhettinger
messages: + msg111579
2010-07-25 23:38:57terry.reedysetfiles: + functions.rst.3.patch

messages: + msg111576
2010-07-25 22:52:57belopolskysetmessages: + msg111568
2010-07-25 22:34:13terry.reedysetfiles: + functions.rst.2,patch

nosy: + docs@python
messages: + msg111567

assignee: rhettinger -> docs@python
stage: patch review -> commit review
2010-07-25 22:05:54georg.brandlsetmessages: + msg111562
2010-07-25 22:01:57terry.reedysetmessages: + msg111561
2010-07-25 19:22:19belopolskysetnosy: + belopolsky
messages: + msg111555
2010-07-25 19:17:44belopolskysetstage: needs patch -> patch review
2010-05-18 23:35:17lvogtsetfiles: + functions.rst.patch

nosy: + lvogt
messages: + msg106014

keywords: + patch
2009-12-07 23:38:46ezio.melottisetpriority: normal

nosy: + ezio.melotti
messages: + msg96087

stage: needs patch
2009-12-06 08:47:37rhettingersetassignee: georg.brandl -> rhettinger

nosy: + rhettinger
2009-12-06 00:50:38terry.reedycreate