Author Sergey
Recipients Sergey, jcea, serhiy.storchaka, terry.reedy
Date 2013-06-28.23:52:20
Message-id <1372463541.02.0.559209336606.issue18305@psf.upfronthosting.co.za>
Content
> The issue of quadratic performance of sum(sequences, null_seq) is known

I hope it's never too late to fix some bugs... :)

> sum([[1,2,3]]*n, []) == [1,2,3]*n == list(chain.from_iterable([[1,2,3]]*n))

But if you already have a list of lists and you need to join the lists together, only two of those options apply:
1. sum(list_of_lists, [])
2. list(chain.from_iterable(list_of_lists))
And using sum is much more obvious than using itertools, which most people may not (and don't have to) even know about.
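For reference, here is a minimal sketch of the two options side by side. The sample data is made up for illustration and is not taken from any real code:

```python
from itertools import chain

# Hypothetical sample data, only for illustration.
list_of_lists = [[1, 2, 3], [4, 5], [], [6]]

# Option 1: sum() with an empty list as the start value.
flat_sum = sum(list_of_lists, [])

# Option 2: chain the sublists lazily, then materialize them into a list.
flat_chain = list(chain.from_iterable(list_of_lists))

# Both spellings produce the same flat list.
print(flat_sum == flat_chain)
```

Both give the same result; the difference is only in how much copying happens along the way.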

When someone who is not a Python guru thinks about this, she would think: "So, I'll just add the lists together, let's write a for-loop... Oh, wait, that's what sum() does, it adds things, and Python is dynamically typed, so sum() should work for everything". That's how I was thinking, and that's how most people would think, I guess...
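To make the cost of that "obvious" for-loop visible, here is a rough pure-Python model. It is a sketch, not CPython's actual C code: the helper names sum_like and extend_like are hypothetical, and "copied" only counts list elements that had to be written, ignoring allocator details. The in-place variant is just one common way to avoid the recopying, not necessarily what the patch does.

```python
def sum_like(lists):
    """Model of sum(lists, []): every step builds a brand-new list."""
    result = []
    copied = 0
    for item in lists:
        result = result + item   # new list; all old elements are copied again
        copied += len(result)    # elements written into the new list
    return result, copied


def extend_like(lists):
    """Model of an in-place loop: only the new elements are copied."""
    result = []
    copied = 0
    for item in lists:
        result += item           # list.__iadd__ extends result in place
        copied += len(item)
    return result, copied


n = 100
data = [[1, 2, 3]] * n           # hypothetical input: n short lists

slow_result, slow_copied = sum_like(data)
fast_result, fast_copied = extend_like(data)

print(slow_result == fast_result)   # same answer either way
print(slow_copied, fast_copied)     # grows like n*n versus like n
```

With n lists of 3 elements, the first model writes 3 * n * (n + 1) / 2 elements and the second writes 3 * n, which is where the quadratic versus linear difference comes from.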

I was very surprised to find out about that bug.

> 1. People *will* move code that depends on the internal optimization to pythons that do not have it.

Looks like this bug is CPython-specific: others (Jython, IronPython...) don't have it, so people will be moving code that depends on the internal optimization to other Pythons that DO have it. :)

> 2. It discourages people from carefully thinking about whether they actually need a concrete list or merely the iterator for a virtual list.

Hm... Currently people can also pass an iterator to sum() or a list to itertools. Nothing changes...
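A small sketch of that point, with made-up data: sum() happily consumes an iterator, and chain.from_iterable() happily consumes a concrete list, so the choice of function does not force the choice of input.

```python
from itertools import chain

# sum() consuming an iterator (a generator expression), not a list.
pairs = ([i, i * 10] for i in range(3))
joined = sum(pairs, [])

# chain.from_iterable() consuming a concrete list; the result stays lazy
# until something iterates over it.
lazy = chain.from_iterable([[0, 0], [1, 10], [2, 20]])
first_two = [next(lazy), next(lazy)]   # pull two items on demand
rest = list(lazy)                      # materialize whatever is left

print(joined)
print(first_two, rest)
```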

> I agree with Terry. CPython deliberately disallow use sum() with lists of strings.

Isn't it exactly because of this bug? I mean, if this bug gets fixed, sum would be as fast as join, or maybe even faster, right? So the string restriction could be dropped later. But that would be a separate bug report. Anyway, the bug is there not just for strings: it also happens for lists, and for any other non-numeric objects that can be added.
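For anyone who has not run into the restriction, here is a short sketch of the current CPython behavior as I understand it: strings are rejected with a TypeError, while lists go through without complaint. The sample values are made up.

```python
words = ["a", "b", "c"]   # hypothetical sample data

# CPython refuses a str start value and points the user at join().
try:
    sum(words, "")
    sum_error = None
except TypeError as exc:
    sum_error = str(exc)

# The recommended spelling for strings.
joined = "".join(words)

# Lists are not rejected, even though they are added the same repeated way.
nested = sum([["a"], ["b"], ["c"]], [])

print(sum_error)
print(joined, nested)
```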

PS: I was prepared for my patch not being accepted, and I'm actually thinking about another way of doing it (I just don't know how to get a copy of an arbitrary PyObject in C yet). But I thought that the idea itself was great: finally making sum() fast without any trade-offs, what could be better? The patch works at least for 2.7, 3.3, and hg-tip, and can easily be ported to any other version. I did not expect to get such a cold shoulder. :(