This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author ncoghlan
Recipients Aaron Hall, anthony shaw, methane, ncoghlan, pablogsal, ronaldoussoren, serhiy.storchaka
Date 2019-04-08.13:50:08
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1554731409.24.0.21762158139.issue36551@roundup.psfhosted.org>
In-reply-to
Content
I was going to note that the algorithm Anthony has pursued here is the same one we already use for the list constructor and list.extend(), but Inada-san already pointed that out :)

While length_hint is allowed to be somewhat inaccurate, we do expect it to be at least *vaguely* accurate (otherwise it isn't very useful, and if it can be inaccurate enough to trigger OverflowError or MemoryError in cases that would otherwise work reasonably well, it would be better for a type not to implement it at all).

While it would be nice to be able to avoid adding a new opcode, the problem is that the existing candidate opcodes (BUILD_LIST, BUILD_LIST_UNPACK) are both inflexible in what they do:

- BUILD_LIST requires that the final list length be known at compile time
- BUILD_LIST_UNPACK infers the final length from an object reference, but calls _PyList_Extend directly, so it combines the preallocation and the iterator consumption into a single step, and hence can't be used outside of iterable unpacking into a list

At the same time, attempting to generalise either of them isn't desirable, since it would slow them down for their existing use cases, and be slower than a new opcode for this use case.

The proposed BUILD_LIST_PREALLOC opcode splits the difference: it lets the compiler provide the interpreter with a *hint* as to how big the resulting list is expected to be.

That said, you'd want to run the result through the benchmark suite rather than relying solely on microbenchmarks, as even though unfiltered "[some_operation_on_x for x in y]" comprehensions without nested loops or filter clauses are pretty common (more common than the relatively new "[*itr]" syntax), it's far less clear what the typical distribution in input lengths actually is, and how many memory allocations need to be avoided in order to offset the cost of the initial _PyObject_LengthHint call (as Pablo's small scale results show).

(Note that in the _PyList_Extend code, there are preceding special cases for builtin lists and tuples that take those down a much faster path that avoids the _PyObject_LengthHint call entirely)
History
Date User Action Args
2019-04-08 13:50:09ncoghlansetrecipients: + ncoghlan, ronaldoussoren, methane, serhiy.storchaka, Aaron Hall, pablogsal, anthony shaw
2019-04-08 13:50:09ncoghlansetmessageid: <1554731409.24.0.21762158139.issue36551@roundup.psfhosted.org>
2019-04-08 13:50:09ncoghlanlinkissue36551 messages
2019-04-08 13:50:08ncoghlancreate