This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author anthony shaw
Recipients anthony shaw, ncoghlan
Date 2019-04-08.01:28:21
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1554686901.99.0.0374079193358.issue36551@roundup.psfhosted.org>
In-reply-to
Content
List comprehensions currently create a series of opcodes inside a code object, the first of which is BUILD_LIST with an oparg of 0, effectively creating a zero-length list with a preallocated size of 0.

If you're doing a simple list comprehension on an iterator, e.g.

def foo():
  a = iterable
  return [x for x in a]

Disassembly of <code object <listcomp> at 0x109db2c40, file "<stdin>", line 3>:
  3           0 BUILD_LIST               0
              2 LOAD_FAST                0 (.0)
        >>    4 FOR_ITER                 8 (to 14)
              6 STORE_FAST               1 (x)
              8 LOAD_FAST                1 (x)
             10 LIST_APPEND              2
             12 JUMP_ABSOLUTE            4
        >>   14 RETURN_VALUE

The list comprehension will do a list_resize on the 4, 8, 16, 25, 35, 46, 58, 72, 88th iterations, etc.

This PR preallocates the list created in a list comprehension to the length of the iterator using PyObject_LengthHint().

It uses a new BUILD_LIST_PREALLOC opcode which builds a list with the allocated size of PyObject_LengthHint(co_varnames[oparg]).

[x for x in iterable] compiles to:

Disassembly of <code object <listcomp> at 0x109db2c40, file "<stdin>", line 3>:
  3           0 BUILD_LIST_PREALLOC      0
              2 LOAD_FAST                0 (.0)
        >>    4 FOR_ITER                 8 (to 14)
              6 STORE_FAST               1 (x)
              8 LOAD_FAST                1 (x)
             10 LIST_APPEND              2
             12 JUMP_ABSOLUTE            4
        >>   14 RETURN_VALUE

If the comprehension has ifs, then it will use the existing BUILD_LIST opcode

Testing using a range length of 10000

./python.exe -m timeit "x=list(range(10000)); [y for y in x]"

Gives 392us on the current 3.8 branch
and 372us with this change (about 8-10% faster)

the longer the iterable, the bigger the impact.

This change also catches the issue that a very large iterator, like a range object :
[a for a in range(2**256)]

Would cause the 3.8< interpreter to consume all memory and crash because there is no check against PY_SSIZE_MAX currently.

With this change (assuming there is no if inside the comprehension) is now caught and thrown as an OverflowError:

>>> [a for a in range(2**256)]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in <listcomp>
OverflowError: Python int too large to convert to C ssize_t
History
Date User Action Args
2019-04-08 01:28:22anthony shawsetrecipients: + anthony shaw, ncoghlan
2019-04-08 01:28:21anthony shawsetmessageid: <1554686901.99.0.0374079193358.issue36551@roundup.psfhosted.org>
2019-04-08 01:28:21anthony shawlinkissue36551 messages
2019-04-08 01:28:21anthony shawcreate