classification
Title: Regression in overallocation for literal list initialization in v3.9+
Type: resource usage Stage: patch review
Components: Interpreter Core Versions: Python 3.10, Python 3.9
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Chad.Netzer, brandtbucher, corona10, gregory.p.smith, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2021-03-21 02:31 by Chad.Netzer, last changed 2021-04-11 21:39 by gregory.p.smith.

Pull Requests
URL Status Linked Edit
PR 24954 open Chad.Netzer, 2021-03-21 03:19
Messages (2)
msg389207 - (view) Author: Chad Netzer (Chad.Netzer) * Date: 2021-03-21 02:31
In Python v3.9+ there was a regression in the amount of used memory for
list-literals, due to switching to using list_extend() to allocate memory for
the new list to accomodate the literal elements.

Example, in Python v3.8.x (and before):
```
$ python38
Python 3.8.5 (default, Sep  4 2020, 02:22:02)
>>> [1].__sizeof__()
48
>>> [1,2].__sizeof__()
56
>>> [1,2,3].__sizeof__()
64
```

whereas for v3.9 (and later):
```
$ python39
Python 3.9.2 (default, Feb 19 2021, 17:09:53)
>>> [1].__sizeof__()
48
>>> [1,2].__sizeof__()
56
>>> [1,2,3].__sizeof__()
104  # a 60% increase in memory allocated
```

However, this seems like an unintented regression, and is a side-effect of the
new way of building the lists from literals, using the list_extend() function (via
list_resize(), which overallocates).  In particular, a consequence is that
making a copy of the list that's initialized from a literal can end up using
less memory:
```
$ python39
Python 3.9.2 (default, Feb 19 2021, 17:09:53)
>>> a = [1,2,3]
>>> b = list(a)  # Same behavior if list.copy() or slice copy is performed
>>> a.__sizeof__()
104
>>> b.__sizeof__()
64
```

Prior to v3.9, the byte-code for making a list from a literal had the
"BUILD_LIST" opcode with an explicit length argument, allowing allocation of
the exact amount of memory needed for the literal.  As of v3.9, the
LIST_EXTEND opcode is used, instead.  I believe the simplest way of restoring
the old behavior is to change list_extend() to not overallocate when the list
being extended currently has 0 elements.


Ie. a minimal-change patch to restore the previous behavior (though with a
side-effect of removing the overallocaton of a list that is initialzed empty,
and then immediately extended):

diff --git a/Objects/listobject.c b/Objects/listobject.c
index e7987a6d35..7820e033af 100644
--- a/Objects/listobject.c
+++ b/Objects/listobject.c
@@ -75,8 +75,9 @@ list_resize(PyListObject *self, Py_ssize_t newsize)
     if (newsize - Py_SIZE(self) > (Py_ssize_t)(new_allocated - newsize))
         new_allocated = ((size_t)newsize + 3) & ~(size_t)3;

-    if (newsize == 0)
-        new_allocated = 0;
+    /* Don't overallocate for lists that start empty or are set to empty. */
+    if (newsize == 0 || Py_SIZE(self) == 0)
+        new_allocated = newsize;
     num_allocated_bytes = new_allocated * sizeof(PyObject *);
     items = (PyObject **)PyMem_Realloc(self->ob_item, num_allocated_bytes);
     if (items == NULL) {



Relevant/related bugs/PRs:
# Switched to initializing list literals w/ LIST_EXTEND
https://bugs.python.org/issue39320
https://github.com/python/cpython/pull/17984

# Commit where over-allocation of list literals first appeared
https://bugs.python.org/issue38328
https://github.com/python/cpython/pull/17114
https://github.com/python/cpython/commit/6dd9b64770af8905bef293c81d541eaaf8d8df52

https://bugs.python.org/issue38373
https://github.com/python/cpython/pull/18952
https://github.com/python/cpython/commit/2fe815edd6778fb9deef8f8044848647659c2eb8
msg390174 - (view) Author: Chad Netzer (Chad.Netzer) * Date: 2021-04-04 04:44
For bpo-43574, my initial plan was to change list_resize() so that it wouldn't
overallocate empty lists that were resized to be bigger, thus restoring the
overallocation behavior for list-literals to be like that of earlier Python
releases.

However, the list_resize() helper is also used by list appends and inserts, as
well as other list methods that change list sizes.  This proposed strategy of
not-overallocating any resize that starts with an empty list also affects
their behavior, and with the list-overallocation calculation that was
introduced with bpo-38373, this causes unexpected side-effects on the amount
of overallocation used for appending/inserting with an empty list.

Therefore, my updated proposal handles this by changing the overallocation
behavior in list_resize() only when empty lists are increased by an amount
greater than 1 during list_resize().  This basically retains the behavior of
not overallocating list-literals, while not changing the behavior for list
append/insert on an empty list.

To understand why this updated proposed change won't also cause overallocation
for length-1 literals, see below how length-1 and length-2 list-literals are
compiled using BUILD_LIST instead of LIST_EXTEND:

```
Python 3.9.2 (default, Mar 26 2021, 23:27:12)
>>> import dis
>>> def foo():
...   [1]
...
>>> dis.dis(foo)
  2           0 LOAD_CONST               1 (1)
              2 BUILD_LIST               1
              4 POP_TOP
              6 LOAD_CONST               0 (None)
              8 RETURN_VALUE
>>> def bar():
...   [1,2]
...
>>> dis.dis(bar)
  2           0 LOAD_CONST               1 (1)
              2 LOAD_CONST               2 (2)
              4 BUILD_LIST               2
              6 POP_TOP
              8 LOAD_CONST               0 (None)
             10 RETURN_VALUE
>>> def baz():
...   [1,2,3]
...
>>> dis.dis(baz)
  2           0 BUILD_LIST               0
              2 LOAD_CONST               1 ((1, 2, 3))
              4 LIST_EXTEND              1
              6 POP_TOP
              8 LOAD_CONST               0 (None)
             10 RETURN_VALUE
```


Hence, the change to list_resize(), which is called by list_extend() but not
the BUILD_LIST opcode, won't impact list-literals of length 1 and 2.


And to show how the originally proposed change (no overallocation for
list_resize() of empty lists) inadvertently changed how list append/insert
overallocation worked, let's first take a look at how current Python (3.9+)
works:

```
>>> l = []
>>> l.__sizeof__()
40
>>> l.append("a")  # First append will overallocate to capacity 4
>>> l.__sizeof__()
72
>>> l.append("b")
>>> l.__sizeof__()
72
>>> l.append("c")
>>> l.__sizeof__()
72
>>> l.append("d")
>>> l.__sizeof__()
72
>>> l.append("e")
>>> l.__sizeof__()
104
>>> l2 = ["a"]  # Length 1 (and 2) literals don't overallocate
>>> l2.__sizeof__()
48
>>> # However, note that the first append will overallocate to capacity 8
>>> l2.append("b")
>>> l2.__sizeof__()
104
```

Note that the list-literal of length 1 isn't overallocated, and that
appending to it skips the capacity 4 list and goes straight to capacity 8.

However, with my originally proposed change, this overallocation behavior is
duplicated even for lists that start empty, because the first append then
effectively becomes a non-overallocated list of length and capacity 1, which
means that the second append overallocates to capacity 8.

Here was the overallocation behavior of my first proposal, when an empty list
was appended:
```
Python 3.10.0a6+ (heads/bpo43574-dont-overallocate-list-literals:7436223a71, Apr  3 2021, 16:32:22) [Clang 12.0.0 (clang-1200.0.32.29)] on darwin
>>> l = []
>>> l.__sizeof__()
40
>>> l.append("a")
>>> l.__sizeof__()  # No overallocation on first append
48
>>> l.append("b")
>>> l.__sizeof__()  # Second append jumps to capacity 8, skipping 4
104
```

This is unexpected and likely undesirable behavior, as lists being created
empty and then appended to should be expected to follow the same documented 4,
8, 16, 24, ...  growth pattern for bpo-38373.

Fixing this is fairly simple because of the special-casing for length 1 and 2
list-literals, just by checking for the use of append/insert on an empty list.
Because they both change the size of a length 0 list to length 1, and the
list_resize() for literals always kicks in for changing to lengths 3 or more,
this updates proposal will retain the current empty-list insert/append
overallocation behavior, while still allowing list-literals of length 1 or
more to not overallocate.

With this updated proposal, avoiding a change to the behavior of an empty list
append/insert, the overallocation for list-literals is still removed, and the
current overallocation behavior for empty lists being appended is preserved:
```
Python 3.10.0a6+ (heads/bpo43574-dont-overallocate-list-literals-v2:56b361221d, Apr  3 2021, 17:35) [Clang 12.0.0 (clang-1200.0.32.29)] on darwin
>>> l = []
>>> l.__sizeof__()
40
>>> l.append("a")
>>> l.__sizeof__()
72
>>> l.append("b")
>>> l.__sizeof__()
72
>>> l2 = ["a"]
>>> l2.__sizeof__()
48
>>> l2 = ["a", "b", "c"] # list_extend() literal setup case starts w/ 3 elements
>>> l2.__sizeof__()
64
```

So, with this variant on my original proposal, using the current compilation
behavior of setting up list-literals of length 3 or greater using
list_extend(), we regain the previous Python behavior of not overallocating
list-literals, and we also retain the current overallocation behavior for
empty lists that are appended to.

Just to document it, here are the functions in Objects/listobject.c that are
impacted by a change in list_resize(), and the proposed bpo-43574 change in
particular:

ins1()  # insert 1 element in list, including empty list.
app1()  # append 1 element to list.  Same impact as with ins1()
list_ass_slice()  # can assign a sequence to a list slice, expanding list
list_inplace_repeat()  # No impact, early exit for len == 0
list_extend()  # Now used for list-literals of len > 2
list_pop() # No impact, same behavior for list becoming len == 0
list_ass_subscript()  # No impact, list_resize() only called for deletion

The list_ass_slice() case is changed by this update, for an empty list, as
overallocation will not occur on the first list_extend() with more than 1
element (ie. just like list-literals, the overallocation will be delayed
until a future list_resize()).  Here's an example of the change, first for
current Python 3.9+ behavior:
```

Python 3.9.2 (default, Mar 26 2021, 23:27:12)
>>> l = []
>>> l.__sizeof__()
40
>>> l[:] = ["a"]
>>> l.__sizeof__()
72
>>> l = []
>>> l[:] = ["a", "b"]
>>> l.__sizeof__()
104
>>> l.append("c")
>>> l.__sizeof__()
104
```
and the behavior with my revised proposal for non-overallocation of
list-literals:
```
Python 3.10.0a6+ (heads/bpo43574-dont-overallocate-list-literals-v2:56b361221d, Apr  3 2021, 17:35) [Clang 12.0.0 (clang-1200.0.32.29)] on darwin
>>> l = []
>>> l.__sizeof__()
40
>>> l[:] = ["a"]
>>> l.__sizeof__()
72
>>> l = []
>>> l[:] = ["a", "b"]
>>> l.__sizeof__()
56
>>> l.append("c")
>>> l.__sizeof__()
104
```
History
Date User Action Args
2021-04-11 21:39:01gregory.p.smithsetnosy: + gregory.p.smith
2021-04-06 13:56:14vstinnersetnosy: - vstinner
2021-04-04 04:44:41Chad.Netzersetmessages: + msg390174
2021-03-21 05:58:10brandtbuchersetnosy: + brandtbucher
2021-03-21 03:19:58Chad.Netzersetkeywords: + patch
stage: patch review
pull_requests: + pull_request23712
2021-03-21 02:33:45corona10setnosy: + corona10
2021-03-21 02:33:40corona10setnosy: + vstinner, serhiy.storchaka
2021-03-21 02:31:54Chad.Netzercreate