Author Chad.Netzer
Recipients Chad.Netzer
Date 2021-03-21.02:31:54
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1616293914.75.0.599644741052.issue43574@roundup.psfhosted.org>
In-reply-to
Content
In Python v3.9+ there was a regression in the amount of used memory for
list-literals, due to switching to using list_extend() to allocate memory for
the new list to accomodate the literal elements.

Example, in Python v3.8.x (and before):
```
$ python38
Python 3.8.5 (default, Sep  4 2020, 02:22:02)
>>> [1].__sizeof__()
48
>>> [1,2].__sizeof__()
56
>>> [1,2,3].__sizeof__()
64
```

whereas for v3.9 (and later):
```
$ python39
Python 3.9.2 (default, Feb 19 2021, 17:09:53)
>>> [1].__sizeof__()
48
>>> [1,2].__sizeof__()
56
>>> [1,2,3].__sizeof__()
104  # a 60% increase in memory allocated
```

However, this seems like an unintented regression, and is a side-effect of the
new way of building the lists from literals, using the list_extend() function (via
list_resize(), which overallocates).  In particular, a consequence is that
making a copy of the list that's initialized from a literal can end up using
less memory:
```
$ python39
Python 3.9.2 (default, Feb 19 2021, 17:09:53)
>>> a = [1,2,3]
>>> b = list(a)  # Same behavior if list.copy() or slice copy is performed
>>> a.__sizeof__()
104
>>> b.__sizeof__()
64
```

Prior to v3.9, the byte-code for making a list from a literal had the
"BUILD_LIST" opcode with an explicit length argument, allowing allocation of
the exact amount of memory needed for the literal.  As of v3.9, the
LIST_EXTEND opcode is used, instead.  I believe the simplest way of restoring
the old behavior is to change list_extend() to not overallocate when the list
being extended currently has 0 elements.


Ie. a minimal-change patch to restore the previous behavior (though with a
side-effect of removing the overallocaton of a list that is initialzed empty,
and then immediately extended):

diff --git a/Objects/listobject.c b/Objects/listobject.c
index e7987a6d35..7820e033af 100644
--- a/Objects/listobject.c
+++ b/Objects/listobject.c
@@ -75,8 +75,9 @@ list_resize(PyListObject *self, Py_ssize_t newsize)
     if (newsize - Py_SIZE(self) > (Py_ssize_t)(new_allocated - newsize))
         new_allocated = ((size_t)newsize + 3) & ~(size_t)3;

-    if (newsize == 0)
-        new_allocated = 0;
+    /* Don't overallocate for lists that start empty or are set to empty. */
+    if (newsize == 0 || Py_SIZE(self) == 0)
+        new_allocated = newsize;
     num_allocated_bytes = new_allocated * sizeof(PyObject *);
     items = (PyObject **)PyMem_Realloc(self->ob_item, num_allocated_bytes);
     if (items == NULL) {



Relevant/related bugs/PRs:
# Switched to initializing list literals w/ LIST_EXTEND
https://bugs.python.org/issue39320
https://github.com/python/cpython/pull/17984

# Commit where over-allocation of list literals first appeared
https://bugs.python.org/issue38328
https://github.com/python/cpython/pull/17114
https://github.com/python/cpython/commit/6dd9b64770af8905bef293c81d541eaaf8d8df52

https://bugs.python.org/issue38373
https://github.com/python/cpython/pull/18952
https://github.com/python/cpython/commit/2fe815edd6778fb9deef8f8044848647659c2eb8
History
Date User Action Args
2021-03-21 02:31:54Chad.Netzersetrecipients: + Chad.Netzer
2021-03-21 02:31:54Chad.Netzersetmessageid: <1616293914.75.0.599644741052.issue43574@roundup.psfhosted.org>
2021-03-21 02:31:54Chad.Netzerlinkissue43574 messages
2021-03-21 02:31:54Chad.Netzercreate