Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Streamline list.append for the common case #91165

Closed
sweeneyde opened this issue Mar 14, 2022 · 8 comments
Closed

Streamline list.append for the common case #91165

sweeneyde opened this issue Mar 14, 2022 · 8 comments
Labels
3.11 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage

Comments

@sweeneyde
Copy link
Member

BPO 47009
Nosy @tiran, @methane, @markshannon, @sweeneyde
PRs
  • bpo-47009: Streamline list.append for the common case #31864
  • bpo-47009: Let PRECALL_NO_KW_LIST_APPEND do its own POP_TOP #32239
  • bpo-47009: Fix assert on big endian (GH-32332) #32332
  • Files
  • _PyList_AppendTakeRef.diff
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2022-04-06.03:19:51.278>
    created_at = <Date 2022-03-14.04:57:38.385>
    labels = ['interpreter-core', '3.11', 'performance']
    title = 'Streamline list.append for the common case'
    updated_at = <Date 2022-04-06.03:19:51.277>
    user = 'https://github.com/sweeneyde'

    bugs.python.org fields:

    activity = <Date 2022-04-06.03:19:51.277>
    actor = 'Dennis Sweeney'
    assignee = 'none'
    closed = True
    closed_date = <Date 2022-04-06.03:19:51.278>
    closer = 'Dennis Sweeney'
    components = ['Interpreter Core']
    creation = <Date 2022-03-14.04:57:38.385>
    creator = 'Dennis Sweeney'
    dependencies = []
    files = ['50674']
    hgrepos = []
    issue_num = 47009
    keywords = ['patch']
    message_count = 8.0
    messages = ['415116', '415121', '415127', '415327', '416481', '416766', '416786', '416841']
    nosy_count = 4.0
    nosy_names = ['christian.heimes', 'methane', 'Mark.Shannon', 'Dennis Sweeney']
    pr_nums = ['31864', '32239', '32332']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'performance'
    url = 'https://bugs.python.org/issue47009'
    versions = ['Python 3.11']

    @sweeneyde
    Copy link
    Member Author

    list_resize is a long function that probably won't get inlined. But for the vast majority of cases in list.append, we just need to check whether the list is big enough (not whether it's small enough, or whether it's null or the wrong type), then insert and update the size. This can be inlined, with an actual call only taking place whenever we need to resize.

    We can also add a reference-consuming version of PyList_Append to elide an INCREF/DECREF pair.

    @sweeneyde sweeneyde added 3.11 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage labels Mar 14, 2022
    @methane
    Copy link
    Member

    methane commented Mar 14, 2022

    Hmm. Would you measure benefit from inlining and skipping incref/decref separately?

    If benefit of inlining is very small, making _PyList_AppendTakeRef() as regular internal API looks better to me.

    @sweeneyde
    Copy link
    Member Author

    The attached _PyList_AppendTakeRef.diff has the ceval.c, but this implementation:

    int
    _PyList_AppendTakeRef(PyListObject *self, PyObject *newitem)
    {
        assert(self != NULL && newitem != NULL);
        assert(PyList_Check(self));
        Py_ssize_t len = PyList_GET_SIZE(self);
        Py_ssize_t allocated = self->allocated;
        assert((size_t)len + 1 < PY_SSIZE_T_MAX);
        if (allocated > len) {
            PyList_SET_ITEM(self, len, newitem);
            Py_SET_SIZE(self, len + 1);
            return 0;
        }
        if (list_resize(self, len + 1) < 0) {
            Py_DECREF(newitem);
            return -1;
        }
        PyList_SET_ITEM(self, len, newitem);
        return 0;
    }

    Results:

    Benchmark main PR 31864 _PyList_AppendTakeRef.diff
    listcomp 100 1.61 us 1.33 us: 1.21x faster 1.55 us: 1.04x faster
    append 100 2.11 us 1.82 us: 1.15x faster 2.05 us: 1.03x faster
    listcomp 1000 12.6 us 9.83 us: 1.28x faster 11.9 us: 1.06x faster
    append 1000 18.1 us 15.3 us: 1.18x faster 17.6 us: 1.03x faster
    listcomp 10000 121 us 93.2 us: 1.29x faster 114 us: 1.06x faster
    append 10000 175 us 150 us: 1.17x faster 172 us: 1.02x faster
    listcomp 100000 1.17 ms 923 us: 1.26x faster 1.15 ms: 1.02x faster
    append 100000 1.70 ms 1.49 ms: 1.14x faster not significant
    Geometric mean (ref) 1.21x faster 1.03x faster

    @methane
    Copy link
    Member

    methane commented Mar 16, 2022

    Thank you. I agree that inlining is worth enough.

    But we already inlined too many functions in ceval and there is an issue caused by it... (bpo-45116)

    @markshannon
    Copy link
    Member

    New changeset a0ea7a1 by Dennis Sweeney in branch 'main':
    bpo-47009: Streamline list.append for the common case (GH-31864)
    a0ea7a1

    @markshannon
    Copy link
    Member

    New changeset 6c6e040 by Dennis Sweeney in branch 'main':
    bpo-47009: Let PRECALL_NO_KW_LIST_APPEND do its own POP_TOP (GH-32239)
    6c6e040

    @tiran
    Copy link
    Member

    tiran commented Apr 5, 2022

    New changeset 9e88b57 by Christian Heimes in branch 'main':
    bpo-47009: Fix assert on big endian (GH-32332)
    9e88b57

    @sweeneyde
    Copy link
    Member Author

    Buildbots are passing, so I'm closing this. Thanks for the catch and fix!

    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.11 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage
    Projects
    None yet
    Development

    No branches or pull requests

    4 participants