Title: Optimize concatenating empty tuples
Type: performance Stage: resolved
Components: Interpreter Core Versions: Python 3.7
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: rhettinger, serhiy.storchaka, vstinner
Priority: normal Keywords:

Created on 2017-03-06 19:48 by serhiy.storchaka, last changed 2017-03-25 10:14 by vstinner. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 524 merged serhiy.storchaka, 2017-03-06 19:53
Messages (6)
msg289129 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-03-06 19:48
Since tuples are immutable, concatenating with empty tuple can be optimized by returning an opposite argument.

Microbenchmarks (the difference is larger for larger tuples):

$ ./python -m perf timeit --duplicate=100 -s 'a = (1, 2)' 'a + ()'
Unpatched:  Median +- std dev: 288 ns +- 12 ns
Patched:    Median +- std dev: 128 ns +- 5 ns

$ ./python -m perf timeit --duplicate=100 -s 'a = (1, 2)' '() + a'
Unpatched:  Median +- std dev: 285 ns +- 16 ns
Patched:    Median +- std dev: 128 ns +- 6 ns

Non-empty tuples are not affected:

$ ./python -m perf timeit --duplicate=100 -s 'a = (1, 2)' 'a + a'
Unpatched:  Median +- std dev: 321 ns +- 24 ns
Patched:    Median +- std dev: 317 ns +- 26 ns
msg289149 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2017-03-07 01:48
Is it at all common to concatenate empty tuples?  This seems like optimizing something that rarely occurs.

Also, the timings can be misleading if in real code these changes cause new branch mispredictions here which can slow the common case.  See
msg289150 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-03-07 05:39
I agree with your comments Raymond, but let me to expose my reasons.

An idea of optimizing empty tuple concatenating is inspired by partial(). It concatenates two tuples of arguments and it is common when one of tuple is empty. Current C implementation makes special case for this (and does this suboptimal). With optimized empty tuple concatenating it could be simpler and more efficient. Even when C implementation of partial() will not use this feature (see issue29735), I hope it can help in Python implementation of partial() and other partial-like functions.
msg290283 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-03-24 22:46
New changeset 98e80c2babac0003182c3482c6c5437ea111e795 by Serhiy Storchaka in branch 'master':
bpo-29737: Optimize concatenating with empty tuple. (#524)
msg290466 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-03-25 06:08
The PR already was merged when you wrote your comment Raymond. If you insist I can revert it. But 0f7b0b397e12514ee213bc727c9939b66585cbe2 depends on it.
msg290471 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-03-25 10:14
I like the change. It prevents to duplicate tuples and so reduce the memory
footprint. PGO compilation helps branch prediction anyway.
Date User Action Args
2017-03-25 10:14:50vstinnersetmessages: + msg290471
2017-03-25 06:08:35serhiy.storchakasetmessages: + msg290466
2017-03-25 05:53:51serhiy.storchakasetstatus: open -> closed
resolution: fixed
stage: patch review -> resolved
2017-03-24 22:46:55serhiy.storchakasetmessages: + msg290283
2017-03-07 05:39:58serhiy.storchakasetmessages: + msg289150
2017-03-07 01:48:19rhettingersetnosy: + rhettinger
messages: + msg289149
2017-03-06 19:53:00serhiy.storchakasetpull_requests: + pull_request432
2017-03-06 19:48:36serhiy.storchakacreate