classification
Title: Unpacking of literals inside other literals should be optimised away by the compiler
Type: performance
Stage: resolved
Components: Interpreter Core
Versions:
process
Status: closed
Resolution: rejected
Dependencies:
Superseder:
Assigned To:
Nosy List: BTaskaya, Mark.Shannon, pablogsal, pxeger, serhiy.storchaka
Priority: normal
Keywords: patch

Created on 2020-12-27 07:56 by pxeger, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Pull Requests
URL Status Linked
PR 23979 closed pablogsal, 2020-12-28 21:30
Messages (5)
msg383837 - Author: Patrick Reader (pxeger) * Date: 2020-12-27 07:56
When unpacking a collection or string literal inside another literal, the compiler should optimise the unpacking away and store the resultant collection simply as another constant tuple, so that `[*'123', '4', '5']` is the exact same as `['1', '2', '3', '4', '5']`.

Compare:

```
>>> import dis
>>> dis.dis("[*'123', '4', '5']")
  1           0 BUILD_LIST               0
              2 BUILD_LIST               0
              4 LOAD_CONST               0 ('123')
              6 LIST_EXTEND              1
              8 LIST_EXTEND              1
             10 LOAD_CONST               1 ('4')
             12 LIST_APPEND              1
             14 LOAD_CONST               2 ('5')
             16 LIST_APPEND              1
```

vs.

```
>>> dis.dis("['1', '2', '3', '4', '5']")
  1           0 BUILD_LIST               0
              2 LOAD_CONST               0 (('1', '2', '3', '4', '5'))
              4 LIST_EXTEND              1
```

and `timeit` shows the latter to be over 3 times as fast.
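
The timing claim can be checked with a minimal sketch like the one below. The helper name `best_time` and the iteration counts are assumptions made for illustration, not part of the original report, and the measured ratio will vary with the interpreter version and machine.

```python
import timeit

# Two spellings of the same five-element list, as in the report.
unpacked = "[*'123', '4', '5']"
literal = "['1', '2', '3', '4', '5']"

# Both expressions must build equal lists for the comparison to be fair.
assert eval(unpacked) == eval(literal)


def best_time(stmt, number=100_000, repeat=5):
    """Return the best total time in seconds over several repeats (hypothetical helper)."""
    return min(timeit.repeat(stmt, number=number, repeat=repeat))


t_unpacked = best_time(unpacked)
t_literal = best_time(literal)

print(f"unpacked: {t_unpacked:.4f}s  literal: {t_literal:.4f}s")
print(f"ratio: {t_unpacked / t_literal:.2f}x")
```

Taking the minimum over several repeats reduces noise from other processes; no particular ratio is asserted here because it depends on the build being measured.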

For example, when generating a list of characters, it is easier and more readable to do `alphabet = [*"abcde"]` instead of `alphabet = ["a", "b", "c", "d", "e"]`. The programmer can do what is most obvious without worrying about performance, because the compiler can do it itself.
msg383912 - Author: Batuhan Taskaya (BTaskaya) * (Python committer) Date: 2020-12-28 18:34
We could possibly fold this in the AST optimizer, though I am not sure whether it is worth anything as an optimization, since it is a really obscure pattern. I've found only 2 occurrences (both from the test suite of black) in a relatively big dataset of Python source code.
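
To make the idea of folding at the AST level concrete, here is a rough Python-level sketch. It is not the CPython AST optimizer (which is written in C) and not the code from the draft PR; the class name `FoldStarredStrings` and the helper `fold` are hypothetical, and only string constants unpacked inside list displays are handled.

```python
import ast


class FoldStarredStrings(ast.NodeTransformer):
    """Hypothetical pass: expand *"abc" inside a list display into 'a', 'b', 'c'."""

    def visit_List(self, node):
        self.generic_visit(node)  # fold nested list displays first
        new_elts = []
        for elt in node.elts:
            is_starred_str = (
                isinstance(elt, ast.Starred)
                and isinstance(elt.value, ast.Constant)
                and isinstance(elt.value.value, str)
            )
            if is_starred_str:
                # Replace the starred string by one constant per character.
                for ch in elt.value.value:
                    new_elts.append(ast.copy_location(ast.Constant(value=ch), elt))
            else:
                new_elts.append(elt)
        node.elts = new_elts
        return node


def fold(source):
    """Parse an expression, apply the pass, and return the rewritten tree."""
    tree = FoldStarredStrings().visit(ast.parse(source, mode="eval"))
    return ast.fix_missing_locations(tree)


tree = fold("[*'123', '4', '5']")
folded_elts = [elt.value for elt in tree.body.elts]
result = eval(compile(tree, "<folded>", "eval"))
print(folded_elts, result)
```

After this rewrite every element is a plain constant, so the compiler can treat the display like the hand-written `['1', '2', '3', '4', '5']`. A starred name such as `*z` is left untouched, because its value is not known at compile time.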
msg383932 - Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2020-12-28 21:36
I have added a draft PR showing what the idea would look like, so we can discuss it with a specific proposal.
msg383934 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-12-28 21:50
I do not think there are serious reasons for adding this optimization. The Python compiler was intentionally made simple for maintainability. Constant folding supports only basic arithmetic and bitwise operations, because they are often used in constant expressions (like 2**32-1 or 1<<18), and indexing, because of b'A'[0]. Neither comparisons nor boolean operators with constants are optimized, because such expressions are uncommon and the maintenance cost outweighs the benefit.
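
One way to observe which expressions are folded is to look at the constants stored in the compiled code object. This is a small sketch; the helper name `folded_to` is hypothetical, and the results for the last two expressions are printed rather than asserted because the exact behaviour depends on the CPython version.

```python
def folded_to(source, value):
    """Return True if compiling `source` stores `value` as a constant (hypothetical helper).

    The type check avoids confusing True with the integer 1.
    """
    consts = compile(source, "<example>", "eval").co_consts
    return any(type(c) is type(value) and c == value for c in consts)


# Arithmetic and bitwise expressions mentioned in the message.
print("2**32-1 folded:", folded_to("2**32-1", 4294967295))
print("1<<18 folded:  ", folded_to("1<<18", 262144))

# Indexing a bytes literal, and a comparison between constants.
print("b'A'[0] folded:", folded_to("b'A'[0]", 65))
print("1 < 2 folded:  ", folded_to("1 < 2", True))
```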

This is a similar case. I am -1 for this optimization.
msg383936 - Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2020-12-28 23:19
> This is a similar case. I am -1 for this optimization.

Yeah, I concur. One of the major drawbacks of this is that the normal case where this happens involves a Load():

z = "something"
x = ["a", *z, "b"]

I see almost no reason to nest the literals in this case, especially in a context sensitive scope. I am closing the draft PR and the issue.
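
The distinction drawn here can be seen in the AST. In the sketch below (the helper name `starred_operands` is hypothetical, and Python 3.8 or later is assumed), the unpacked operand in the common case is a `Name` that must be loaded at run time, so there is nothing for the compiler to fold; only the rare literal case yields a `Constant`.

```python
import ast


def starred_operands(source):
    """Return the AST node type names of every starred operand in a list display."""
    tree = ast.parse(source, mode="eval")
    return [
        type(elt.value).__name__
        for elt in tree.body.elts
        if isinstance(elt, ast.Starred)
    ]


# Common case: unpacking a variable, whose value is unknown at compile time.
print(starred_operands('["a", *z, "b"]'))      # ['Name']

# Rare case targeted by the proposal: unpacking a literal.
print(starred_operands('["a", *"xyz", "b"]'))  # ['Constant']
```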

Thanks Patrick for the proposal, though!
History
Date User Action Args
2022-04-11 14:59:39 admin set github: 86920
2020-12-28 23:19:28 pablogsal set status: open -> closed; resolution: rejected; messages: + msg383936; stage: patch review -> resolved
2020-12-28 21:50:00 serhiy.storchaka set messages: + msg383934
2020-12-28 21:36:34 pablogsal set messages: + msg383932
2020-12-28 21:30:54 pablogsal set keywords: + patch; stage: patch review; pull_requests: + pull_request22824
2020-12-28 18:34:35 BTaskaya set messages: + msg383912
2020-12-28 18:33:57 BTaskaya set messages: - msg383911
2020-12-28 18:33:29 BTaskaya set nosy: + Mark.Shannon, serhiy.storchaka, pablogsal; messages: + msg383911
2020-12-28 17:35:57 BTaskaya set nosy: + BTaskaya
2020-12-27 07:56:46 pxeger set type: performance
2020-12-27 07:56:34 pxeger create