
Author BTaskaya
Recipients BTaskaya, Mark.Shannon, pablogsal, serhiy.storchaka
Date 2021-06-23.18:13:41
Content
It is a common scenario to make calls with only constant arguments (e.g. to datetime.datetime, os.path.join, re.match.group, nox.session.run, etc.), and the bytecode we currently generate for such a call looks like this:
f(1,2,3,4,5,6)
  1           0 LOAD_NAME                0 (f)
              2 LOAD_CONST               0 (1)
              4 LOAD_CONST               1 (2)
              6 LOAD_CONST               2 (3)
              8 LOAD_CONST               3 (4)
             10 LOAD_CONST               4 (5)
             12 LOAD_CONST               5 (6)
             14 CALL_FUNCTION            6
             16 POP_TOP
             18 LOAD_CONST               6 (None)
             20 RETURN_VALUE
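
(For reference, the dump above can be reproduced with the dis module on a CPython of this era, i.e. one that still has CALL_FUNCTION:

import dis

dis.dis(compile("f(1, 2, 3, 4, 5, 6)", "<example>", "exec"))
)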

But if we are sure that all arguments to a function are positional* and constant, then we could simply pack everything together into a single constant tuple and use CALL_FUNCTION_EX. (*It is also possible to support keyword arguments to some extent; that needs more research and is out of scope for this particular optimization.) We also need to set some limits on the number of arguments N: when N is too small the packing might not be worth it and might prevent constant caching, and when it is too high it might create giant tuples in the code object; perhaps 75 > N > 4.

  1           0 LOAD_NAME                0 (f)
              2 LOAD_CONST               0 ((1, 2, 3, 4, 5, 6))
              4 CALL_FUNCTION_EX         0
              6 POP_TOP
              8 LOAD_CONST               1 (None)
             10 RETURN_VALUE
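
Note that, as far as I can tell, this rewritten form is essentially what the compiler already emits today if the unpacking is written out by hand: the tuple literal is constant-folded and a single starred argument goes straight to CALL_FUNCTION_EX (on 3.9/3.10-ish interpreters):

import dis

dis.dis(compile("f(*(1, 2, 3, 4, 5, 6))", "<example>", "exec"))

So the proposal boils down to teaching the AST optimizer to perform that rewrite automatically for eligible calls.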

The implementation is also very simple and doesn't touch anything besides the AST optimizer itself. It would be possible to do this in the compiler instead, but that might complicate its logic, so I'd say it is best to keep the change as isolated as it can be.
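
The real change would live in the C-level AST optimizer (ast_opt.c); this is not the actual patch, but a rough Python-level sketch of the rewrite, with the hypothetical limits mentioned above, would look roughly like this:

import ast
import dis

class FoldConstantCalls(ast.NodeTransformer):
    # Rewrites f(c1, c2, ...) with only constant positional arguments into
    # f(*(c1, c2, ...)); the compiler then emits one LOAD_CONST for the
    # packed tuple followed by CALL_FUNCTION_EX.

    MIN_ARGS, MAX_ARGS = 4, 75  # hypothetical limits, mirroring 75 > N > 4

    def visit_Call(self, node):
        self.generic_visit(node)
        if node.keywords:  # keyword arguments are out of scope here
            return node
        if not (self.MIN_ARGS < len(node.args) < self.MAX_ARGS):
            return node
        if not all(isinstance(arg, ast.Constant) for arg in node.args):
            return node
        packed = ast.Constant(value=tuple(arg.value for arg in node.args))
        node.args = [ast.Starred(value=packed, ctx=ast.Load())]
        return node

tree = ast.parse("f(1, 2, 3, 4, 5, 6)")
tree = ast.fix_missing_locations(FoldConstantCalls().visit(tree))
dis.dis(compile(tree, "<example>", "exec"))

Running it shows the same LOAD_CONST-of-a-tuple / CALL_FUNCTION_EX pair as in the optimized dump above.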

(benchmarks below were run on debug builds)

-s 'foo = lambda *args: None' 'foo("yyyyy", 123, 123321321312, (1,2,3), "yyyyy", 1.0, (1,2,3), "yyyyy", "yyyyy", (1,2,3), 5, 6, 7)'
Mean +- std dev: [master_artificial] 251 ns +- 2 ns -> [optimized_artificial] 185 ns +- 1 ns: 1.36x faster

-s 'from datetime import datetime' 'datetime(1997, 7, 27, 12, 10, 0, 0)'
Mean +- std dev: [master_datetime] 461 ns +- 1 ns -> [optimized_datetime] 386 ns +- 2 ns: 1.19x faster
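
(The numbers above are pyperf comparisons; a sketch of how one could reproduce the micro-benchmarks on each build with pyperf's Python API, assuming pyperf is installed and with benchmark names made up here:

import pyperf

runner = pyperf.Runner()
runner.timeit(
    name="call with constant args (lambda)",
    stmt='foo("yyyyy", 123, 123321321312, (1, 2, 3), "yyyyy", 1.0,'
         ' (1, 2, 3), "yyyyy", "yyyyy", (1, 2, 3), 5, 6, 7)',
    setup="foo = lambda *args: None",
)
runner.timeit(
    name="call with constant args (datetime)",
    stmt="datetime(1997, 7, 27, 12, 10, 0, 0)",
    setup="from datetime import datetime",
)

Run the script once under the baseline interpreter and once under the patched one, writing results to JSON with -o, then compare them with python -m pyperf compare_to.)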

One other potential candidate for this optimization is doing something similar in the CFG optimizer: folding all contiguous LOAD_CONSTs (within some limit, of course) into a single tuple load followed by an UNPACK_SEQUENCE that replicates the original stack effect. This is a poorer form: I was only able to observe speedups of 1.13x / 1.03x respectively on the benchmarks above. The good things about that variant were that it could handle mixed arguments (if other kinds of expressions appear alongside the constants, but the constants follow each other, that run could still be optimized), and that it wasn't limited to calls but applied to every place where the compiler generates blocks of LOAD_CONSTs.