classification
Title: class body bytecode uses less efficient *_NAME opcodes
Type: performance Stage: needs patch
Components: Interpreter Core Versions: Python 3.10
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: brandtbucher, gregory.p.smith, serhiy.storchaka, twouters
Priority: normal Keywords:

Created on 2020-10-28 18:17 by gregory.p.smith, last changed 2020-10-28 22:05 by serhiy.storchaka.

Messages (5)
msg379839 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2020-10-28 18:17
The opcodes generated for the closure defining a class body looks like they might be suboptimal.  It seems stuck using the generic LOAD_NAME and STORE_NAME opcodes rather than the LOAD_GLOBAL and STORE_FAST and friends as one would expect and as happens within a normal function closure.

If true and the normal optimization pass (which I believe is done by `compiler.c` `compiler_nameop()` if i'm understanding this area of the code I don't know very well correctly...?) is not happening, it _appears_ maybe to be due to the `PyST_GetScope` call not currently having a way to represent this situation rather than being in a normal function?

I tried searching for an open issue on this but was at a loss of what to search for.  semi-related concepts might be found in:
 https://bugs.python.org/issue34750 - "locals().update doesn't work in Enum body, even though direct assignment to locals() does"
 https://bugs.python.org/issue10399 - "AST Optimization: inlining of function calls"
 https://bugs.python.org/issue9226 - "errors creating classes inside a closure"

None of those is really the same though.  if there are other filed issues regarding this, feel free to link to them in comments.

If this can be improved as it has been for function bodies already, it should be a measurable improvement to CPython startup time and/or import time.  Especially on larger programs and things with a lot of generated code full of classes.

```
>>> import dis
>>> klass_def = '''
... class Klassy:
...   pinky = 'brain'
... def Funktion(castle_argh):
...   __module__ = __name__
...   __qualname__ = 'not Klassy'
...   pinky = 'brain'
... '''
>>> dis.dis(compile(klass_def, '<Hi mom!>', 'exec'))
  2           0 LOAD_BUILD_CLASS
              2 LOAD_CONST               0 (<code object Klassy at 0x7f62f4f6e3a0, file "<Hi mom!>", line 2>)
              4 LOAD_CONST               1 ('Klassy')
              6 MAKE_FUNCTION            0
              8 LOAD_CONST               1 ('Klassy')
             10 CALL_FUNCTION            2
             12 STORE_NAME               0 (Klassy)

  4          14 LOAD_CONST               2 (<code object Funktion at 0x7f62f4f7fa80, file "<Hi mom!>", line 4>)
             16 LOAD_CONST               3 ('Funktion')
             18 MAKE_FUNCTION            0
             20 STORE_NAME               1 (Funktion)
             22 LOAD_CONST               4 (None)
             24 RETURN_VALUE

Disassembly of <code object Klassy at 0x7f62f4f6e3a0, file "<Hi mom!>", line 2>:
  2           0 LOAD_NAME                0 (__name__)
              2 STORE_NAME               1 (__module__)
              4 LOAD_CONST               0 ('Klassy')
              6 STORE_NAME               2 (__qualname__)

  3           8 LOAD_CONST               1 ('brain')
             10 STORE_NAME               3 (pinky)
             12 LOAD_CONST               2 (None)
             14 RETURN_VALUE

Disassembly of <code object Funktion at 0x7f62f4f7fa80, file "<Hi mom!>", line 4>:
  5           0 LOAD_GLOBAL              0 (__name__)
              2 STORE_FAST               1 (__module__)

  6           4 LOAD_CONST               1 ('not Klassy')
              6 STORE_FAST               2 (__qualname__)

  7           8 LOAD_CONST               2 ('brain')
             10 STORE_FAST               3 (pinky)
             12 LOAD_CONST               0 (None)
             14 RETURN_VALUE
```
msg379840 - (view) Author: Brandt Bucher (brandtbucher) * (Python committer) Date: 2020-10-28 19:58
Hm, I believe it is related to the reason why we need to use LOAD_CLASSDEREF instead of LOAD_DEREF with nonlocal names in class scope. If I understand correctly, the purpose is to keep nonlocal statements in methods from referencing class-level names.

From https://docs.python.org/3/reference/executionmodel.html#resolution-of-names:

> Class definition blocks and arguments to exec() and eval() are special in the context of name resolution. A class definition is an executable statement that may use and define names. These references follow the normal rules for name resolution with an exception that unbound local variables are looked up in the global namespace. The namespace of the class definition becomes the attribute dictionary of the class. The scope of names defined in a class block is limited to the class block; it does not extend to the code blocks of methods – this includes comprehensions and generator expressions since they are implemented using a function scope. This means that the following will fail:
> 
> class A:
>     a = 42
>     b = list(a + i for i in range(10))
msg379841 - (view) Author: Brandt Bucher (brandtbucher) * (Python committer) Date: 2020-10-28 20:08
Actually, that doesn't make much sense in this context (more relevant would be a class-within-a-class or class-within-a-function).

I need to think about this more...
msg379842 - (view) Author: Brandt Bucher (brandtbucher) * (Python committer) Date: 2020-10-28 20:35
In any case, I think the proposed change could break the current behavior:

>>> x = "global"
>>> class C:
...     x = "local"
...     l = x
...     del x
...     g = x
... 
>>> C.l
'local'
>>> C.g
'global'
msg379845 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-10-28 22:05
It cannot use LOAD_GLOBAL because the name can be not global. For example:

    def f(self): ...
    f = property(f)

It cannot use STORE_FAST and LOAD_FAST because they work with specific representation of locals as array, but in a class body it can be arbitrary mapping (returned by __prepare__()).

Also, executing bytecode takes only minor part of the class creation time, so optimization of LOAD_NAME and STORE_NAME is preliminary.
History
Date User Action Args
2020-10-28 22:05:11serhiy.storchakasetnosy: + serhiy.storchaka
messages: + msg379845
2020-10-28 20:35:09brandtbuchersetmessages: + msg379842
2020-10-28 20:08:48brandtbuchersetmessages: + msg379841
2020-10-28 19:58:26brandtbuchersetmessages: + msg379840
2020-10-28 19:17:39brandtbuchersetnosy: + brandtbucher
2020-10-28 18:17:34gregory.p.smithcreate