classification
Title: Optimize function annotation
Type: resource usage
Stage: resolved
Components: Interpreter Core
Versions: Python 3.10
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: Mark.Shannon, corona10, jstasiak, lukasz.langa, methane, serhiy.storchaka, uriyyo
Priority: normal
Keywords: patch

Created on 2020-10-30 06:47 by methane, last changed 2020-11-25 10:44 by methane. This issue is now closed.

Pull Requests
URL Status Linked
PR 23316 merged uriyyo, 2020-11-16 16:26
Messages (15)
msg379923 - (view) Author: Inada Naoki (methane) * (Python committer) Date: 2020-10-30 06:47
Look at this example:

```
# code
def foo(x: int, /, y, *, z: float) -> Hoge:
    pass

# dis
 2          12 LOAD_CONST               2 ('int')
            14 LOAD_CONST               3 ('float')
            16 LOAD_CONST               4 ('Hoge')
            18 LOAD_CONST               5 (('x', 'z', 'return'))
            20 BUILD_CONST_KEY_MAP      3
            22 LOAD_CONST               6 (<code object foo at ...>)
            24 LOAD_CONST               7 ('foo')
            26 MAKE_FUNCTION            4 (annotations)
            28 STORE_NAME               2 (foo) 
```

Four `LOAD_CONST` instructions and one `BUILD_CONST_KEY_MAP` are used to build the annotation dict. This makes the program load slower and use more memory.
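To reproduce a listing like the one above locally, here is a minimal sketch using the standard `dis` module. The `from __future__ import annotations` line is added on the assumption that it is needed to get string annotations on released interpreters. Opcode names and offsets differ between CPython versions, so the output will probably not match the 3.10 alpha listing exactly.

```python
import dis

# Source text from the example above. The future import asks the compiler
# to store annotations as strings, which is what the listing assumes.
SOURCE = (
    "from __future__ import annotations\n"
    "def foo(x: int, /, y, *, z: float) -> Hoge:\n"
    "    pass\n"
)

module_code = compile(SOURCE, "<annotation-example>", "exec")

# Opcode names emitted at module level for the `def` statement.
opnames = [ins.opname for ins in dis.get_instructions(module_code)]

if __name__ == "__main__":
    # Prints the full disassembly; exact contents depend on the interpreter.
    dis.dis(module_code)
```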

Annotation information can be stored in a more compact form, and creating the annotation dict can be postponed until `func.__annotations__` is accessed.

Ideas for the compact form:

1. Tuple.
   In the above example, `('int', None, 'float', 'Hoge')` can be used. None means no annotation for the 'y' parameter.

2. Serialize into str or bytes.
   A JSON-like format can be used, such as `x:int,z:float;Hoge`. It is compact, but the string/bytes has a lower chance of being shared with other constants in the same module.
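As an illustration of idea 1, the positional tuple could be expanded into a dict roughly as follows. This is only a sketch: `expand_positional` is a hypothetical helper, and it assumes the parameter names (plus `'return'`) are available separately, in order.

```python
from typing import Dict, Optional, Sequence


def expand_positional(names: Sequence[str],
                      compact: Sequence[Optional[str]]) -> Dict[str, str]:
    """Build an annotation dict from a positional tuple with None gaps."""
    if len(names) != len(compact):
        raise ValueError("names and compact must have the same length")
    # Skip slots that hold None: those parameters have no annotation.
    return {name: ann for name, ann in zip(names, compact) if ann is not None}


# Parameter names of foo(x, /, y, *, z), followed by the pseudo-name 'return'.
names = ("x", "y", "z", "return")
compact = ("int", None, "float", "Hoge")
annotations = expand_positional(names, compact)
```

Because the annotations are stored as strings, an explicit `-> None` annotation would appear as the string `'None'` and stay distinct from the `None` placeholder.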
msg379928 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-10-30 07:39
Are annotations now always known at compile time?

As for representation, it can also be a sequence of pairs (('x', 'int'), ('z', 'float'), ('return', 'Hoge')) or a pair of sequences (('x', 'z', 'return'), ('int', 'float', 'Hoge')). It would be better to save a dict directly in pyc files, but it needs changing the marshal protocol.

Also, it makes sense to make annotations an attribute of the code object, to avoid the overhead at function creation time.

I have a dream to split the pyc file into several files or sections and save docstrings and annotations (and maybe line numbers) separately from the main code. They should be loaded on demand, when you read __doc__ or __annotations__. Most code does not use them at run time, so we can save memory and loading time. It can also help with internationalization.
msg379929 - (view) Author: Dong-hee Na (corona10) * (Python committer) Date: 2020-10-30 07:40
I like the 1st option which uses a tuple
msg379930 - (view) Author: Dong-hee Na (corona10) * (Python committer) Date: 2020-10-30 07:43
@serhiy race condition sorry ;)
msg379936 - (view) Author: Inada Naoki (methane) * (Python committer) Date: 2020-10-30 08:32
> Are annotations now always known at compile time?

Yes, because `from __future__ import annotations` is enabled by default from Python 3.10.

> As for representation, it can also be a sequence of pairs (('x', 'int'), ('z', 'float'), ('return', 'Hoge')) or a pair of sequences (('x', 'z', 'return'), ('int', 'float', 'Hoge')). It would be better to save a dict directly in pyc files, but it needs changing the marshal protocol.

Yes, but it is a bit larger than my single tuple idea in most cases.
Since most annotations are not used at runtime, we don't need to create a dict until `func.__annotations__` is read.

> Also, it makes sense to make annotations an attribute of the code object, to avoid the overhead at function creation time.

I am not sure this is the best option, because there are many code objects without annotations.

> I have a dream to split the pyc file into several files or sections and save docstrings and annotations (and maybe line numbers) separately from the main code. They should be loaded on demand, when you read __doc__ or __annotations__. Most code does not use them at run time, so we can save memory and loading time. It can also help with internationalization.

I have the same dream.
msg379947 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-10-30 11:25
> Yes, but it is a bit larger than my single tuple idea in most cases.

Yes, but the code for creating a dict can be simpler. In any case, we will see which format is better once we try to write the code.

> I am not sure this is the best option, because there are many code objects without annotations.

In this case it can be None or NULL.

I like your idea. It is easy to implement it now. Later we can make annotations an attribute of the code object.
msg381200 - (view) Author: Inada Naoki (methane) * (Python committer) Date: 2020-11-17 01:17
> Yes, but the code for creating a dict can be simpler. In any case, we will see which format is better once we try to write the code.

Note that many annotations are never accessed. RAM usage of annotation information is more important than how easy it is to create the dict.

I don't like `(('x', 'int'), ('z', 'float'), ('return', 'Hoge'))` because it creates 4 tuples. That means more memory use and slower pyc loading.

Please use ('x', 'int', 'z', 'float', 'return', 'Hoge') instead.
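The difference between the two layouts can be checked with a small sketch. The helper names are hypothetical, and `sys.getsizeof` reports CPython-specific sizes, so only the comparison is meaningful, not the absolute numbers.

```python
import sys

nested = (("x", "int"), ("z", "float"), ("return", "Hoge"))
flat = ("x", "int", "z", "float", "return", "Hoge")


def count_tuples(obj) -> int:
    """Count obj itself plus every tuple nested inside it."""
    if not isinstance(obj, tuple):
        return 0
    return 1 + sum(count_tuples(item) for item in obj)


def tuple_overhead(obj) -> int:
    """Sum sys.getsizeof over obj and the tuples nested inside it."""
    if not isinstance(obj, tuple):
        return 0  # the strings are the same in both layouts, so ignore them
    return sys.getsizeof(obj) + sum(tuple_overhead(item) for item in obj)


# Even/odd slicing pairs each name with its annotation.
annotations = dict(zip(flat[::2], flat[1::2]))

if __name__ == "__main__":
    print("nested:", count_tuples(nested), "tuples,", tuple_overhead(nested), "bytes")
    print("flat:  ", count_tuples(flat), "tuple,", tuple_overhead(flat), "bytes")
```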
msg381251 - (view) Author: Mark Shannon (Mark.Shannon) * (Python committer) Date: 2020-11-17 15:43
For top level functions (functions created once) this isn't going to make any real difference. There might be a small speedup for function creation, but it isn't going to be measurable.

For nested functions with annotations, where many functions are created from a single code object, this could be worthwhile.

However, before we add yet another attribute to code objects, I'd like to see some evidence of a speedup.
msg381253 - (view) Author: Yurii Karabas (uriyyo) * Date: 2020-11-17 16:04
I have just implemented a `co_annotations` field for `CodeObject`.
I wrote a simple benchmark to measure the time required to import the black module (I chose black because it contains a lot of annotations).

The benchmark simply runs `python -m timeit -n 5000000 "import black"`.

Results:
```
Python 3.6.8
5000000 loops, best of 3: 0.0983 usec per loop
Python 3.7.6
5000000 loops, best of 5: 102 nsec per loop
Python 3.8.3
5000000 loops, best of 5: 97.4 nsec per loop
Python 3.9.0
5000000 loops, best of 5: 99.5 nsec per loop
Python 3.10.0a2+ with co_annotations
5000000 loops, best of 5: 92.4 nsec per loop
Python 3.10.0a2+
5000000 loops, best of 5: 98.9 nsec per loop
```
msg381255 - (view) Author: Jakub Stasiak (jstasiak) * Date: 2020-11-17 16:47
Yurii, I don't believe that benchmark measures what you need to measure (once imported, a module stays imported until it is unloaded, so successive imports are no-ops).

See how the side effects of importing bbb only happen once: 

% cat bbb.py 
import time
time.sleep(1)
with open('bbb.log', 'a') as f:
    written = f.write('hello\n')
    assert written == 6

% time python -m timeit "import bbb"
1 loop, best of 5: 515 nsec per loop
python -m timeit "import bbb"  0.03s user 0.01s system 4% cpu 1.050 total

% cat bbb.log 
hello
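
The same effect can be shown from inside Python. The sketch below writes a throwaway module named `bbb_demo` (a made-up name) into a temporary directory, imports it three times, and counts how often its body actually ran.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    log_path = Path(tmp) / "bbb_demo.log"
    # Hypothetical module whose only job is to record each execution.
    (Path(tmp) / "bbb_demo.py").write_text(
        f"with open({str(log_path)!r}, 'a') as f:\n"
        "    f.write('hello\\n')\n"
    )
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()  # make sure the new file is visible
    try:
        for _ in range(3):
            # Only the first call executes the file; later calls hit sys.modules.
            importlib.import_module("bbb_demo")
        executions = log_path.read_text().count("hello")
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("bbb_demo", None)

if __name__ == "__main__":
    print("module body executed", executions, "time(s)")
```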
msg381264 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-11-17 17:56
If you want to measure import time, use

  python -m timeit -s "from sys import modules; modules_copy = modules.copy()" "import black; modules.clear(); modules.update(modules_copy)"

But I would be surprised to see significant difference in this case.

What Mark means is to measure the time it takes to create a nested function.

  python -m timeit "def f(a: int, b: str) -> None: pass"

And maybe test with different numbers of arguments to see if there is a difference.
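
The same measurement can be scripted with the `timeit` module, which runs the statement inside a function body, so each loop iteration creates a nested function. This is a rough sketch: `per_loop_ns` is a hypothetical helper, and the loop counts are arbitrary choices that keep the run short.

```python
import timeit

ANNOTATED = "def f(a: int, b: str) -> None: pass"
PLAIN = "def f(a, b): pass"


def per_loop_ns(stmt: str, number: int = 200_000, repeat: int = 5) -> float:
    """Best-of-`repeat` time per execution of `stmt`, in nanoseconds."""
    best = min(timeit.repeat(stmt, number=number, repeat=repeat))
    return best / number * 1e9


annotated_ns = per_loop_ns(ANNOTATED)
plain_ns = per_loop_ns(PLAIN)

if __name__ == "__main__":
    # Absolute numbers depend on the machine and interpreter build.
    print(f"annotated: {annotated_ns:.1f} nsec per loop")
    print(f"plain:     {plain_ns:.1f} nsec per loop")
```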
msg381272 - (view) Author: Yurii Karabas (uriyyo) * Date: 2020-11-17 18:51
I have run tests with different types of function declaration.

A function declaration with annotations is more than 2 times faster with the co_annotations feature.

If the function doesn't have annotations, the time is almost the same as without the co_annotations feature.

Results:
```
def foo(x: int, /, y, *, z: float) -> int: pass

Python 3.8.3
5000000 loops, best of 5: 178 nsec per loop
Python 3.9.0
5000000 loops, best of 5: 210 nsec per loop
Python 3.10.0a2+
5000000 loops, best of 5: 122 nsec per loop
Python 3.10.0a2+ with co_annotations
5000000 loops, best of 5: 53.3 nsec per loop

def f(a: int, /, b: int, *, c: int) -> None: pass

Python 3.8.3
5000000 loops, best of 5: 208 nsec per loop
Python 3.9.0
5000000 loops, best of 5: 235 nsec per loop
Python 3.10.0a2+
5000000 loops, best of 5: 139 nsec per loop
Python 3.10.0a2+ with co_annotations
5000000 loops, best of 5: 53.2 nsec per loop

def f(a: int, /, b: int, *, c: int, **d: int) -> None: pass

Python 3.8.3
5000000 loops, best of 5: 224 nsec per loop
Python 3.9.0
5000000 loops, best of 5: 257 nsec per loop
Python 3.10.0a2+
5000000 loops, best of 5: 167 nsec per loop
Python 3.10.0a2+ with co_annotations
5000000 loops, best of 5: 55.9 nsec per loop

def f(a: int, b: str) -> None: pass

Python 3.6.8
5000000 loops, best of 3: 0.163 usec per loop
Python 3.7.6
5000000 loops, best of 5: 165 nsec per loop
Python 3.8.3
5000000 loops, best of 5: 165 nsec per loop
Python 3.9.0
5000000 loops, best of 5: 184 nsec per loop
Python 3.10.0a2+
5000000 loops, best of 5: 125 nsec per loop
Python 3.10.0a2+ with co_annotations
5000000 loops, best of 5: 54.5 nsec per loop

def f(a: int, *, b: int) -> None: pass

Python 3.6.8
5000000 loops, best of 3: 0.166 usec per loop
Python 3.7.6
5000000 loops, best of 5: 170 nsec per loop
Python 3.8.3
5000000 loops, best of 5: 155 nsec per loop
Python 3.9.0
5000000 loops, best of 5: 198 nsec per loop
Python 3.10.0a2+
5000000 loops, best of 5: 124 nsec per loop
Python 3.10.0a2+ with co_annotations
5000000 loops, best of 5: 54.3 nsec per loop

def f(a, /, b, *, c) -> None: pass

Python 3.8.3
5000000 loops, best of 5: 90.1 nsec per loop
Python 3.9.0
5000000 loops, best of 5: 96.3 nsec per loop
Python 3.10.0a2+
5000000 loops, best of 5: 93.8 nsec per loop
Python 3.10.0a2+ with co_annotations
5000000 loops, best of 5: 55.5 nsec per loop

def f(a, /, b, *, c, **d) -> None: pass

Python 3.8.3
5000000 loops, best of 5: 92.3 nsec per loop
Python 3.9.0
5000000 loops, best of 5: 98 nsec per loop
Python 3.10.0a2+
5000000 loops, best of 5: 92.6 nsec per loop
Python 3.10.0a2+ with co_annotations
5000000 loops, best of 5: 54.4 nsec per loop

def f(a, b) -> None: pass

Python 3.6.8
5000000 loops, best of 3: 0.0966 usec per loop
Python 3.7.6
5000000 loops, best of 5: 92.5 nsec per loop
Python 3.8.3
5000000 loops, best of 5: 87.5 nsec per loop
Python 3.9.0
5000000 loops, best of 5: 93.7 nsec per loop
Python 3.10.0a2+
5000000 loops, best of 5: 88.3 nsec per loop
Python 3.10.0a2+ with co_annotations
5000000 loops, best of 5: 53 nsec per loop

def f(a, *, b) -> None: pass

Python 3.6.8
5000000 loops, best of 3: 0.0951 usec per loop
Python 3.7.6
5000000 loops, best of 5: 92.4 nsec per loop
Python 3.8.3
5000000 loops, best of 5: 86.6 nsec per loop
Python 3.9.0
5000000 loops, best of 5: 93.6 nsec per loop
Python 3.10.0a2+
5000000 loops, best of 5: 89.8 nsec per loop
Python 3.10.0a2+ with co_annotations
5000000 loops, best of 5: 53.6 nsec per loop

def f(): pass

Python 3.6.8
5000000 loops, best of 3: 0.0502 usec per loop
Python 3.7.6
5000000 loops, best of 5: 47.7 nsec per loop
Python 3.8.3
5000000 loops, best of 5: 47.9 nsec per loop
Python 3.9.0
5000000 loops, best of 5: 46.7 nsec per loop
Python 3.10.0a2+
5000000 loops, best of 5: 50.8 nsec per loop
Python 3.10.0a2+ with co_annotations
5000000 loops, best of 5: 52 nsec per loop

def f(a, /, b, *, c): pass

Python 3.8.3
5000000 loops, best of 5: 47.9 nsec per loop
Python 3.9.0
5000000 loops, best of 5: 47.4 nsec per loop
Python 3.10.0a2+
5000000 loops, best of 5: 50.2 nsec per loop
Python 3.10.0a2+ with co_annotations
5000000 loops, best of 5: 52.8 nsec per loop

def f(a, /, b, *, c, **d): pass

Python 3.8.3
5000000 loops, best of 5: 48.7 nsec per loop
Python 3.9.0
5000000 loops, best of 5: 48.2 nsec per loop
Python 3.10.0a2+
5000000 loops, best of 5: 50.8 nsec per loop
Python 3.10.0a2+ with co_annotations
5000000 loops, best of 5: 52.4 nsec per loop

def f(a, b): pass

Python 3.6.8
5000000 loops, best of 3: 0.0498 usec per loop
Python 3.7.6
5000000 loops, best of 5: 48.5 nsec per loop
Python 3.8.3
5000000 loops, best of 5: 47.5 nsec per loop
Python 3.9.0
5000000 loops, best of 5: 47 nsec per loop
Python 3.10.0a2+
5000000 loops, best of 5: 51 nsec per loop
Python 3.10.0a2+ with co_annotations
5000000 loops, best of 5: 52.6 nsec per loop

def f(a, *, b): pass

Python 3.6.8
5000000 loops, best of 3: 0.0498 usec per loop
Python 3.7.6
5000000 loops, best of 5: 48.1 nsec per loop
Python 3.8.3
5000000 loops, best of 5: 48.4 nsec per loop
Python 3.9.0
5000000 loops, best of 5: 46.6 nsec per loop
Python 3.10.0a2+
5000000 loops, best of 5: 50.2 nsec per loop
Python 3.10.0a2+ with co_annotations
5000000 loops, best of 5: 52.6 nsec per loop
```
msg381309 - (view) Author: Inada Naoki (methane) * (Python committer) Date: 2020-11-18 00:50
I don't like co_annotations.

* It changes the PyCode_NewXXX() API.

* Many functions don't have annotations. Adding annotations to the code object makes it fatter even if the function has none.

* Code objects are immutable and hashable. Adding annotations to the code object makes == and hash() more complex.

* We may introduce lazy loading for docstrings and annotations in the future.


func.__annotations__ =  ('x', 'int', 'z', 'float', 'return', 'Hoge') is much better because:

* Zero overhead for functions without any annotations.
* After annotation dict is created, the tuple can be released.
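
A pure-Python sketch of that behaviour is shown below. `LazyAnnotations` is a hypothetical stand-in for a function object, not CPython's implementation: it keeps the flat tuple until the dict is first requested, then drops it.

```python
from typing import Dict, Optional, Tuple


class LazyAnnotations:
    """Hypothetical stand-in for a function object holding packed annotations."""

    def __init__(self, packed: Optional[Tuple[str, ...]] = None) -> None:
        self._packed = packed  # flat (name, annotation, ...) tuple, or None
        self._dict: Optional[Dict[str, str]] = None

    @property
    def annotations(self) -> Dict[str, str]:
        if self._dict is None:
            packed = self._packed or ()
            # Even slots are names, odd slots are their annotations.
            self._dict = dict(zip(packed[::2], packed[1::2]))
            self._packed = None  # the tuple can be released once the dict exists
        return self._dict


func = LazyAnnotations(("x", "int", "z", "float", "return", "Hoge"))
tuple_kept_before_access = func._packed is not None and func._dict is None
result = func.annotations
tuple_released_after_access = func._packed is None
```

A function without annotations stores nothing extra until the attribute is read, at which point it gets an empty dict.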
msg381320 - (view) Author: Yurii Karabas (uriyyo) * Date: 2020-11-18 09:36
> func.__annotations__ =  ('x', 'int', 'z', 'float', 'return', 'Hoge') is much better because:

Inada, I totally agree with you. Sorry, I didn't realize all the pitfalls of adding an extra field to the code object.

The new implementation, which represents annotations as a single tuple, doesn't require many changes to the existing codebase. And I have already done it.

I reran all the benchmarks: there is no performance degradation when the function doesn't have annotations, and it's more than 2 times faster when the function has annotations.

Benchmark results:
```
def f(x: int, /, y, *, z: float) -> int: pass

Python 3.8.3
5000000 loops, best of 5: 209 nsec per loop
Python 3.9.0
5000000 loops, best of 5: 232 nsec per loop
Python 3.10.0a2+
5000000 loops, best of 5: 138 nsec per loop
Python 3.10.0a2+ with compact representation
5000000 loops, best of 5: 56.1 nsec per loop

def f(a: int, /, b: int, *, c: int) -> None: pass

Python 3.8.3
5000000 loops, best of 5: 241 nsec per loop
Python 3.9.0
5000000 loops, best of 5: 274 nsec per loop
Python 3.10.0a2+
5000000 loops, best of 5: 158 nsec per loop
Python 3.10.0a2+ with compact representation
5000000 loops, best of 5: 58.8 nsec per loop

def f(a: int, /, b: int, *, c: int, **d: int) -> None: pass

Python 3.8.3
5000000 loops, best of 5: 256 nsec per loop
Python 3.9.0
5000000 loops, best of 5: 326 nsec per loop
Python 3.10.0a2+
5000000 loops, best of 5: 264 nsec per loop
Python 3.10.0a2+ with compact representation
5000000 loops, best of 5: 87.1 nsec per loop

def f(a: int, b: str) -> None: pass

Python 3.6.8
5000000 loops, best of 3: 0.215 usec per loop
Python 3.7.6
5000000 loops, best of 5: 201 nsec per loop
Python 3.8.3
5000000 loops, best of 5: 204 nsec per loop
Python 3.9.0
5000000 loops, best of 5: 204 nsec per loop
Python 3.10.0a2+
5000000 loops, best of 5: 137 nsec per loop
Python 3.10.0a2+ with compact representation
5000000 loops, best of 5: 55.8 nsec per loop

def f(a: int, *, b: int) -> None: pass

Python 3.6.8
5000000 loops, best of 3: 0.186 usec per loop
Python 3.7.6
5000000 loops, best of 5: 181 nsec per loop
Python 3.8.3
5000000 loops, best of 5: 166 nsec per loop
Python 3.9.0
5000000 loops, best of 5: 189 nsec per loop
Python 3.10.0a2+
5000000 loops, best of 5: 138 nsec per loop
Python 3.10.0a2+ with compact representation
5000000 loops, best of 5: 64.7 nsec per loop

def f(a, /, b, *, c) -> None: pass

Python 3.8.3
5000000 loops, best of 5: 96 nsec per loop
Python 3.9.0
5000000 loops, best of 5: 102 nsec per loop
Python 3.10.0a2+
5000000 loops, best of 5: 98.7 nsec per loop
Python 3.10.0a2+ with compact representation
5000000 loops, best of 5: 57.4 nsec per loop

def f(a, /, b, *, c, **d) -> None: pass

Python 3.8.3
5000000 loops, best of 5: 97.8 nsec per loop
Python 3.9.0
5000000 loops, best of 5: 105 nsec per loop
Python 3.10.0a2+
5000000 loops, best of 5: 96.8 nsec per loop
Python 3.10.0a2+ with compact representation
5000000 loops, best of 5: 58.3 nsec per loop

def f(a, b) -> None: pass

Python 3.6.8
5000000 loops, best of 3: 0.107 usec per loop
Python 3.7.6
5000000 loops, best of 5: 99.7 nsec per loop
Python 3.8.3
5000000 loops, best of 5: 97.5 nsec per loop
Python 3.9.0
5000000 loops, best of 5: 103 nsec per loop
Python 3.10.0a2+
5000000 loops, best of 5: 100 nsec per loop
Python 3.10.0a2+ with compact representation
5000000 loops, best of 5: 57.5 nsec per loop

def f(a, *, b) -> None: pass

Python 3.6.8
5000000 loops, best of 3: 0.105 usec per loop
Python 3.7.6
5000000 loops, best of 5: 99.4 nsec per loop
Python 3.8.3
5000000 loops, best of 5: 95.5 nsec per loop
Python 3.9.0
5000000 loops, best of 5: 103 nsec per loop
Python 3.10.0a2+
5000000 loops, best of 5: 94.9 nsec per loop
Python 3.10.0a2+ with compact representation
5000000 loops, best of 5: 59.2 nsec per loop

def f(): pass

Python 3.6.8
5000000 loops, best of 3: 0.0542 usec per loop
Python 3.7.6
5000000 loops, best of 5: 51.2 nsec per loop
Python 3.8.3
5000000 loops, best of 5: 52.3 nsec per loop
Python 3.9.0
5000000 loops, best of 5: 52.1 nsec per loop
Python 3.10.0a2+
5000000 loops, best of 5: 60.8 nsec per loop
Python 3.10.0a2+ with compact representation
5000000 loops, best of 5: 59.8 nsec per loop

def f(a, /, b, *, c): pass

Python 3.8.3
5000000 loops, best of 5: 56.1 nsec per loop
Python 3.9.0
5000000 loops, best of 5: 59.8 nsec per loop
Python 3.10.0a2+
5000000 loops, best of 5: 64 nsec per loop
Python 3.10.0a2+ with compact representation
5000000 loops, best of 5: 60.6 nsec per loop

def f(a, /, b, *, c, **d): pass

Python 3.8.3
5000000 loops, best of 5: 53.6 nsec per loop
Python 3.9.0
5000000 loops, best of 5: 50.7 nsec per loop
Python 3.10.0a2+
5000000 loops, best of 5: 54.1 nsec per loop
Python 3.10.0a2+ with compact representation
5000000 loops, best of 5: 53.9 nsec per loop

def f(a, b): pass

Python 3.6.8
5000000 loops, best of 3: 0.054 usec per loop
Python 3.7.6
5000000 loops, best of 5: 53.9 nsec per loop
Python 3.8.3
5000000 loops, best of 5: 54.1 nsec per loop
Python 3.9.0
5000000 loops, best of 5: 52.5 nsec per loop
Python 3.10.0a2+
5000000 loops, best of 5: 53.7 nsec per loop
Python 3.10.0a2+ with compact representation
5000000 loops, best of 5: 53.8 nsec per loop

def f(a, *, b): pass

Python 3.6.8
5000000 loops, best of 3: 0.0528 usec per loop
Python 3.7.6
5000000 loops, best of 5: 51.2 nsec per loop
Python 3.8.3
5000000 loops, best of 5: 51.4 nsec per loop
Python 3.9.0
5000000 loops, best of 5: 52.4 nsec per loop
Python 3.10.0a2+
5000000 loops, best of 5: 55.7 nsec per loop
Python 3.10.0a2+ with compact representation
5000000 loops, best of 5: 53.7 nsec per loop
```
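
The "more than 2 times faster" figure can be checked against the results above for the five signatures with annotated parameters. The numbers below are copied from the Python 3.10.0a2+ rows, and the ratio is simply baseline time divided by compact-representation time.

```python
# (baseline nsec, compact-representation nsec) per loop, copied from the
# Python 3.10.0a2+ rows above for signatures that annotate parameters.
reported = {
    "def f(x: int, /, y, *, z: float) -> int": (138, 56.1),
    "def f(a: int, /, b: int, *, c: int) -> None": (158, 58.8),
    "def f(a: int, /, b: int, *, c: int, **d: int) -> None": (264, 87.1),
    "def f(a: int, b: str) -> None": (137, 55.8),
    "def f(a: int, *, b: int) -> None": (138, 64.7),
}

# Speedup = baseline time / time with the compact representation.
speedups = {sig: base / compact for sig, (base, compact) in reported.items()}

if __name__ == "__main__":
    for sig, ratio in speedups.items():
        print(f"{ratio:.2f}x  {sig}")
```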
msg381818 - (view) Author: Inada Naoki (methane) * (Python committer) Date: 2020-11-25 10:43
New changeset 7301979b23406220510dd2c7934a21b41b647119 by Yurii Karabas in branch 'master':
bpo-42202: Store func annotations as a tuple (GH-23316)
https://github.com/python/cpython/commit/7301979b23406220510dd2c7934a21b41b647119
History
Date User Action Args
2020-11-25 10:44:33 methane set status: open -> closed; resolution: fixed; stage: patch review -> resolved
2020-11-25 10:43:30 methane set messages: + msg381818
2020-11-19 02:49:39 methane set nosy: + lukasz.langa
2020-11-18 09:36:47 uriyyo set messages: + msg381320
2020-11-18 00:50:23 methane set messages: + msg381309
2020-11-17 18:51:10 uriyyo set messages: + msg381272
2020-11-17 17:56:49 serhiy.storchaka set messages: + msg381264
2020-11-17 16:47:32 jstasiak set messages: + msg381255
2020-11-17 16:04:56 uriyyo set messages: + msg381253
2020-11-17 15:43:31 Mark.Shannon set nosy: + Mark.Shannon; messages: + msg381251
2020-11-17 01:17:12 methane set messages: + msg381200
2020-11-16 19:57:40 jstasiak set nosy: + jstasiak
2020-11-16 16:26:10 uriyyo set keywords: + patch; nosy: + uriyyo; pull_requests: + pull_request22207; stage: patch review
2020-10-30 11:25:39 serhiy.storchaka set messages: + msg379947
2020-10-30 08:32:03 methane set messages: + msg379936
2020-10-30 07:43:45 corona10 set messages: + msg379930
2020-10-30 07:43:21 corona10 set nosy: + serhiy.storchaka
2020-10-30 07:40:50 corona10 set nosy: + corona10, - serhiy.storchaka; messages: + msg379929
2020-10-30 07:39:24 serhiy.storchaka set nosy: + serhiy.storchaka; messages: + msg379928
2020-10-30 06:47:15 methane create