classification
Title: Generic type subscription is a huge toll on Python performance
Type: performance Stage: resolved
Components: Versions: Python 3.8, Python 3.7, Python 3.6
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: Ruslan Dautkhanov, Wouter De Borger, ZackerySpytz, calebj, davidism, gvanrossum, levkivskyi, miss-islington, navdevl
Priority: normal Keywords: patch

Created on 2019-12-30 17:17 by Ruslan Dautkhanov, last changed 2020-11-16 00:12 by gvanrossum. This issue is now closed.

Pull Requests
URL Status Linked
PR 21327 merged ZackerySpytz, 2020-07-05 03:59
PR 21335 merged miss-islington, 2020-07-05 11:00
Messages (20)
msg359049 - (view) Author: Ruslan Dautkhanov (Ruslan Dautkhanov) Date: 2019-12-30 17:17
Originally reported here:
https://twitter.com/__zero323__/status/1210911632953692162

See details here:
https://asciinema.org/a/290643

In [4]: class Foo: pass
In [5]: %timeit -n1_000_000 Foo()
88.5 ns ± 3.44 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

In [6]: T = TypeVar("T")
In [7]: class Bar(Generic[T]): pass
In [8]: %timeit -n1_000_000 Bar()
883 ns ± 3.46 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

Same effect in Python 3.6 and 3.8
msg359050 - (view) Author: Ruslan Dautkhanov (Ruslan Dautkhanov) Date: 2019-12-30 17:17
Python typing causes an order-of-magnitude slowdown in this case.
msg359061 - (view) Author: Ruslan Dautkhanov (Ruslan Dautkhanov) Date: 2019-12-30 21:53
In [12]: cProfile.run("for _ in range(100_000): Bar()")
         200003 function calls in 0.136 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.047    0.047    0.136    0.136 <string>:1(<module>)
   100000    0.079    0.000    0.089    0.000 typing.py:865(__new__)
   100000    0.010    0.000    0.010    0.000 {built-in method __new__ of type object at 0x55ab65861ac0}
        1    0.000    0.000    0.136    0.136 {built-in method builtins.exec}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

In [13]: # And typing.py:865 points to

In [14]: inspect.getsourcelines(Generic.__new__)
Out[14]:
(['    def __new__(cls, *args, **kwds):\n',
  '        if cls in (Generic, Protocol):\n',
  '            raise TypeError(f"Type {cls.__name__} cannot be instantiated; "\n',
  '                            "it can be used only as a base class")\n',
  '        if super().__new__ is object.__new__ and cls.__init__ is not object.__init__:\n',
  '            obj = super().__new__(cls)\n',
  '        else:\n',
  '            obj = super().__new__(cls, *args, **kwds)\n',
  '        return obj\n'],
 865)
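The same kind of profile can be collected from a plain script instead of an IPython session. This is a minimal sketch using only the standard library; `instantiate` and `profile_instantiation` are illustrative names, not part of the report. On interpreters that include the fix for this issue, the `typing.py ... (__new__)` row should not be expected to appear.

```python
import cProfile
import io
import pstats
from typing import Generic, TypeVar

T = TypeVar("T")


class Bar(Generic[T]):
    pass


def instantiate(n):
    # The workload being profiled: n instantiations of a Generic subclass.
    for _ in range(n):
        Bar()


def profile_instantiation(n=100_000):
    """Profile instantiate(n) and return the report as a string."""
    profiler = cProfile.Profile()
    profiler.enable()
    instantiate(n)
    profiler.disable()

    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    # Show the five most expensive entries by cumulative time.
    stats.sort_stats("cumulative").print_stats(5)
    return out.getvalue()


if __name__ == "__main__":
    print(profile_instantiation())
```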
msg359072 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2019-12-31 00:10
What Python version was used for the timings? If not 3.8, please do over in 3.8.
msg359074 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2019-12-31 00:39
Sorry, you already said 3.6 and 3.8 give the same effect. But what if you add a minimal __new__() to Foo?
msg359081 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2019-12-31 01:31
Hm, here's what I measure in Python 3.8.1. (I don't use IPython or notebooks so this looks a little different.)


>>> import timeit
>>> timeit.timeit('Foo()', 'class Foo: pass')
0.37630256199999934


>>> timeit.timeit('Foo()', 'class Foo:\n  def __new__(cls): return super().__new__(cls)')
1.5753196039999864


>>> timeit.timeit('Foo()', 'from typing import Generic, TypeVar\nT = TypeVar("T")\nclass Foo(Generic[T]): pass')
3.8748737150000068


From this I conclude that adding a minimal __new__() method is responsible for about a 4x slowdown, and the functionality in typing.py for another factor of about 2.5.


While this isn't great I don't see an easy way to improve upon this without rewriting the entire typing module in C.  (Some of this may or may not happen for PEP 604.)

PS. I just realized my Python binary was built with debug options, so absolute numbers will look different (better) for you -- but relative numbers will look the same, and I get essentially the same factors with 3.9.0a1+.
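The three timings above can be gathered into one script so that the ratios are computed rather than read off by eye. This is a rough sketch using only the standard library; `SETUPS` and `measure` are illustrative names, and the absolute numbers will vary with the interpreter build and version.

```python
import timeit

# The three class definitions measured in the message above.
SETUPS = {
    "plain": "class Foo: pass",
    "minimal __new__": (
        "class Foo:\n"
        "  def __new__(cls): return super().__new__(cls)"
    ),
    "Generic[T]": (
        "from typing import Generic, TypeVar\n"
        "T = TypeVar('T')\n"
        "class Foo(Generic[T]): pass"
    ),
}


def measure(number=200_000, repeat=5):
    """Return the best-of-`repeat` time in seconds for Foo() under each setup."""
    return {
        name: min(timeit.repeat("Foo()", setup, number=number, repeat=repeat))
        for name, setup in SETUPS.items()
    }


if __name__ == "__main__":
    results = measure()
    baseline = results["plain"]
    for name, seconds in results.items():
        print(f"{name:16s} {seconds:.4f}s  ({seconds / baseline:.1f}x plain)")
```

Taking the minimum over several repeats is a common way to reduce noise from background activity; it does not remove it.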
msg359124 - (view) Author: Ivan Levkivskyi (levkivskyi) * (Python committer) Date: 2019-12-31 19:56
This issue came up a few times before (although I can't find an issue here on b.p.o., maybe it was on the typing-sig list). Although in micro-benchmarks the impact may seem big, in the vast majority of applications it is rarely more than a percent or so.

On the other hand, IIRC the only reason `Generic.__new__()` exists is so that one can't write `Generic()` (i.e. instantiate a plain `Generic`). I would be totally fine if we just removed it in 3.9. Hopefully, people have already learned what typing is for and don't need so much "protection" against not very meaningful things. Also, the error can be given by static type checkers, so there is probably no need for a runtime error.
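The trade-off described here can be illustrated with stand-in classes rather than `typing.Generic` itself. In this sketch, `GuardedBase`, `PlainBase`, and `direct_instantiation_blocked` are hypothetical names; the guard is modelled loosely on the `__new__` source quoted earlier in this thread and is not the actual typing implementation.

```python
class GuardedBase:
    """Stand-in for a base class that refuses direct instantiation."""

    def __new__(cls, *args, **kwds):
        if cls is GuardedBase:
            raise TypeError(
                f"Type {cls.__name__} cannot be instantiated; "
                "it can be used only as a base class"
            )
        # Every subclass instantiation also pays for this Python-level call.
        return super().__new__(cls)


class PlainBase:
    """Same role, but with no runtime guard."""


class Guarded(GuardedBase):
    pass


class Plain(PlainBase):
    pass


def direct_instantiation_blocked(base):
    """Return True if calling `base()` raises TypeError."""
    try:
        base()
    except TypeError:
        return True
    return False


if __name__ == "__main__":
    print(direct_instantiation_blocked(GuardedBase))
    print(direct_instantiation_blocked(PlainBase))
    # Without the guard, subclasses inherit object.__new__ directly.
    print(Plain.__new__ is object.__new__)
    print(Guarded.__new__ is object.__new__)
```

Dropping the guard gives up the runtime error for the base class in exchange for subclasses that no longer route every instantiation through a Python-level `__new__`.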
msg359126 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2019-12-31 19:59
If that solves the perf issue I am fine with it.
msg359127 - (view) Author: Ivan Levkivskyi (levkivskyi) * (Python committer) Date: 2019-12-31 19:59
OK, here is the original issue https://github.com/python/typing/issues/681. I asked the author to open an issue here instead, but likely they didn't open one.
msg359129 - (view) Author: Ruslan Dautkhanov (Ruslan Dautkhanov) Date: 2019-12-31 21:43
Thank you Guido and Ivan
msg359130 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2019-12-31 21:44
OK let’s do it. Clearly for *some* applications the overhead is significant.
msg359131 - (view) Author: Ruslan Dautkhanov (Ruslan Dautkhanov) Date: 2019-12-31 21:48
Perhaps the check should only be done in some sort of Python development mode and off by default?
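For illustration only, the suggestion of gating the check on development mode could look roughly like the sketch below. It is not what was adopted for this issue. `make_base` and `check_instantiation` are hypothetical names; `sys.flags.dev_mode` is the standard flag set by `python -X dev` (Python 3.7+).

```python
import sys


def make_base(check_instantiation=None):
    """Build a base class, adding the instantiation guard only when enabled.

    By default the guard follows sys.flags.dev_mode, so normal runs get a
    base class with no Python-level __new__ at all.
    """
    if check_instantiation is None:
        check_instantiation = sys.flags.dev_mode

    if check_instantiation:
        class Base:
            def __new__(cls, *args, **kwds):
                if cls is Base:
                    raise TypeError("Base can be used only as a base class")
                return super().__new__(cls)
    else:
        class Base:
            pass

    return Base


if __name__ == "__main__":
    Base = make_base()
    print("guard active:", Base.__new__ is not object.__new__)
```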
msg359132 - (view) Author: Ruslan Dautkhanov (Ruslan Dautkhanov) Date: 2019-12-31 21:49
Didn't see your last response before submitting an update.

It's great that you have a plan to resolve this!

Thanks again
msg373013 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2020-07-05 04:48
Should this be backported? How far back?
msg373014 - (view) Author: miss-islington (miss-islington) Date: 2020-07-05 05:07
New changeset 7fed75597fac11f9a6c769e2b6c6548fe0e4049d by Zackery Spytz in branch 'master':
bpo-39168: Remove the __new__ method of typing.Generic (GH-21327)
https://github.com/python/cpython/commit/7fed75597fac11f9a6c769e2b6c6548fe0e4049d
msg373036 - (view) Author: miss-islington (miss-islington) Date: 2020-07-05 16:02
New changeset 5a1384935ee8996a5bd240dd29f9b5e356cfc467 by Miss Islington (bot) in branch '3.9':
bpo-39168: Remove the __new__ method of typing.Generic (GH-21327)
https://github.com/python/cpython/commit/5a1384935ee8996a5bd240dd29f9b5e356cfc467
msg381036 - (view) Author: David Lord (davidism) Date: 2020-11-15 21:37
Is this performance issue supposed to be fixed in 3.9? I'm still observing a severe slowdown when inheriting from `Generic[T]`.

I'm currently adding typing to Werkzeug, where we define many custom data structures such as `MultiDict`. It would be ideal for these classes to be recognized as generic mappings. I remembered hearing about this performance issue somewhere, so I decided to test what happens.

Here's a minimal example without Werkzeug; the results in Werkzeug are similar or worse. I'd estimate each request creates about 10 of the various data structures, which are then accessed by user code, so I simulated that by creating and iterating over a list of objects.

```python
class Test:
    def __init__(self, value):
        self.value = value

def main():
    ts = [Test(x) for x in range(10)]
    sum(t.value for t in ts)
```

```
$ python3.9 -m timeit -n 100000 -s 'from example import main' 'main()'
100000 loops, best of 5: 7.67 usec per loop
```

```python
import typing

V = typing.TypeVar("V")

class Test(typing.Generic[V]):
    def __init__(self, value: V) -> None:
        self.value = value

def main():
    ts = [Test(x) for x in range(10)]
    sum(t.value for t in ts)
```

```
$ python3.9 -m timeit -n 100000 -s 'from example import main' 'main()'
100000 loops, best of 5: 18.2 usec per loop
```

There is more than a 2x slowdown when using `Generic`. The timings (7 vs 18 usec) are the same across Python 3.6, 3.7, 3.8, and 3.9. It seems that 3.9 does not fix the performance issue.

Since we currently support Python 3.6+, I probably won't be able to use generics anyway due to the performance in those versions, but I wanted to make sure I'm not missing something with 3.9.
msg381041 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2020-11-15 23:01
@davidism

I don't see such a dramatic difference -- the generic version is a tad slower, but the difference is less than the variation between runs.

What platform are you using?  (I'm doing this on Windows.)
msg381046 - (view) Author: David Lord (davidism) Date: 2020-11-15 23:59
I'm using Arch Linux. After your reply I tried again, and now I'm seeing the same result as you: a negligible difference from inheriting `Generic` on Python 3.9. I can't explain it. I ran the timings repeatedly before I posted here, but I guess it was a weird temporary issue with my machine.
msg381047 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2020-11-16 00:12
No worries. I tend to run each timeit command at least three times before I trust the numbers. Professional benchmarkers also configure a machine without background tasks (email etc.).
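The advice about repeating measurements can be applied with `timeit.repeat`, which reruns a timing and makes run-to-run variation visible. This is a minimal sketch built on the two `Test` classes from the example above; `PlainTest`, `GenericTest`, `workload`, and `summarize` are illustrative names, and the loop counts are arbitrary.

```python
import statistics
import timeit
import typing

V = typing.TypeVar("V")


class PlainTest:
    def __init__(self, value):
        self.value = value


class GenericTest(typing.Generic[V]):
    def __init__(self, value: V) -> None:
        self.value = value


def workload(cls):
    # Create ten objects and read an attribute from each, as in the example.
    ts = [cls(x) for x in range(10)]
    return sum(t.value for t in ts)


def summarize(cls, number=20_000, repeat=5):
    """Return (best, spread) in microseconds per call over `repeat` runs."""
    runs = timeit.repeat(lambda: workload(cls), number=number, repeat=repeat)
    per_call = [run / number * 1e6 for run in runs]
    return min(per_call), statistics.stdev(per_call)


if __name__ == "__main__":
    for cls in (PlainTest, GenericTest):
        best, spread = summarize(cls)
        print(f"{cls.__name__:12s} best {best:.2f} usec, stdev {spread:.2f} usec")
```

If the difference between the two best times is smaller than the spread, the measurement does not support a conclusion either way.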
History
Date  User  Action  Args
2020-11-16 00:12:17  gvanrossum  set  messages: + msg381047
2020-11-15 23:59:28  davidism  set  messages: + msg381046
2020-11-15 23:01:27  gvanrossum  set  messages: + msg381041
2020-11-15 21:37:52  davidism  set  nosy: + davidism
    messages: + msg381036
2020-07-05 16:03:53  gvanrossum  set  status: open -> closed
    resolution: fixed
    stage: patch review -> resolved
2020-07-05 16:02:45  miss-islington  set  messages: + msg373036
2020-07-05 11:00:17  miss-islington  set  pull_requests: + pull_request20483
2020-07-05 05:07:48  miss-islington  set  nosy: + miss-islington
    messages: + msg373014
2020-07-05 04:48:49  gvanrossum  set  messages: + msg373013
2020-07-05 03:59:10  ZackerySpytz  set  keywords: + patch
    nosy: + ZackerySpytz
    pull_requests: + pull_request20477
    stage: patch review
2020-01-13 08:36:56  Wouter De Borger  set  nosy: + Wouter De Borger
2020-01-12 00:38:14  calebj  set  nosy: + calebj
2020-01-02 08:42:25  navdevl  set  nosy: + navdevl
2019-12-31 21:49:16  Ruslan Dautkhanov  set  messages: + msg359132
2019-12-31 21:48:07  Ruslan Dautkhanov  set  messages: + msg359131
2019-12-31 21:44:56  gvanrossum  set  messages: + msg359130
2019-12-31 21:43:08  Ruslan Dautkhanov  set  messages: + msg359129
2019-12-31 19:59:12  levkivskyi  set  messages: + msg359127
2019-12-31 19:59:02  gvanrossum  set  messages: + msg359126
2019-12-31 19:56:41  levkivskyi  set  messages: + msg359124
2019-12-31 01:31:15  gvanrossum  set  messages: + msg359081
2019-12-31 00:39:23  gvanrossum  set  messages: + msg359074
2019-12-31 00:10:29  gvanrossum  set  messages: + msg359072
2019-12-30 21:53:54  Ruslan Dautkhanov  set  messages: + msg359061
2019-12-30 18:07:03  ned.deily  set  nosy: + gvanrossum, levkivskyi
2019-12-30 17:17:52  Ruslan Dautkhanov  set  messages: + msg359050
2019-12-30 17:17:10  Ruslan Dautkhanov  create