classification
Title: inspect.getsource performs unnecessary filesystem stat call
Type: performance Stage: patch review
Components: Versions: Python 3.8
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Pankaj Pandey, taleinat, yselivanov
Priority: normal Keywords:

Created on 2018-05-20 16:38 by Pankaj Pandey, last changed 2018-06-12 13:06 by Pankaj Pandey.

Files
File name Uploaded Description Edit
inspect_stack_perf.py Pankaj Pandey, 2018-05-20 16:38 Test file to compare performance
Pull Requests
URL Status Linked Edit
PR 6805 open Pankaj Pandey, 2018-05-20 16:38
Messages (4)
msg317186 - (view) Author: Pankaj Pandey (Pankaj Pandey) * Date: 2018-05-20 16:38
The specific os.path.exists() call is shown here: https://github.com/python/cpython/pull/6805

If the filename is already in linecache or the module has a PEP 302 loader, then the os.path.exists() call is ignored. Hence it is better to call it only if its value is needed.

Attached a very simple case which shows the impact of the change:
msg317189 - (view) Author: Pankaj Pandey (Pankaj Pandey) * Date: 2018-05-20 16:57
Here's the patch performance difference before and after the patch:

Before:

Sun May 20 21:42:32 2018    prof1.stat

         1188991 function calls (1188851 primitive calls) in 4.821 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     18/1    0.006    0.000    4.824    4.824 {built-in method builtins.exec}
        1    0.000    0.000    4.824    4.824 inspect_stack_perf.py:1(<module>)
        1    0.001    0.001    4.790    4.790 inspect_stack_perf.py:22(load_nodes)
      421    0.001    0.000    4.787    0.011 inspect_stack_perf.py:5(__init__)
      421    0.001    0.000    4.786    0.011 C:\Users\ppandey\AppData\Local\Continuum\Anaconda3\lib\inspect.py:1492(stack)
      421    0.011    0.000    4.785    0.011 C:\Users\ppandey\AppData\Local\Continuum\Anaconda3\lib\inspect.py:1464(getouterframes)
     4630    0.031    0.000    4.770    0.001 C:\Users\ppandey\AppData\Local\Continuum\Anaconda3\lib\inspect.py:1425(getframeinfo)
      420    0.001    0.000    4.739    0.011 inspect_stack_perf.py:9(add_child)
    13994    4.159    0.000    4.159    0.000 {built-in method nt.stat}
     4630    0.042    0.000    3.223    0.001 C:\Users\ppandey\AppData\Local\Continuum\Anaconda3\lib\inspect.py:760(findsource)
     9322    0.043    0.000    2.960    0.000 C:\Users\ppandey\AppData\Local\Continuum\Anaconda3\lib\inspect.py:680(getsourcefile)
     9322    0.011    0.000    2.832    0.000 C:\Users\ppandey\AppData\Local\Continuum\Anaconda3\lib\genericpath.py:16(exists)
     4630    0.016    0.000    1.339    0.000 C:\Users\ppandey\AppData\Local\Continuum\Anaconda3\lib\linecache.py:53(checkcache)
     4630    0.097    0.000    0.364    0.000 C:\Users\ppandey\AppData\Local\Continuum\Anaconda3\lib\inspect.py:714(getmodule)
     4692    0.008    0.000    0.135    0.000 C:\Users\ppandey\AppData\Local\Continuum\Anaconda3\lib\inspect.py:702(getabsfile)
     4754    0.008    0.000    0.091    0.000 C:\Users\ppandey\AppData\Local\Continuum\Anaconda3\lib\ntpath.py:538(abspath)
     4754    0.036    0.000    0.074    0.000 C:\Users\ppandey\AppData\Local\Continuum\Anaconda3\lib\ntpath.py:471(normpath)
   162600    0.071    0.000    0.071    0.000 {built-in method builtins.hasattr}



After the patch:

Sun May 20 21:39:44 2018    prof2.stat

         2639991 function calls (2639727 primitive calls) in 2.841 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     18/1    0.005    0.000    2.844    2.844 {built-in method builtins.exec}
        1    0.000    0.000    2.844    2.844 inspect_stack_perf.py:1(<module>)
        1    0.001    0.001    2.814    2.814 inspect_stack_perf.py:22(load_nodes)
      421    0.001    0.000    2.812    0.007 inspect_stack_perf.py:5(__init__)
      421    0.001    0.000    2.811    0.007 C:\Users\ppandey\AppData\Local\Continuum\Anaconda3\lib\inspect.py:1492(stack)
      421    0.010    0.000    2.810    0.007 C:\Users\ppandey\AppData\Local\Continuum\Anaconda3\lib\inspect.py:1464(getouterframes)
      420    0.001    0.000    2.802    0.007 inspect_stack_perf.py:9(add_child)
     4630    0.029    0.000    2.795    0.001 C:\Users\ppandey\AppData\Local\Continuum\Anaconda3\lib\inspect.py:1425(getframeinfo)
     4630    0.040    0.000    2.380    0.001 C:\Users\ppandey\AppData\Local\Continuum\Anaconda3\lib\inspect.py:760(findsource)
     4674    1.631    0.000    1.631    0.000 {built-in method nt.stat}
     4630    0.014    0.000    1.630    0.000 C:\Users\ppandey\AppData\Local\Continuum\Anaconda3\lib\linecache.py:53(checkcache)
13952/13890    0.281    0.000    0.907    0.000 C:\Users\ppandey\AppData\Local\Continuum\Anaconda3\lib\inspect.py:714(getmodule)
9322/9260    0.038    0.000    0.703    0.000 C:\Users\ppandey\AppData\Local\Continuum\Anaconda3\lib\inspect.py:680(getsourcefile)
    13952    0.018    0.000    0.259    0.000 C:\Users\ppandey\AppData\Local\Continuum\Anaconda3\lib\inspect.py:702(getabsfile)
    14014    0.017    0.000    0.215    0.000 C:\Users\ppandey\AppData\Local\Continuum\Anaconda3\lib\ntpath.py:538(abspath)
   478166    0.177    0.000    0.177    0.000 {built-in method builtins.hasattr}


Total runtime reduced from 4.8 s to 2.8 s, and the major difference can be seen in the `nt.stat` call. This is on a machine with SSD. In workplace where python is installed in cluster on nfs, I'm sure the performance is worse.
msg319213 - (view) Author: Tal Einat (taleinat) * (Python committer) Date: 2018-06-10 12:04
Moving the linecache check up before the two others seems like a simple and safe optimization. On the other hand, I'm not sure calling getmodule() before checking if the file exists would be faster in many cases.

Pankaj, could you also include the script you ran to profile this and the command line by which it was invoked?
msg319379 - (view) Author: Pankaj Pandey (Pankaj Pandey) * Date: 2018-06-12 13:06
I simply profiled the script with "python -m cProfile -s cumtime inspect_stack_perf.py"
And yes, I should move the linecache check first, that would be a better idea. Though getmodule() is also fast enough and reordering it does not significantly change the performance (for the common case of no PEP302 modules).
History
Date User Action Args
2018-06-12 13:06:48Pankaj Pandeysetmessages: + msg319379
2018-06-10 12:04:56taleinatsetnosy: + taleinat
messages: + msg319213
2018-05-25 18:08:34terry.reedysetstage: patch review
type: performance
versions: + Python 3.8
2018-05-25 18:07:59terry.reedysetnosy: + yselivanov

title: inspect.getsource performs unnecessary filesystem stat call which is ignored in some circumstances -> inspect.getsource performs unnecessary filesystem stat call
2018-05-20 16:57:17Pankaj Pandeysetmessages: + msg317189
2018-05-20 16:38:49Pankaj Pandeycreate