Message 336018 - Python tracker

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author	nascheme
Recipients	nascheme
Date	2019-02-19.20:39:50
SpamBayes Score	-1.0
Marked as misclassified	Yes
Message-id	<1550608790.89.0.306917505619.issue36044@roundup.psfhosted.org>
In-reply-to

Content
I was doing some 'perf' profiling runs of Python. I decided to try running PROFILE_TASK to see what the profile looks like. I was surprised that gc_collect dominated runtime: Children Self Symbol + 93.93% 6.00% [.] _PyEval_EvalFrameDefault + 76.19% 0.12% [.] function_code_fastcall + 70.65% 0.31% [.] _PyMethodDef_RawFastCallKeywords + 63.24% 0.13% [.] _PyCFunction_FastCallKeywords + 58.67% 0.36% [.] _PyEval_EvalCodeWithName + 57.45% 23.84% [.] collect + 52.89% 0.00% [.] gc_collect + 52.10% 0.08% [.] _PyFunction_FastCallDict + 41.99% 0.02% [.] _PyObject_Call_Prepend + 36.37% 0.18% [.] _PyFunction_FastCallKeywords + 20.94% 0.07% [.] _PyObject_FastCallDict + 19.64% 0.00% [.] PyObject_Call + 17.74% 0.11% [.] _PyObject_FastCallKeywords + 12.45% 0.00% [.] slot_tp_call + 12.27% 4.05% [.] dict_traverse + 11.45% 11.04% [.] visit_reachable + 11.18% 10.76% [.] visit_decref + 9.65% 0.11% [.] type_call + 8.80% 0.83% [.] func_traverse + 7.78% 0.08% [.] _PyMethodDescr_FastCallKeywords Part of the problem is that we run full cyclic GC for every test. I.e. cleanup_test_droppings() calls gc.collect(). Maybe we could make these calls conditional on the --pgo flag of regtest. Or, maybe we need to re-evaluate if running the unit test suite is the best way to generate PGO trace data. Based on a tiny bit of further investigation, it looks like gc_collect() is getting called quite a lot of times, in addition to cleanup_test_droppings(). Maybe there is some low-hanging fruit here for optimization. Full GC is pretty expensive.

I was doing some 'perf' profiling runs of Python.  I decided to try running PROFILE_TASK to see what the profile looks like.  I was surprised that gc_collect dominated runtime:

  Children      Self  Symbol
+   93.93%     6.00%  [.] _PyEval_EvalFrameDefault                
+   76.19%     0.12%  [.] function_code_fastcall                  
+   70.65%     0.31%  [.] _PyMethodDef_RawFastCallKeywords        
+   63.24%     0.13%  [.] _PyCFunction_FastCallKeywords           
+   58.67%     0.36%  [.] _PyEval_EvalCodeWithName                
+   57.45%    23.84%  [.] collect                                 
+   52.89%     0.00%  [.] gc_collect                              
+   52.10%     0.08%  [.] _PyFunction_FastCallDict                
+   41.99%     0.02%  [.] _PyObject_Call_Prepend                  
+   36.37%     0.18%  [.] _PyFunction_FastCallKeywords            
+   20.94%     0.07%  [.] _PyObject_FastCallDict                  
+   19.64%     0.00%  [.] PyObject_Call                           
+   17.74%     0.11%  [.] _PyObject_FastCallKeywords              
+   12.45%     0.00%  [.] slot_tp_call                            
+   12.27%     4.05%  [.] dict_traverse                           
+   11.45%    11.04%  [.] visit_reachable                         
+   11.18%    10.76%  [.] visit_decref                            
+    9.65%     0.11%  [.] type_call                               
+    8.80%     0.83%  [.] func_traverse                           
+    7.78%     0.08%  [.] _PyMethodDescr_FastCallKeywords     

Part of the problem is that we run full cyclic GC for every test.  I.e. cleanup_test_droppings() calls gc.collect().  Maybe we could make these calls conditional on the --pgo flag of regtest.  Or, maybe we need to re-evaluate if running the unit test suite is the best way to generate PGO trace data.

Based on a tiny bit of further investigation, it looks like gc_collect() is getting called quite a lot of times, in addition to cleanup_test_droppings().  Maybe there is some low-hanging fruit here for optimization.  Full GC is pretty expensive.

History
Date	User	Action	Args
2019-02-19 20:39:50	nascheme	set	recipients: + nascheme
2019-02-19 20:39:50	nascheme	set	messageid: <1550608790.89.0.306917505619.issue36044@roundup.psfhosted.org>
2019-02-19 20:39:50	nascheme	link	issue36044 messages
2019-02-19 20:39:50	nascheme	create