Issue6033
Created on 2009-05-16 01:43 by benjamin.peterson, last changed 2022-04-11 14:56 by admin. This issue is now closed.
Files | ||
---|---|---|
File name | Uploaded | Description |
call_method.patch | benjamin.peterson, 2009-05-16 01:43 | |
Messages (10) | |||
---|---|---|---|
msg87850 - (view) | Author: Benjamin Peterson (benjamin.peterson) * | Date: 2009-05-16 01:43 | |
This is an optimization ported from PyPy. [1] It tries to prevent bound methods from being created by using the stack as a cache. I couldn't apply this to builtin methods because those use a method-wrapper descriptor. The results were not very impressive. However, I'm attaching the patch to see if anyone else wants to look at it.

[1] http://codespeak.net/pypy/dist/pypy/doc/interpreter-optimizations.html#id1

Test | Min run-time (this) | Min run-time (other) | Min diff | Avg run-time (this) | Avg run-time (other) | Avg diff |
---|---|---|---|---|---|---|
BuiltinFunctionCalls | 342ms | 340ms | +0.6% | 378ms | 361ms | +4.7% |
BuiltinMethodLookup | 315ms | 308ms | +2.3% | 333ms | 319ms | +4.5% |
CompareFloats | 250ms | 251ms | -0.3% | 257ms | 258ms | -0.7% |
CompareFloatsIntegers | 266ms | 265ms | +0.4% | 273ms | 273ms | -0.2% |
CompareIntegers | 233ms | 232ms | +0.4% | 238ms | 238ms | -0.2% |
CompareInternedStrings | 279ms | 275ms | +1.3% | 285ms | 284ms | +0.3% |
CompareLongs | 225ms | 223ms | +1.1% | 231ms | 229ms | +0.8% |
CompareStrings | 238ms | 235ms | +1.3% | 244ms | 243ms | +0.6% |
CompareUnicode | 243ms | 246ms | -0.9% | 252ms | 252ms | -0.1% |
ComplexPythonFunctionCalls | 307ms | 301ms | +2.0% | 315ms | 309ms | +1.8% |
ConcatStrings | 372ms | 366ms | +1.6% | 376ms | 385ms | -2.1% |
ConcatUnicode | 260ms | 259ms | +0.4% | 266ms | 269ms | -0.9% |
CreateInstances | 351ms | 336ms | +4.7% | 365ms | 346ms | +5.5% |
CreateNewInstances | 265ms | 256ms | +3.6% | 281ms | 264ms | +6.5% |
CreateStringsWithConcat | 290ms | 289ms | +0.1% | 304ms | 301ms | +1.2% |
CreateUnicodeWithConcat | 220ms | 219ms | +0.8% | 227ms | 223ms | +1.8% |
DictCreation | 201ms | 200ms | +0.4% | 204ms | 206ms | -1.0% |
DictWithFloatKeys | 400ms | 418ms | -4.4% | 410ms | 424ms | -3.4% |
DictWithIntegerKeys | 298ms | 294ms | +1.2% | 306ms | 304ms | +0.6% |
DictWithStringKeys | 260ms | 264ms | -1.5% | 270ms | 275ms | -2.1% |
ForLoops | 224ms | 223ms | +0.2% | 232ms | 232ms | +0.2% |
IfThenElse | 160ms | 160ms | +0.0% | 168ms | 182ms | -8.0% |
ListSlicing | 293ms | 292ms | +0.5% | 302ms | 306ms | -1.2% |
NestedForLoops | 301ms | 300ms | +0.2% | 305ms | 308ms | -0.8% |
NestedListComprehensions | 323ms | 328ms | -1.7% | 331ms | 335ms | -1.3% |
NormalClassAttribute | 313ms | 314ms | -0.1% | 323ms | 330ms | -2.1% |
NormalInstanceAttribute | 284ms | 283ms | +0.4% | 289ms | 288ms | +0.4% |
PythonFunctionCalls | 259ms | 278ms | -6.7% | 274ms | 289ms | -5.3% |
PythonMethodCalls | 358ms | 357ms | +0.3% | 371ms | 365ms | +1.6% |
Recursion | 389ms | 398ms | -2.1% | 395ms | 407ms | -2.9% |
SecondImport | 335ms | 319ms | +5.2% | 346ms | 380ms | -8.9% |
SecondPackageImport | 338ms | 326ms | +3.6% | 350ms | 337ms | +4.0% |
SecondSubmoduleImport | 413ms | 403ms | +2.5% | 426ms | 411ms | +3.6% |
SimpleComplexArithmetic | 341ms | 345ms | -1.2% | 351ms | 355ms | -1.1% |
SimpleDictManipulation | 288ms | 298ms | -3.7% | 293ms | 303ms | -3.1% |
SimpleFloatArithmetic | 272ms | 275ms | -1.1% | 279ms | 286ms | -2.7% |
SimpleIntFloatArithmetic | 211ms | 204ms | +3.3% | 216ms | 215ms | +0.5% |
SimpleIntegerArithmetic | 207ms | 203ms | +1.7% | 214ms | 213ms | +0.5% |
SimpleListComprehensions | 275ms | 273ms | +0.6% | 281ms | 281ms | -0.1% |
SimpleListManipulation | 224ms | 229ms | -2.5% | 234ms | 241ms | -2.9% |
SimpleLongArithmetic | 252ms | 253ms | -0.6% | 263ms | 266ms | -1.0% |
SmallLists | 290ms | 301ms | -3.8% | 299ms | 311ms | -3.9% |
SmallTuples | 254ms | 253ms | +0.3% | 261ms | 266ms | -1.9% |
SpecialClassAttribute | 311ms | 309ms | +0.7% | 320ms | 321ms | -0.2% |
SpecialInstanceAttribute | 358ms | 358ms | +0.1% | 370ms | 371ms | -0.4% |
StringMappings | 817ms | 833ms | -1.9% | 823ms | 852ms | -3.4% |
StringPredicates | 488ms | 538ms | -9.2% | 495ms | 547ms | -9.4% |
StringSlicing | 295ms | 296ms | -0.2% | 306ms | 323ms | -5.5% |
TryExcept | 282ms | 280ms | +1.0% | 291ms | 288ms | +1.0% |
TryFinally | 290ms | 255ms | +14.1% | 300ms | 263ms | +14.1% |
TryRaiseExcept | 261ms | 256ms | +1.7% | 271ms | 263ms | +3.1% |
TupleSlicing | 281ms | 270ms | +4.3% | 289ms | 277ms | +4.3% |
UnicodeMappings | 329ms | 337ms | -2.4% | 337ms | 348ms | -3.0% |
UnicodePredicates | 295ms | 329ms | -10.4% | 303ms | 338ms | -10.5% |
UnicodeProperties | 272ms | 307ms | -11.4% | 284ms | 313ms | -9.3% |
UnicodeSlicing | 258ms | 261ms | -1.2% | 266ms | 275ms | -3.2% |
WithFinally | 384ms | 398ms | -3.6% | 394ms | 408ms | -3.5% |
WithRaiseExcept | 320ms | 294ms | +8.9% | 336ms | 304ms | +10.6% |
|||
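For readers unfamiliar with what "preventing bound methods from being created" refers to, here is a minimal sketch of the behaviour the patch tries to bypass. It is written for Python 3 rather than the Python 2 of the thread, and the class name `Greeter` is purely illustrative; it does not show the patch itself, only the allocation that an ordinary `obj.method()` call performs.

```python
class Greeter:
    def hello(self):
        return "hi"

g = Greeter()

# Each attribute access builds a fresh bound-method object.
m1 = g.hello
m2 = g.hello
fresh_each_time = m1 is not m2   # distinct objects, even though m1 == m2

# A bound method is just the plain function from the class paired with the instance.
func = Greeter.__dict__["hello"]
same_function = m1.__func__ is func
same_instance = m1.__self__ is g

# Calling the function directly with the instance as the first argument
# gives the same result without going through a bound-method object.
direct = func(g)
via_bound = g.hello()
```

The optimization described above amounts to taking the `func(g)` route inside the interpreter whenever that is safe, so the temporary bound-method object is never allocated.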
msg87870 - (view) | Author: Antoine Pitrou (pitrou) * | Date: 2009-05-16 09:13 | |
Can you give results of the "richards" benchmark? |
|||
msg87889 - (view) | Author: Benjamin Peterson (benjamin.peterson) * | Date: 2009-05-16 12:49 | |
With the patch:
Richards benchmark (Python) starting... [<function entry_point at 0x63b430>]
finished.
Total time for 10 iterations: 8.49 secs
Average time per iteration: 848.90 ms

Without:
Richards benchmark (Python) starting... [<function entry_point at 0x637530>]
finished.
Total time for 10 iterations: 10.46 secs
Average time per iteration: 1045.88 ms |
|||
msg87896 - (view) | Author: Antoine Pitrou (pitrou) * | Date: 2009-05-16 14:25 | |
Similar results here.

With the patch:
Richards benchmark (Python) starting... [<function entry_point at 0x7fc15b3f5848>]
finished.
Total time for 4 iterations: 1.78 secs
Average time per iteration: 443.90 ms

Without the patch:
Richards benchmark (Python) starting... [<function entry_point at 0x7f8b9fb20848>]
finished.
Total time for 4 iterations: 2.02 secs
Average time per iteration: 503.79 ms |
|||
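To put the two Richards runs on a common scale, the per-iteration averages quoted above can be turned into a relative improvement. The helper name `percent_faster` is hypothetical and the numbers are copied from the two messages; this is only arithmetic on the reported figures, not a new measurement.

```python
def percent_faster(without_ms, with_ms):
    """Reduction in per-iteration time, as a percentage of the unpatched time."""
    return (without_ms - with_ms) / without_ms * 100.0

# Average time per iteration, in milliseconds, as reported in the thread.
benjamin = percent_faster(1045.88, 848.90)
antoine = percent_faster(503.79, 443.90)

print("Benjamin's run: %.1f%% less time per iteration" % benjamin)
print("Antoine's run:  %.1f%% less time per iteration" % antoine)
```

That works out to roughly 19% and 12% respectively, noticeably larger than most of the pybench differences in the first message.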
msg95707 - (view) | Author: Reid Kleckner (rnk) | Date: 2009-11-25 02:21 | |
One thing I was wondering about the current patch: what about objects whose instance attributes shadow methods? For example:

    class C(object):
        def foo(self):
            return 1
    c = C()
    print c.foo()
    c.foo = lambda: 2
    print c.foo()

Shouldn't the above print 1 and 2? With the current patch, it seems that the second call might still print 1.

There's also the possible performance drawback where you're loading builtin C methods, so the optimization fails, but you end up calling _PyType_Lookup twice. :(

I'm doing the same optimization for unladen swallow, and these were some of the things I ran into. I think I'm going to write a PyObject_GetMethod that tries to get a method without binding it, but if it can't for any reason, it behaves just like PyObject_GetAttr and sets a status code. |
|||
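The "try to get the method unbound, otherwise fall back to a normal attribute lookup" idea can be sketched in pure Python. This is an assumption-laden illustration in Python 3, not the C function proposed in the message: the names `get_method` and `call_method` are hypothetical, the boolean stands in for the status code, and the sketch ignores custom `__getattribute__` and other descriptor subtleties.

```python
import types

class C(object):
    def foo(self):
        return 1

def get_method(obj, name):
    """Return (target, needs_self).

    needs_self is True only when a plain function was found on the type and
    nothing in the instance dict shadows it; otherwise this behaves like getattr.
    """
    instance_dict = getattr(obj, "__dict__", {})
    if name not in instance_dict:
        for klass in type(obj).__mro__:
            if name in klass.__dict__:
                attr = klass.__dict__[name]
                if isinstance(attr, types.FunctionType):
                    return attr, True    # fast path: no bound method created
                break                    # builtin method, property, etc.: fall back
    return getattr(obj, name), False     # slow path: ordinary lookup

def call_method(obj, name, *args):
    target, needs_self = get_method(obj, name)
    return target(obj, *args) if needs_self else target(*args)

c = C()
first = call_method(c, "foo")      # fast path, uses C.foo directly
c.foo = lambda: 2
second = call_method(c, "foo")     # instance attribute shadows the method
```

With the instance-dict check in place the two calls yield 1 and then 2, which is the behaviour the message says the patch must preserve. Builtin methods such as `str.upper` are not plain functions, so they take the fallback path here as well.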
msg95710 - (view) | Author: Benjamin Peterson (benjamin.peterson) * | Date: 2009-11-25 03:27 | |
Yes, my patch introduces the regression you mention. PyPy solves this by having the instance's dictionary keep track of shadowing of the type dictionary. Not easy for CPython... I wish you luck on your patch! |
|||
msg110253 - (view) | Author: Reid Kleckner (rnk) | Date: 2010-07-14 04:56 | |
I have a patch for unladen-swallow out for review here: http://codereview.appspot.com/160063/show It resolves the correctness issues I mentioned previously by emitting guards if necessary. If the type is predictable and uses slots, then we don't need to check the instance dict. It gives a 5% speedup on the unpickle benchmark. Presumably the other benchmarks do not do as many method calls. |
|||
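The remark about slots can be illustrated at the Python level: an instance of a class that defines `__slots__` (and inherits no `__dict__`) has no instance dictionary, so it cannot shadow a method the way the earlier example did. The class names below are illustrative, and the sketch uses Python 3; it shows only the language behaviour that makes the guard unnecessary, not the unladen-swallow code.

```python
class WithDict(object):
    def foo(self):
        return 1

class WithSlots(object):
    __slots__ = ("x",)

    def foo(self):
        return 1

d = WithDict()
d.foo = lambda: 2            # allowed: stored in d.__dict__, shadows the method
shadowed = d.foo()

s = WithSlots()
try:
    s.foo = lambda: 2        # rejected: there is no instance dict to hold it
    slot_assignment_failed = False
except AttributeError:
    slot_assignment_failed = True

has_instance_dict = hasattr(s, "__dict__")
still_method = s.foo()       # always resolves to the method on the type
```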
msg110272 - (view) | Author: Antoine Pitrou (pitrou) * | Date: 2010-07-14 12:08 | |
First, are these benchmark results jitted or non-jitted? Right now, non-jitted results are a stronger motivation for inclusion in main CPython, IMHO. Second, 2.7 won't receive any features / performance improvements anymore. It would be nice to have 3.2 (non-jitted) benchmark results. Third, if removing intermediate allocations is the kind of optimization a JIT will do anyway, does it make sense to make these optimizations explicit at the bytecode level? (This is really an open question, not a rhetorical one.) |
|||
msg110298 - (view) | Author: Reid Kleckner (rnk) | Date: 2010-07-14 16:45 | |
Sorry, I was just posting it so Benjamin could see what this bought us. I'm not pushing to get this in CPython. The results are for JITed code. I forget what the interpreted results are. I think they are good for the microbenchmarks, but not as good for the macro. |
|||
msg168170 - (view) | Author: R. David Murray (r.david.murray) * | Date: 2012-08-14 03:39 | |
Benjamin confirms a regression in his patch, and the other patch was not intended for CPython. So I'm closing this issue. |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:56:48 | admin | set | github: 50283 |
2016-01-14 17:55:24 | yselivanov | set | superseder: Speedup method calls 1.2x |
2012-08-14 03:39:29 | r.david.murray | set | status: open -> closed nosy: + r.david.murray messages: + msg168170 stage: patch review -> resolved |
2010-07-14 16:45:22 | rnk | set | messages: + msg110298 |
2010-07-14 12:11:47 | pitrou | set | stage: patch review versions: - Python 2.7 |
2010-07-14 12:08:10 | pitrou | set | nosy: + collinwinter, jyasskin messages: + msg110272 |
2010-07-14 04:56:43 | rnk | set | messages: + msg110253 |
2009-11-25 03:27:40 | benjamin.peterson | set | messages: + msg95710 |
2009-11-25 02:21:16 | rnk | set | messages: + msg95707 |
2009-11-24 17:15:04 | rnk | set | nosy: + rnk |
2009-05-16 14:25:05 | pitrou | set | messages: + msg87896 |
2009-05-16 12:49:25 | benjamin.peterson | set | messages: + msg87889 |
2009-05-16 09:13:54 | pitrou | set | nosy: + pitrou messages: + msg87870 |
2009-05-16 01:43:58 | benjamin.peterson | create |