Fast globals/builtins access (patch) #45859
Comments
I've attached a patch that reduces LOAD_GLOBAL access time by 19% for …

The approach is to cache pointers to dictionary entries rather than …

A script of microbenchmarks, fastglobals_test.py, is included in the …

Test    2.6a0 trunk    2.6a0 fastglobals    % time
…

All regressions that aren't skipped pass except test_gc. I've probably …
When I run the code of test_gc.py test_function() in a shell I'm getting:

gc: collectable <dict 0xb78aa13c>

However, the same code embedded in the test suite collects two additional …

gc: collectable <dict 0x830061c>

I've used gc.set_debug(gc.DEBUG_LEAK) to get the information …
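For reference, the companion flag gc.DEBUG_SAVEALL makes such collectable objects inspectable from Python rather than only printed to stderr. A minimal sketch of the technique (the self-referencing dict here is a stand-in for the cycles the test produces):

```python
import gc

# Make collectable garbage visible: DEBUG_SAVEALL keeps collected
# objects in gc.garbage so they can be examined.
gc.disable()                 # avoid an automatic collection mid-experiment
gc.set_debug(gc.DEBUG_SAVEALL)
gc.collect()                 # flush any pre-existing garbage
gc.garbage.clear()

def make_cycle():
    d = {}
    d["self"] = d            # a dict referencing itself: cyclic garbage

make_cycle()
n = gc.collect()             # returns the number of unreachable objects
assert n >= 1
assert any(isinstance(o, dict) for o in gc.garbage)

gc.set_debug(0)              # restore normal behavior
gc.garbage.clear()
gc.enable()
```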
In funcobject.c:PyFunction_New, the declarations of op and __name__ need …

In the test script, the use of import * generates syntax warnings that …

I see roughly the same speed ups (MSVC 7.1, Windows XP, Intel Core2 Duo …

Test    Trunk    fastglobals    Time difference
…

PyBench shows an overall slowdown - String mappings in particular, …

I notice that you create a new PyFastGlobals object in every call to …
I may have had some old code in the build or something - I did a clean …
Christian: Thanks for that, I'll have to remember DEBUG_LEAK in the …

It looks like it may be acting correctly since there *is* now a …

Chris: The build problems should be fixed in the next patch. Thanks for …

Regarding PyEval_EvalCode creating a PyFastGlobalsObject: I'm not sure …
I've attached the latest patch, which should fix the build and compile …

The patch also includes a new fastglobals_test.py, which has a few extra …

Benchmark results for both of my systems are here:
http://spreadsheets.google.com/ccc?key=pHIJrYc_pnIVGXyUxWYFkLw&hl=en_US

This covers pystones, pyfastglobals_test, and pybench. In the latter …

Pystones is significantly faster. The other two show that the operations …

I added the three new parser minibenchmarks to see what would happen to …
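The attached script is not reproduced here, but microbenchmarks of this general shape show the operation being targeted. This is a sketch using timeit; `read_global` and `read_local` are illustrative names, not functions from the attachment:

```python
import timeit

SENTINEL = 0

def read_global(n=100_000):
    # Each iteration performs a LOAD_GLOBAL -- a dict lookup in
    # globals (and possibly builtins) every time through the loop.
    for _ in range(n):
        SENTINEL

def read_local(n=100_000):
    # One LOAD_GLOBAL up front, then cheap LOAD_FAST in the loop:
    # the manual version of what the patch tries to automate.
    s = SENTINEL
    for _ in range(n):
        s

t_global = timeit.timeit(read_global, number=20)
t_local = timeit.timeit(read_local, number=20)
print(f"global: {t_global:.4f}s  local: {t_local:.4f}s")
```

Binding a global to a local before a hot loop is a long-standing manual workaround for exactly the cost this patch attacks.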
What I mean by #2 is that _PyType_Lookup (looking up a method attribute) …

Not today, though. :)
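For context, _PyType_Lookup scans the class dictionaries along a type's MRO and returns the first hit. A rough Python model of that lookup (`type_lookup` is an illustrative helper, not the C function):

```python
# Rough Python equivalent of what _PyType_Lookup does: scan each class
# dict along the method resolution order, return the first match.

def type_lookup(tp, name):
    for klass in tp.__mro__:
        if name in vars(klass):
            return vars(klass)[name]
    return None

class A:
    def f(self):
        return "A.f"

class B(A):
    pass

assert type_lookup(B, "f") is A.f          # found via the MRO, in A
assert type_lookup(B, "missing") is None   # not found anywhere
```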
The proposed approach to speeding up lookup of inherited methods is not …

class A:
    def f(x): ...
class B(A):
    pass
class C(B):
    pass

If C caches a pointer to A.f, the wrong thing will happen if B.f is …

I thought a sufficient fix would be for classes to increment not only …

class A:
    def f(x): ...
class B(A):
    pass
class C(B):
    pass
class D:
    pass
class E(D,C):
    pass

If D.f is defined dynamically, E's cached pointer to C.f will retrieve …

I haven't encountered this issue before in a system with multiple base …
Sorry, I wrote "E's cached pointer to C.f", which of course should be …
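The hazard described above is easy to reproduce with today's semantics: defining B.f after C exists must immediately change what C().f resolves to, which is exactly what a stale cached pointer to A.f would miss.

```python
# Why caching "f on C resolves to A.f" is unsafe: assigning B.f later
# must immediately change the resolution for C.

class A:
    def f(self):
        return "A.f"

class B(A):
    pass

class C(B):
    pass

assert C().f() == "A.f"      # resolves through the MRO (C, B, A) to A.f

B.f = lambda self: "B.f"     # B did not define f before; now it does

assert C().f() == "B.f"      # correct; a cache still pointing at A.f would be wrong
```

This is why CPython's method cache must be invalidated not just when a class itself changes, but when any class in (or added to) its MRO changes.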
I've gone over this at a high-level and looked for various outstanding …

I think this is very interesting and hope that we can get it working, …

In Include/fastglobals.h, num_entries says it is len(names). But …

isbuiltin is an array of integers that hold boolean values? So most …

Did you measure the difference in memory size? For example the size …

Is there ever an access to the old globals in the old code? I wonder …

Does PyFastGlobals_GET_ITEM_INPLACE really need to be a macro? It's a …

I'm thinking that entryversion should be unsigned. Signed integer …

I like the removal of one pointer from the frame object.

How does this interact with multiple interpreters? I see the static …

Regarding: …
How about the loop is limited to 10 runs or something reasonable?

Since PyFastGlobals_Update() is public, you should not assert that the …

In fg_check_builtins() you may be able to speed this up by checking …

I don't understand this comment in PyFastGlobals_CheckBuiltins: …
It says what the code does, but I don't understand *why* it does it.

Instead of:

    PyErr_NoMemory();
    Py_DECREF(newnames);
    return NULL;

You can do:

    Py_DECREF(newnames);
    return PyErr_NoMemory();

In PyFastGlobals_New(), you should verify that num_entries doesn't …

In this code:

    isbuiltin = PyMem_NEW(int, num_entries);
    if (isbuiltin == NULL) {
        PyErr_NoMemory();
        Py_DECREF(newnames);
        PyMem_FREE(isbuiltin);
        return NULL;
    }

The free of isbuiltin should be entries.

If these arrays will be small (< 256 bytes), it would be better (faster) …

PyDict_GetEntry should be prefixed with an underscore. I don't think …

This needs to be addressed: …
Making sure I look at this at least once carefully before releasing.
Ping.
Won't have time myself.
There probably isn't any point in such a complication, until perhaps we have a JIT that magnifies the improvement.
Antoine: yes, this optimization is already implemented in the Unladen Swallow JIT.
Thanks Collin, recommend closing then.
Closing as recommended in msg101239.
It sort of looks like this was closed because we assumed we were moving to Unladen Swallow. We're not. Should this be reopened?
See bpo-10401. |