Speedup types.coroutine() #68513
Comments
Attached patch provides an implementation (part of it) of types.coroutine in C. The problem with the current pure Python implementation is that it copies the code object of the generator function, which is a small overhead during import. I'm not sure if this should be merged in 3.5 at all. Please take a look at the patch. |
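To illustrate the overhead described above, here is a minimal Python sketch of the "copy the code object to set one flag" approach. It is not the patch and not the exact stdlib code: `mark_iterable_coroutine` and `ticker` are hypothetical names, and the sketch assumes Python 3.8 or later for `code.replace()` (the 3.5-era implementation had to call the `CodeType` constructor instead).

```python
import inspect


def mark_iterable_coroutine(func):
    """Hypothetical helper: flag a generator function as an iterable coroutine."""
    code = func.__code__
    already_flagged = inspect.CO_COROUTINE | inspect.CO_ITERABLE_COROUTINE
    if code.co_flags & already_flagged:
        return func  # nothing to do
    if not code.co_flags & inspect.CO_GENERATOR:
        raise TypeError("expected a generator function")
    # Code objects are immutable, so setting a single bit means building
    # a full copy and rebinding func.__code__ to it.
    func.__code__ = code.replace(
        co_flags=code.co_flags | inspect.CO_ITERABLE_COROUTINE
    )
    return func


def ticker():
    yield 1


original_code = ticker.__code__
mark_iterable_coroutine(ticker)
```

After the call, `ticker.__code__` is a different object from `original_code`; a C-level helper could in principle set the bit in place without the copy.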
Attached is the second iteration of the patch. Now, besides just speeding up types.coroutine() for pure Python generator functions, it also provides a better wrapper around generator-like objects. |
This looks big and complicated. I'd prefer this skipped 3.5 and just went into 3.6. |
I would still like to see a patch in Py3.5 that only flips the bit in C when "_types" is available and otherwise falls back to the existing Python code that creates a new code object. This part is much cleaner and faster to do in C than in the current Python implementation. |
Larry, can you accept the first version of the patch (only the function that patches the code object's co_flags) after beta2? I'm OK if you think it should only be committed in 3.6, but I also agree with Stefan that using C is better in this particular case. |
Since it's a speedup it could also go into 3.5.1.
|
+1 from me for merging this for 3.5.0 and deferring bpo-24468 (which now proposes making _opcode a builtin module to allow compiler constants to be easily shared between C code and Python code) to 3.6 instead. The design changes to address bpo-24400 cleaned up various aspects of both the internal architecture and the public data model of PEP-492, and the latest draft of this patch benefits accordingly. |
A rebased version of the patch is attached (now a "review" link should appear). Nick, Stefan, please take a look. Larry, can we merge this in 3.5.0? I've invested a lot of time to have 100% test coverage; the test suite is very elaborate now. There seem to be no refleaks either. I think this change is a very natural continuation of what we've done in bpo-24400. |
New changeset eb6fb8e2f995 by Yury Selivanov in branch '3.5': New changeset 7a2a79362bbe by Yury Selivanov in branch 'default': |
New changeset 9aee273bf8b7 by Yury Selivanov in branch '3.5': New changeset fa097a336079 by Yury Selivanov in branch 'default': |
I've committed the new unittests from this patch (as they are applicable to the pure Python implementation of the wrapper too). The patch now contains only the C implementation of the wrapper and should be ready to be committed. Larry? |
Help me to understand here. You want to check in a patch adding 300 new lines of C code to the types module during beta, for a speed optimization, after we've already hit beta? While I like speedups as much as the next guy, I would be happier if this waited for 3.6. Of course, if Guido is overruling me, then Guido is overruling me and you get to check it in. |
Oh, wait, I was confusing myself. This is that new module you guys created for type hints, and this is a new object with no installed base. (Right?) Yeah, you can check it in for 3.5. |
No, you were right in your previous comment...
This speedup will mostly affect code compiled with Cython. See the following example: @asyncio.coroutine Cython will compile "coro" into a function that returns a generator-like object. "asyncio.coroutine" will wrap this function and, therefore, the generator-like object optimized by Cython will be wrapped too (to provide an __await__ method). This patch provides a faster wrapper for such generator-like objects. Since the whole point of using Cython is to squeeze out as much performance as possible, I think it's essential to have this optimization in 3.5 (or at least in 3.5.1 as Guido suggested). It's a lot of C code, I agree. I can only say that I did my best to write very extensive unittests, and so I hope that it won't cause any trouble. |
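The wrapping described in the comment above can be sketched in pure Python as follows. This is a simplified illustration, not the C wrapper from the patch: `GeneratorLikeWrapper` and `YieldOnce` are hypothetical names, and `YieldOnce` merely stands in for the kind of generator-like object a compiled function might return.

```python
class YieldOnce:
    """Hypothetical generator-like object (not a real generator)."""

    def __init__(self):
        self._done = False

    def __iter__(self):
        return self

    def __next__(self):
        return self.send(None)

    def send(self, value):
        if self._done:
            raise StopIteration("result")  # final value of the "generator"
        self._done = True
        return "suspended"

    def throw(self, typ, val=None, tb=None):
        raise typ

    def close(self):
        pass


class GeneratorLikeWrapper:
    """Forwards the generator protocol and adds __await__."""

    def __init__(self, gen):
        self._gen = gen

    def send(self, value):
        return self._gen.send(value)

    def throw(self, *exc_info):
        return self._gen.throw(*exc_info)

    def close(self):
        return self._gen.close()

    def __next__(self):
        return next(self._gen)

    def __iter__(self):
        return self

    # Returning an iterator from __await__ is what makes the object awaitable.
    __await__ = __iter__


async def main():
    return await GeneratorLikeWrapper(YieldOnce())
```

Every `send()` on the outer coroutine passes through the wrapper's forwarding methods, which is the per-step cost a C implementation of the wrapper would aim to reduce.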
This is not purely about speeding up the code. It's also about avoiding replacing the code object of a function, which is essentially a big and clumsy hack only to achieve setting a flag. Some tools, namely line_profiler, use the current code object as a dict key for state keeping. Replacing the reference in "__code__" might confuse them if they happened to catch a reference before. That's why I asked for applying at least the simple patch that sets the flag with a C level helper function. But I'd be ok with applying the latest patch as it is. The non-flag-setting parts are simple and a straightforward translation of the current Python code. |
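The dict-key concern raised above can be reproduced with a small sketch. The `stats` dictionary is a hypothetical stand-in for a tool's per-code-object state (it does not reflect line_profiler's actual internals), and the `code.replace()` call, which assumes Python 3.8 or later, stands in for what the decorator does to the function.

```python
import inspect


def traced():
    yield


# Hypothetical tool state, keyed by the code object seen at registration time.
stats = {traced.__code__: 0}

# Stand-in for the decorator: rebind __code__ to a copy with one extra flag.
traced.__code__ = traced.__code__.replace(
    co_flags=traced.__code__.co_flags | inspect.CO_ITERABLE_COROUTINE
)

# The function's current code object no longer matches the stored key.
lookup_hit = traced.__code__ in stats
```

Here `lookup_hit` ends up false because the new code object differs from the old one in `co_flags`, so a tool that captured the earlier reference would no longer find its state.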
Does this have a measurable performance impact? W.r.t. profiling, the undecorated form will never be visible to any code other than the decorator, so won't show up in the profiler. |
Assuming that 1) it's the first and/or only decorator, 2) it's used to |
If types.coroutine is not the first and only decorator, then things may be even worse. Code objects are currently immutable. I think creating a copy is probably the best thing to do. |
Closing this one now--there's no point in speeding up types.coroutine anymore. |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.