Issue12857
This issue tracker has been migrated to GitHub,
and is currently read-only.
For more information,
see the GitHub FAQs in the Python's Developer Guide.
Created on 2011-08-30 05:54 by eric.snow, last changed 2022-04-11 14:57 by admin.
Files | ||||
---|---|---|---|---|
File name | Uploaded | Description | Edit | |
called_function.diff | eric.snow, 2011-08-30 05:54 | review | ||
called_function_2.diff | eric.snow, 2011-09-14 05:03 | review |
Messages (19) | |||
---|---|---|---|
msg143201 - (view) | Author: Eric Snow (eric.snow) * | Date: 2011-08-30 05:54 | |
This patch adds f_func to PyFrameObject and sets it for functions that get called (in PyFrame_New). For classes and modules it is set to None. The difference in performance was not noticable, as far as I could tell. However, I am willing to do more than just time 'make test' a few times if there is any concern. A couple weeks ago a thread on python-ideas centered on the subject matter of PEP 3130[1]. The discussion started with, and mainly involved, the idea of adding __function__ to the frame locals during execution of the function. __function__ would point to the function that was called, which had resulted in the frame. I spent quite a bit of time getting this to work using a closure, but the result was overkill. It also made it too easy to use __function__ for recursion, which Guido did not like (and I agree). At this point it dawned on me that it would be much simpler to just add the called function to the frame object. This patch is the result. In the end it is much more efficient than the locals approach I had been taking. [1] http://mail.python.org/pipermail/python-ideas/2011-August/011062.html |
|||
msg143248 - (view) | Author: Eric Snow (eric.snow) * | Date: 2011-08-31 04:43 | |
Thanks for the review, Nick. I'll be uploading a new patch in a couple hours with your recommended fixes. Regarding the comments on python-ideas, would it be better to use a weakref proxy around the function, to help with the reference cycles? That's what I was doing with my closure-based solution. I didn't do it here just to see what would happen and I didn't see any problems in my very limited testing (basically just 'make test'). I don't mind using weakrefs and, if it matters, I could pre-allocate the weakref proxy in PyFunction_New to save a little overhead at each call. For the moment I left in the code to limit f_func to only functions. I'll respond to that on python-ideas. Finally, how does this patch relate to the ABI? I'm not too familiar with it (read through PEP 384) and want to make sure I'm okay here. |
|||
msg143249 - (view) | Author: Eric Snow (eric.snow) * | Date: 2011-08-31 05:47 | |
On second thought, I probably won't be able to get an updated patch tonight. I need to mull over the PyEval_EvalFunction implementation and the interaction with fast_function. |
|||
msg144008 - (view) | Author: Eric Snow (eric.snow) * | Date: 2011-09-14 05:03 | |
Finally had a chance to get back to this. Here's a new patch in response to Nick's review. My only concern is the new _PyEval_EvalFunctionCode function. It is basically the existing PyEval_EvalCodeEx function with an extra parameter. I've turned PyEval_EvalCodeEx() into a very light wrapper around _PyEval_EvalFunctionCode(). This is how I tackled the backwards incompatibility of my previous patch. However, I am nervous about messing around with something like PyEval_EvalCodeEx(). I was toying with the idea of doing a more comprehensive refactor involving call_function, fast_function, and do_call, which seems like it would be a better fix. However, I'm even more reticent to make that scale of changes, especially on something so critical, even if it seems right to me currently. I figured the safe thing for now would be to name the new function with a leading underscore to mark it as private. Here are the results from my cursory check before and after my latest patch: BEFORE 3 tests failed: test_distutils test_packaging test_socket [293234 refs] real 14m40.578s user 11m43.547s sys 0m52.096s AFTER 3 tests failed: test_distutils test_packaging test_socket [293171 refs] real 14m33.785s user 11m55.437s sys 0m53.850s |
|||
msg144064 - (view) | Author: Eric Snow (eric.snow) * | Date: 2011-09-15 05:38 | |
Nick, does thing look better? |
|||
msg150280 - (view) | Author: Benjamin Peterson (benjamin.peterson) * | Date: 2011-12-28 04:23 | |
What would you use the functionality provided by this patch for? |
|||
msg150282 - (view) | Author: Eric Snow (eric.snow) * | Date: 2011-12-28 06:30 | |
My response to a similar query: What it enables is the ability to introspect the *actual* function object belonging to the currently executing code, whether in the same execution frame or in a later one on the stack. We don't have that now. The best we can do is guess, and I'm trying to refuse the temptation by building in an alternative that doesn't get in the way (and doesn't provide an attractive nuisance). My motivation for this patch was to be able to find out, in one of my functions, what function called it and to interact with that function object. The name or file represented in the code object on the stack didn't cut it; and decorators and function factories make the whole thing a mess. Guido sounded rather positive to the idea, which is why I pursued it. From Guido's post to which I linked in the issue[1]: > For me the use case involves determining what function called my > function. Currently you can tell in which execution frame a function > was called, and thereby which code object, but reliably matching that > to a function is not so simple. I recognize that my case is likely > not a general one. But it is a nice one. It solves some issues that pdb currently solves by just using a file/line reference. Not that it necessarily matters, but I could dig up the bit of code that motivated the whole thing. I recall it was something related to writing an enum library (a la flufl.enum). Anyway, I'd love to see this move forward. [1] http://mail.python.org/pipermail/python-ideas/2011-August/011062.html |
|||
msg156961 - (view) | Author: Eric Snow (eric.snow) * | Date: 2012-03-28 01:36 | |
It's been a while and I probably need to clean up that last patch a little. I also need to address the questions of: 1. f_func for class/module execution 2. thread-safety see: http://mail.python.org/pipermail/python-ideas/2011-August/011321.html and http://mail.python.org/pipermail/python-ideas/2011-August/011064.html |
|||
msg156962 - (view) | Author: Eric Snow (eric.snow) * | Date: 2012-03-28 01:48 | |
for reference, a related issue: http://bugs.python.org/issue13672 |
|||
msg222158 - (view) | Author: Mark Lawrence (BreamoreBoy) * | Date: 2014-07-03 08:28 | |
Note that #13672 referred to in msg156962 refers to #13855. |
|||
msg279846 - (view) | Author: Nick Coghlan (ncoghlan) * | Date: 2016-11-01 01:29 | |
This topic came up in a discussion I was having with Yury, and thanks to generators and coroutines, I don't think "f_func" would be the right name for an attribute like this, with something more neutral like "f_origin" potentially being preferable The specific runtime debugging use case I have in mind is this: 1. Given a frame reference, navigate to the generator-iterator or coroutine that spawned that frame 2. Given gi_back and cr_back doubly-linked counterparts to the current gi_yieldfrom and cr_await introspection attributes on generators and coroutines, expose the relevant generator and/or coroutine stack in addition to the frame stack These secondary stacks could then potentially be incorporated into traceback outputs by default |
|||
msg294295 - (view) | Author: Nathaniel Smith (njs) * | Date: 2017-05-23 23:26 | |
I'd also like to make use of this in trio, as a way to get safer and less complicated control-C handling without having to implement things in C. (Exhaustive discussion: https://vorpus.org/blog/control-c-handling-in-python-and-trio/) @Nick: I understand your proposal is to add a field on the frame that for regular function calls points to the function object, and for generator/coroutines points to the generator/coroutine object. Is that right? Two possible concerns there: (a) this creates a reference cycle, because generator/coroutine objects own a reference to the frame object, (b) AFAICT you also currently can't get from a generator/coroutine object back to the function object, so this would break some (all?) of the use cases for this. I guess the solution to (a) is to make it a weak reference, and for (b) either keep f_func as always pointing to the function and make f_generator a separate field, or else make it f_origin and also add gi_func/co_func fields to generator/coroutine objects. (Which might be useful in any case.) Also, I'm not quite following the proposed use case. When a generator/coroutine is resumed, then send() steps through the whole yield stack and reconstructs the frame stack so it matches the generator/coroutine stack. (throw() currently doesn't do this, but that's a bug that breaks profilers/debuggers/etc - bpo-29590 - so we should fix it anyway, and I think Yury is planning to do that soon.) So if you get a frame from sys._getframe() or similar, then the stack is correct. And if a coroutine/generator isn't running, then the only way to get to the frame is by starting with the coroutine/generator object, so we don't really need a way to get back. Tracebacks are a different case again because they continue to hold frame references after the stack is unwound, but generally the traceback already has the correct stack because the exception propagates up the coroutine/generator stack, and also aren't the {gi,co}_frame references cleared as the exception propagates anyway? |
|||
msg294412 - (view) | Author: Mark Shannon (Mark.Shannon) * | Date: 2017-05-24 23:07 | |
I'm not too keen on this. Which frame is being executed is an implementation detail. For example, we currently push a new frame for list comprehensions, but that is an implementation detail. The language only specifies that list-comps execute in a new scope. I'd like to fix handling of ctrl-C, that Nathaniel mentions, properly in the compiler/interpreter. |
|||
msg294430 - (view) | Author: Nathaniel Smith (njs) * | Date: 2017-05-25 04:34 | |
Certainly which frame is being executed is an implementation detail, and I can see an argument from that that we shouldn't have a frame introspection API at all... but we do have one, and it has some pretty important use cases, like traceback printing. Obviously most people shouldn't be touching this API at all, but if you do need it then it IMO should be as useful as possible. I don't see a better way to handle my (admittedly a bit weird) control-C case within the interpreter; I'd be interested to hear what you have in mind. But even if there is a nice way to fix control-C, then the proposal here would still be useful for e.g. giving better tracebacks. |
|||
msg294431 - (view) | Author: Yury Selivanov (yselivanov) * | Date: 2017-05-25 04:39 | |
> I don't see a better way to handle my (admittedly a bit weird) control-C case within the interpreter I'm not sure I understand how `f_func` would help to better handle Control-C in Trio. Nathaniel, could you please elaborate on that? |
|||
msg294432 - (view) | Author: Nathaniel Smith (njs) * | Date: 2017-05-25 05:11 | |
> I'm not sure I understand how `f_func` would help to better handle Control-C in Trio. Nathaniel, could you please elaborate on that? Sure. The issue is that I need to mark certain frames as "protected" from KeyboardInterrupt, in a way that my signal handler can see when walking the frame stack, so it can decide whether to raise a KeyboardInterrupt immediately or to wait until a safer moment. Right now, this is done by injecting a magic local variable into the frame object, so the code is effectively like: def a_protected_function(): _trio_keyboard_interrupt_protection_magic_local_ = True ... actual function ... But this has two problems: (1) it's not atomic: a signal could arrive in between when the frame is created and when the magic local gets set. (This is especially a problem for __(a)exit__ methods -- if bpo-29988 is fixed then it will guarantee that __(a)exit__ gets called, but this isn't much help if __(a)exit__ can instantly abort without doing any actual cleanup :-).) (2) The decorator code for setting this magic local ends up having to special-case functions/generators/async functions/async generators, so it's very complex and fragile: https://github.com/python-trio/trio/blob/1586818fbdd5c3468e15b64816db9257adae49c1/trio/_core/_ki.py#L109-L149 (This is possibly the function that took the most tries to get right in all of trio, because there are serious subtleties and interactions with multiple CPython bugs.) I suspect the way I'm currently munging frames at runtime also annoys PyPy's JIT. OTOH if frames had f_func, then instead of having to munge frame.f_locals, I could set a magic attribute on the *functions* that are supposed to transition between protected/unprotected mode. Then I could delete all that nasty code and replace it with something like: def enable_ki_protection(func): func._trio_keyboard_interrupt_protection_enabled = True return func and in addition to being simpler, this would fix the race condition and improve performance too. (Note that now this decorator only runs at function definition time; I wouldn't have to do any tricky stuff at all during regular execution.) |
|||
msg294433 - (view) | Author: Yury Selivanov (yselivanov) * | Date: 2017-05-25 05:17 | |
> Sure. The issue is that I need to mark certain frames as "protected" from KeyboardInterrupt, in a way that my signal handler can see when walking the frame stack, so it can decide whether to raise a KeyboardInterrupt immediately or to wait until a safer moment. OK, I got it, thanks! > I suspect the way I'm currently munging frames at runtime also annoys PyPy's JIT. Yes, whenever you touch frames you're disabling the JIT for the call site (and maybe for more call sites up the stack, idk). So it doesn't matter what you use, `f_func` or `f_locals`, the performance will suffer big time. Is that acceptable for Trio? |
|||
msg294436 - (view) | Author: Nathaniel Smith (njs) * | Date: 2017-05-25 06:10 | |
> Yes, whenever you touch frames you're disabling the JIT for the call site (and maybe for more call sites up the stack, idk). So it doesn't matter what you use, `f_func` or `f_locals`, the performance will suffer big time. Is that acceptable for Trio? Ah, but that's the trick: with f_func the only time I have to touch frames is from my SIGINT handler, and its fine if *that* drops us out of the JIT (it probably has to in any case). The rest of the time, the interpreter would be automatically tracking all the state I need, so it's effectively free, and PyPy can do all its normal magic. The problem with using locals() / f_locals for this is that I have to constantly touch them during normal execution. (Probably we shouldn't make a huge deal out of the PyPy case though – for me the main advantages of f_func are the simplicity and atomicity. But FWIW I also just pinged Armin on #pypy and he said that PyPy wouldn't have a problem supporting f_func.) |
|||
msg294475 - (view) | Author: Nick Coghlan (ncoghlan) * | Date: 2017-05-25 12:39 | |
Regarding Mark's point about the anonymous scopes created for things like list comprehensions, at least in CPython, those are actually full functions with generated names like "<listcomp>" (at module level) or "f.<locals>.<listcomp>" (at function level). From the point of view of code consuming the frame stack, they'd look pretty similar to a lambda expression, only with a different generated name. |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:57:21 | admin | set | github: 57066 |
2019-06-14 19:51:36 | zach.ware | set | messages: - msg345592 |
2019-06-14 14:43:48 | eric.snow | set | files: - 122.pdf |
2019-06-14 12:43:45 | kevlevrone | set | files:
+ 122.pdf nosy: + kevlevrone messages: + msg345592 |
2017-07-18 08:22:42 | jstasiak | set | nosy:
+ jstasiak |
2017-05-25 12:39:19 | ncoghlan | set | messages: + msg294475 |
2017-05-25 06:10:26 | njs | set | messages: + msg294436 |
2017-05-25 05:17:03 | yselivanov | set | messages: + msg294433 |
2017-05-25 05:11:12 | njs | set | messages: + msg294432 |
2017-05-25 04:39:51 | yselivanov | set | messages: + msg294431 |
2017-05-25 04:34:12 | njs | set | messages: + msg294430 |
2017-05-24 23:07:18 | Mark.Shannon | set | nosy:
+ Mark.Shannon messages: + msg294412 |
2017-05-23 23:26:15 | njs | set | nosy:
+ njs messages: + msg294295 |
2017-05-04 15:40:24 | Jim Fasarakis-Hilliard | set | nosy:
+ Jim Fasarakis-Hilliard |
2016-11-01 02:45:26 | BreamoreBoy | set | nosy:
- BreamoreBoy |
2016-11-01 01:29:20 | ncoghlan | set | nosy:
+ yselivanov messages: + msg279846 |
2014-07-03 08:28:15 | BreamoreBoy | set | nosy:
+ BreamoreBoy messages: + msg222158 versions: + Python 3.5, - Python 3.3 |
2013-06-25 05:29:09 | eric.snow | set | assignee: eric.snow |
2012-03-28 01:48:31 | eric.snow | set | messages: + msg156962 |
2012-03-28 01:36:25 | eric.snow | set | messages: + msg156961 |
2011-12-28 22:37:19 | jcea | set | nosy:
+ jcea |
2011-12-28 20:35:04 | Arfrever | set | nosy:
+ Arfrever |
2011-12-28 06:30:47 | eric.snow | set | messages: + msg150282 |
2011-12-28 04:23:20 | benjamin.peterson | set | nosy:
+ benjamin.peterson messages: + msg150280 |
2011-12-27 23:34:34 | meador.inge | set | nosy:
+ meador.inge |
2011-12-27 19:02:59 | cvrebert | set | nosy:
+ cvrebert |
2011-09-15 05:38:07 | eric.snow | set | nosy:
+ ncoghlan messages: + msg144064 |
2011-09-14 05:03:21 | eric.snow | set | files:
+ called_function_2.diff messages: + msg144008 |
2011-08-31 05:47:14 | eric.snow | set | messages: + msg143249 |
2011-08-31 04:43:51 | eric.snow | set | messages: + msg143248 |
2011-08-30 09:49:03 | daniel.urban | set | nosy:
+ daniel.urban |
2011-08-30 05:54:27 | eric.snow | create |