Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expose called function on frame object #57066

Open
ericsnowcurrently opened this issue Aug 30, 2011 · 19 comments
Open

Expose called function on frame object #57066

ericsnowcurrently opened this issue Aug 30, 2011 · 19 comments
Assignees
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement

Comments

@ericsnowcurrently
Copy link
Member

BPO 12857
Nosy @jcea, @ncoghlan, @benjaminp, @njsmith, @meadori, @durban, @markshannon, @ericsnowcurrently, @1st1, @jstasiak, @DimitrisJim
Files
  • called_function.diff
  • called_function_2.diff
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/ericsnowcurrently'
    closed_at = None
    created_at = <Date 2011-08-30.05:54:27.258>
    labels = ['interpreter-core', 'type-feature']
    title = 'Expose called function on frame object'
    updated_at = <Date 2019-06-14.19:51:36.791>
    user = 'https://github.com/ericsnowcurrently'

    bugs.python.org fields:

    activity = <Date 2019-06-14.19:51:36.791>
    actor = 'zach.ware'
    assignee = 'eric.snow'
    closed = False
    closed_date = None
    closer = None
    components = ['Interpreter Core']
    creation = <Date 2011-08-30.05:54:27.258>
    creator = 'eric.snow'
    dependencies = []
    files = ['23071', '23149']
    hgrepos = []
    issue_num = 12857
    keywords = ['patch']
    message_count = 19.0
    messages = ['143201', '143248', '143249', '144008', '144064', '150280', '150282', '156961', '156962', '222158', '279846', '294295', '294412', '294430', '294431', '294432', '294433', '294436', '294475']
    nosy_count = 14.0
    nosy_names = ['jcea', 'ncoghlan', 'benjamin.peterson', 'Arfrever', 'njs', 'cvrebert', 'meador.inge', 'daniel.urban', 'Mark.Shannon', 'eric.snow', 'yselivanov', 'jstasiak', 'Jim Fasarakis-Hilliard', 'kevlevrone']
    pr_nums = []
    priority = 'normal'
    resolution = None
    stage = None
    status = 'open'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue12857'
    versions = ['Python 3.5']

    @ericsnowcurrently
    Copy link
    Member Author

    This patch adds f_func to PyFrameObject and sets it for functions that get called (in PyFrame_New). For classes and modules it is set to None. The difference in performance was not noticable, as far as I could tell. However, I am willing to do more than just time 'make test' a few times if there is any concern.

    A couple weeks ago a thread on python-ideas centered on the subject matter of PEP-3130[1]. The discussion started with, and mainly involved, the idea of adding __function__ to the frame locals during execution of the function. __function__ would point to the function that was called, which had resulted in the frame.

    I spent quite a bit of time getting this to work using a closure, but the result was overkill. It also made it too easy to use __function__ for recursion, which Guido did not like (and I agree).

    At this point it dawned on me that it would be much simpler to just add the called function to the frame object. This patch is the result. In the end it is much more efficient than the locals approach I had been taking.

    [1] http://mail.python.org/pipermail/python-ideas/2011-August/011062.html

    @ericsnowcurrently ericsnowcurrently added interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement labels Aug 30, 2011
    @ericsnowcurrently
    Copy link
    Member Author

    Thanks for the review, Nick. I'll be uploading a new patch in a couple hours with your recommended fixes.

    Regarding the comments on python-ideas, would it be better to use a weakref proxy around the function, to help with the reference cycles? That's what I was doing with my closure-based solution. I didn't do it here just to see what would happen and I didn't see any problems in my very limited testing (basically just 'make test'). I don't mind using weakrefs and, if it matters, I could pre-allocate the weakref proxy in PyFunction_New to save a little overhead at each call.

    For the moment I left in the code to limit f_func to only functions. I'll respond to that on python-ideas.

    Finally, how does this patch relate to the ABI? I'm not too familiar with it (read through PEP-384) and want to make sure I'm okay here.

    @ericsnowcurrently
    Copy link
    Member Author

    On second thought, I probably won't be able to get an updated patch tonight. I need to mull over the PyEval_EvalFunction implementation and the interaction with fast_function.

    @ericsnowcurrently
    Copy link
    Member Author

    Finally had a chance to get back to this. Here's a new patch in response to Nick's review.

    My only concern is the new _PyEval_EvalFunctionCode function. It is basically the existing PyEval_EvalCodeEx function with an extra parameter. I've turned PyEval_EvalCodeEx() into a very light wrapper around _PyEval_EvalFunctionCode(). This is how I tackled the backwards incompatibility of my previous patch. However, I am nervous about messing around with something like PyEval_EvalCodeEx().

    I was toying with the idea of doing a more comprehensive refactor involving call_function, fast_function, and do_call, which seems like it would be a better fix. However, I'm even more reticent to make that scale of changes, especially on something so critical, even if it seems right to me currently.

    I figured the safe thing for now would be to name the new function with a leading underscore to mark it as private.

    Here are the results from my cursory check before and after my latest patch:

    BEFORE
    3 tests failed:
    test_distutils test_packaging test_socket
    [293234 refs]
    real 14m40.578s
    user 11m43.547s
    sys 0m52.096s

    AFTER
    3 tests failed:
    test_distutils test_packaging test_socket
    [293171 refs]
    real 14m33.785s
    user 11m55.437s
    sys 0m53.850s

    @ericsnowcurrently
    Copy link
    Member Author

    Nick, does thing look better?

    @benjaminp
    Copy link
    Contributor

    What would you use the functionality provided by this patch for?

    @ericsnowcurrently
    Copy link
    Member Author

    My response to a similar query:

    What it enables is the ability to introspect the *actual* function object belonging to the currently executing code, whether in the same execution frame or in a later one on the stack. We don't have that now. The best we can do is guess, and I'm trying to refuse the temptation by building in an alternative that doesn't get in the way (and doesn't provide an attractive nuisance).

    My motivation for this patch was to be able to find out, in one of my functions, what function called it and to interact with that function object. The name or file represented in the code object on the stack didn't cut it; and decorators and function factories make the whole thing a mess. Guido sounded rather positive to the idea, which is why I pursued it. From Guido's post to which I linked in the issue[1]:

    > For me the use case involves determining what function called my
    > function. Currently you can tell in which execution frame a function
    > was called, and thereby which code object, but reliably matching that
    > to a function is not so simple. I recognize that my case is likely
    > not a general one.

    But it is a nice one. It solves some issues that pdb currently solves
    by just using a file/line reference.

    Not that it necessarily matters, but I could dig up the bit of code that motivated the whole thing. I recall it was something related to writing an enum library (a la flufl.enum). Anyway, I'd love to see this move forward.

    [1] http://mail.python.org/pipermail/python-ideas/2011-August/011062.html

    @ericsnowcurrently
    Copy link
    Member Author

    It's been a while and I probably need to clean up that last patch a little. I also need to address the questions of:

    1. f_func for class/module execution
    2. thread-safety

    see:
    http://mail.python.org/pipermail/python-ideas/2011-August/011321.html
    and
    http://mail.python.org/pipermail/python-ideas/2011-August/011064.html

    @ericsnowcurrently
    Copy link
    Member Author

    for reference, a related issue:

    http://bugs.python.org/issue13672

    @ericsnowcurrently ericsnowcurrently self-assigned this Jun 25, 2013
    @BreamoreBoy
    Copy link
    Mannequin

    BreamoreBoy mannequin commented Jul 3, 2014

    Note that bpo-13672 referred to in msg156962 refers to bpo-13855.

    @ncoghlan
    Copy link
    Contributor

    ncoghlan commented Nov 1, 2016

    This topic came up in a discussion I was having with Yury, and thanks to generators and coroutines, I don't think "f_func" would be the right name for an attribute like this, with something more neutral like "f_origin" potentially being preferable

    The specific runtime debugging use case I have in mind is this:

    1. Given a frame reference, navigate to the generator-iterator or coroutine that spawned that frame
    2. Given gi_back and cr_back doubly-linked counterparts to the current gi_yieldfrom and cr_await introspection attributes on generators and coroutines, expose the relevant generator and/or coroutine stack in addition to the frame stack

    These secondary stacks could then potentially be incorporated into traceback outputs by default

    @njsmith
    Copy link
    Contributor

    njsmith commented May 23, 2017

    I'd also like to make use of this in trio, as a way to get safer and less complicated control-C handling without having to implement things in C. (Exhaustive discussion: https://vorpus.org/blog/control-c-handling-in-python-and-trio/)

    @nick: I understand your proposal is to add a field on the frame that for regular function calls points to the function object, and for generator/coroutines points to the generator/coroutine object. Is that right? Two possible concerns there: (a) this creates a reference cycle, because generator/coroutine objects own a reference to the frame object, (b) AFAICT you also currently can't get from a generator/coroutine object back to the function object, so this would break some (all?) of the use cases for this. I guess the solution to (a) is to make it a weak reference, and for (b) either keep f_func as always pointing to the function and make f_generator a separate field, or else make it f_origin and also add gi_func/co_func fields to generator/coroutine objects. (Which might be useful in any case.)

    Also, I'm not quite following the proposed use case. When a generator/coroutine is resumed, then send() steps through the whole yield stack and reconstructs the frame stack so it matches the generator/coroutine stack. (throw() currently doesn't do this, but that's a bug that breaks profilers/debuggers/etc - bpo-29590 - so we should fix it anyway, and I think Yury is planning to do that soon.) So if you get a frame from sys._getframe() or similar, then the stack is correct. And if a coroutine/generator isn't running, then the only way to get to the frame is by starting with the coroutine/generator object, so we don't really need a way to get back.

    Tracebacks are a different case again because they continue to hold frame references after the stack is unwound, but generally the traceback already has the correct stack because the exception propagates up the coroutine/generator stack, and also aren't the {gi,co}_frame references cleared as the exception propagates anyway?

    @markshannon
    Copy link
    Member

    I'm not too keen on this.

    Which frame is being executed is an implementation detail.
    For example, we currently push a new frame for list comprehensions, but that is an implementation detail. The language only specifies that list-comps execute in a new scope.

    I'd like to fix handling of ctrl-C, that Nathaniel mentions, properly in the compiler/interpreter.

    @njsmith
    Copy link
    Contributor

    njsmith commented May 25, 2017

    Certainly which frame is being executed is an implementation detail, and I can see an argument from that that we shouldn't have a frame introspection API at all... but we do have one, and it has some pretty important use cases, like traceback printing. Obviously most people shouldn't be touching this API at all, but if you do need it then it IMO should be as useful as possible.

    I don't see a better way to handle my (admittedly a bit weird) control-C case within the interpreter; I'd be interested to hear what you have in mind. But even if there is a nice way to fix control-C, then the proposal here would still be useful for e.g. giving better tracebacks.

    @1st1
    Copy link
    Member

    1st1 commented May 25, 2017

    I don't see a better way to handle my (admittedly a bit weird) control-C case within the interpreter

    I'm not sure I understand how f_func would help to better handle Control-C in Trio. Nathaniel, could you please elaborate on that?

    @njsmith
    Copy link
    Contributor

    njsmith commented May 25, 2017

    I'm not sure I understand how f_func would help to better handle Control-C in Trio. Nathaniel, could you please elaborate on that?

    Sure. The issue is that I need to mark certain frames as "protected" from KeyboardInterrupt, in a way that my signal handler can see when walking the frame stack, so it can decide whether to raise a KeyboardInterrupt immediately or to wait until a safer moment. Right now, this is done by injecting a magic local variable into the frame object, so the code is effectively like:

    def a_protected_function():
        _trio_keyboard_interrupt_protection_magic_local_ = True
        ... actual function ...

    But this has two problems: (1) it's not atomic: a signal could arrive in between when the frame is created and when the magic local gets set. (This is especially a problem for __(a)exit__ methods -- if bpo-29988 is fixed then it will guarantee that __(a)exit__ gets called, but this isn't much help if __(a)exit__ can instantly abort without doing any actual cleanup :-).) (2) The decorator code for setting this magic local ends up having to special-case functions/generators/async functions/async generators, so it's very complex and fragile: https://github.com/python-trio/trio/blob/1586818fbdd5c3468e15b64816db9257adae49c1/trio/_core/_ki.py#L109-L149
    (This is possibly the function that took the most tries to get right in all of trio, because there are serious subtleties and interactions with multiple CPython bugs.)

    I suspect the way I'm currently munging frames at runtime also annoys PyPy's JIT.

    OTOH if frames had f_func, then instead of having to munge frame.f_locals, I could set a magic attribute on the *functions* that are supposed to transition between protected/unprotected mode. Then I could delete all that nasty code and replace it with something like:

    def enable_ki_protection(func):
        func._trio_keyboard_interrupt_protection_enabled = True
        return func

    and in addition to being simpler, this would fix the race condition and improve performance too. (Note that now this decorator only runs at function definition time; I wouldn't have to do any tricky stuff at all during regular execution.)

    @1st1
    Copy link
    Member

    1st1 commented May 25, 2017

    Sure. The issue is that I need to mark certain frames as "protected" from KeyboardInterrupt, in a way that my signal handler can see when walking the frame stack, so it can decide whether to raise a KeyboardInterrupt immediately or to wait until a safer moment.

    OK, I got it, thanks!

    I suspect the way I'm currently munging frames at runtime also annoys PyPy's JIT.

    Yes, whenever you touch frames you're disabling the JIT for the call site (and maybe for more call sites up the stack, idk). So it doesn't matter what you use, f_func or f_locals, the performance will suffer big time. Is that acceptable for Trio?

    @njsmith
    Copy link
    Contributor

    njsmith commented May 25, 2017

    Yes, whenever you touch frames you're disabling the JIT for the call site (and maybe for more call sites up the stack, idk). So it doesn't matter what you use, f_func or f_locals, the performance will suffer big time. Is that acceptable for Trio?

    Ah, but that's the trick: with f_func the only time I have to touch frames is from my SIGINT handler, and its fine if *that* drops us out of the JIT (it probably has to in any case). The rest of the time, the interpreter would be automatically tracking all the state I need, so it's effectively free, and PyPy can do all its normal magic. The problem with using locals() / f_locals for this is that I have to constantly touch them during normal execution.

    (Probably we shouldn't make a huge deal out of the PyPy case though – for me the main advantages of f_func are the simplicity and atomicity. But FWIW I also just pinged Armin on #pypy and he said that PyPy wouldn't have a problem supporting f_func.)

    @ncoghlan
    Copy link
    Contributor

    Regarding Mark's point about the anonymous scopes created for things like list comprehensions, at least in CPython, those are actually full functions with generated names like "<listcomp>" (at module level) or "f.<locals>.<listcomp>" (at function level).

    From the point of view of code consuming the frame stack, they'd look pretty similar to a lambda expression, only with a different generated name.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    6 participants