This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author matthew.suozzo
Recipients matthew.suozzo
Date 2020-12-03.04:29:38
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1606969779.69.0.0674581941893.issue42551@roundup.psfhosted.org>
In-reply-to
Content
# Issue

When profiling a generator function, the initial call and all subsequent yields are aggregated into the same "ncalls" metric by cProfile.

## Example

>>> cProfile.run("""
... def foo():
...   yield 1
...   yield 2
... assert tuple(foo()) == (1, 2)
... """)
   <snip>
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      ...
        3    0.000    0.000    0.000    0.000 <string>:2(foo)
      ...

This was unexpected behavior since it *looks* like a single call from the code. This also complicates basic analysis about the frequency of a codepath's execution where a generator might yield a variable number of times depending on the input.

The profile interface can and does differentiate between call types: Normal calls and "primitive" calls, i.e. those that are not induced via recursion, are displayed as "all_calls/primitive_calls" e.g. "3/1" for a single initial calls with 2 recursed calls (comparable to the example above).

This seems like an appropriate abstraction to apply in the generator case: Each yield is better modeled as an 'interior' call to the generator, not as a call on its own.

# Possible fix

I have two ideas that seem like they might address the problem:

* Add a new PyTrace_YIELD constant (and handle it in [3])
* Track some state from the `frame->f_gen` in the ProfilerEntry (injected in [3], used in [4]) to determine whether this is the first or a subsequent call.

I've not been poking around for long so I don't have an intuition about which would be less invasive (nor really whether either would work in practice).

As background, this seems to be the call chain from trace invocation to callcount increment:
[0]: https://github.com/python/cpython/blob/4e7a69bdb63a104587759d7784124492dcdd496e/Python/ceval.c#L4106-L4107
[1]: https://github.com/python/cpython/blob/4e7a69bdb63a104587759d7784124492dcdd496e/Python/ceval.c#L4937
[2]: https://github.com/python/cpython/blob/4e7a69bdb63a104587759d7784124492dcdd496e/Python/ceval.c#L4961
[3]: https://github.com/python/cpython/blob/4e7a69bdb63a104587759d7784124492dcdd496e/Modules/_lsprof.c#L419
[4]: https://github.com/python/cpython/blob/4e7a69bdb63a104587759d7784124492dcdd496e/Modules/_lsprof.c#L389
[5]: https://github.com/python/cpython/blob/4e7a69bdb63a104587759d7784124492dcdd496e/Modules/_lsprof.c#L311-L316
History
Date User Action Args
2020-12-03 04:29:39matthew.suozzosetrecipients: + matthew.suozzo
2020-12-03 04:29:39matthew.suozzosetmessageid: <1606969779.69.0.0674581941893.issue42551@roundup.psfhosted.org>
2020-12-03 04:29:39matthew.suozzolinkissue42551 messages
2020-12-03 04:29:38matthew.suozzocreate