Message338797
Somehow I missed that there's been an open issue on this. Like others I've written a bunch of different incarnations of an "elapsed" context manager over the years. Always for the more crude "how long did this take" reason like David mentioned, never the microbenchmarks scenario that timeit serves. The work is never quite substantial enough to make a Pypi package of it, but always annoying to have to do again the next time the need arises.
The overall least fiddly scheme I've found is to just use a callback. It's the simplest option:
@contextmanager
def elapsed(cb: Callable[[float, float], None], counter=time.perf_counter):
t0 = counter()
try:
yield
finally:
t1 = counter()
cb(t0, t1)
The simple case, the one most people want when they just want a quick check of elapsed time on a chunk of code, is then quite easy:
with elapsed(lambda t0, t1: print(f'read_excel: {t1 - t0:.2f} s')):
# very large spreadsheet
df = pandas.read_excel(filename, dtype=str)
(I rarely need to use a timer decorator for functions, because the profiler tracks function calls. But within the scope of a function it can sometimes be difficult to get timing information, particularly if the calls made there are into native extensions)
One of the consequences of using a callback strategy is that an additional version might be required for async callbacks (I have used these in production also):
@asynccontextmanager
async def aioelapsed(acb: Callable[[float, float], Awaitable[None]],
counter=time.perf_counter):
t0 = counter()
try:
yield
finally:
t1 = counter()
await acb(t0, t1)
So, the interesting thing here is that there is a general form for which an "elapsed" function is just a special case:
T = TypeVar('T')
@contextmanager
def sample_before_and_after(cb: Callable[[T, T], None], sample: Callable[[], T]):
before = sample()
try:
yield
finally:
after = sample()
cb(before, after)
The version of "elapsed" given further above is just this with kwarg sample=time.perf_counter. So, it might be sufficient to cover the use-case of an "elapsed" context manager instead with something like the above instead, which is more general. However, I don't actually have any use cases for this more general thing above, other than "elapsed", but I thought it was interesting.
Whether any of this merits being in the stdlib or not is hard to say. These code snippets are all short and easy to write. But I've written them multiple times to make "elapsed".
---
Once the "elapsed" abstraction is available, other cool ideas become a little bit easier to think about. These would be things that are user code (not be in the stdlib), but which can make use of the "elapsed" cm; for example, a clever logger for slow code blocks (written a few of these too):
@contextmanager
def slow_code_logging(logger_name, msg, *args, threshold_sec=1.0, **kwargs):
logger = logging.getLogger(logger_name)
if logger.isEnabledFor(logging.INFO):
def cb(t0: float, t1: float) -> None:
dt = t1 - t0
if dt < threshold_sec:
# Change the logger level depending on threshold
logger.debug(msg, dt, *args, **kwargs)
else:
logger.info(msg, dt, *args, **kwargs)
cm = elapsed(cb)
else:
# Logger is not even enabled, do nothing.
cm = nullcontext()
with cm:
yield
with slow_code_logging(__name__, 'Took longer to run than expected: %.4g s'):
...
And a super-hacky timing histogram generator (which would be quite interesting to use for measuring latency in network calls, e.g. with asyncio code):
@contextmanager
def histobuilder(counter, bin_size):
def cb(t0, t1):
dt = t1 - t0
bucket = dt - dt % bin_size
counter[bucket, bucket + bin_size] += 1
with elapsed(cb, counter=time.perf_counter_ns):
yield
counter = Counter()
for i in range(100):
with histobuilder(counter, bin_size=int(5e4)): # 50 us
time.sleep(0.01) # 10 ms
for (a, b), v in sorted(counter.items(), key=lambda _: _[0][0]):
print(f'{a/1e6:6.2f} ms - {b/1e6:>6.2f} ms: {v:4} ' + '\u2588' * v)
output:
9.85 ms - 9.90 ms: 1 █
9.90 ms - 9.95 ms: 10 ██████████
9.95 ms - 10.00 ms: 17 █████████████████
10.00 ms - 10.05 ms: 8 ████████
10.05 ms - 10.10 ms: 12 ████████████
10.10 ms - 10.15 ms: 5 █████
10.15 ms - 10.20 ms: 4 ████
10.20 ms - 10.25 ms: 4 ████
10.25 ms - 10.30 ms: 6 ██████
10.30 ms - 10.35 ms: 9 █████████
10.35 ms - 10.40 ms: 3 ███
10.40 ms - 10.45 ms: 5 █████
10.45 ms - 10.50 ms: 12 ████████████
10.50 ms - 10.55 ms: 3 ███
10.55 ms - 10.60 ms: 1 █
Things like the histogram builder above become (for me) much simpler when "elapsed" is just a thing that already exists even though it's very simple on its own.
If the right call is that we shouldn't add such a thing to the stdlib, should we instead just add an example of an elapsed context manager to the contextlib docs? Perhaps copy-paste from the docs is good enough for most users (like we do with the itertools recipes for example). I'd be willing to add something to the docs if it's decided that's the way to go.
(On the off chance some version of "elapsed" goes into the stdlib, I think contextlib is right place rather than timeit, particularly because of how the "sample_before_and_after" pattern generalises) |
|
Date |
User |
Action |
Args |
2019-03-25 13:46:36 | cjrh | set | recipients:
+ cjrh, tim.peters, georg.brandl, rhettinger, ncoghlan, giampaolo.rodola, rbcollins, ezio.melotti, steven.daprano, r.david.murray, THRlWiTi, martin.panter, dmoore, cheryl.sabella, xgdomingo |
2019-03-25 13:46:36 | cjrh | set | messageid: <1553521596.87.0.600186553687.issue19495@roundup.psfhosted.org> |
2019-03-25 13:46:36 | cjrh | link | issue19495 messages |
2019-03-25 13:46:36 | cjrh | create | |
|