classification
Title: asyncio: Optimize get_event_loop and _get_running_loop
Type: performance Stage: resolved
Components: asyncio Versions: Python 3.7, Python 3.6
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: Mariatta, jimmylai, vstinner, yselivanov
Priority: normal Keywords:

Created on 2017-09-05 17:19 by jimmylai, last changed 2017-09-06 03:06 by Mariatta. This issue is now closed.

Files
File name Uploaded Description Edit
bench_asyncio.py vstinner, 2017-09-05 18:14
bench_get_event_loop.py vstinner, 2017-09-05 18:52
Pull Requests
URL Status Linked Edit
PR 3347 merged jimmylai, 2017-09-05 17:19
PR 3373 merged python-dev, 2017-09-06 00:37
Messages (11)
msg301342 - (view) Author: Jimmy Lai (jimmylai) * Date: 2017-09-05 17:19
get_event_loop() and _get_running_loop() can be faster.

Case	Time	Mean	Improve																						
No Change	7.323 +- 0.172	7.323	0.00%																						
Remove class _RunningLoop	6.513 +- 0.115	6.513	-11.06%																						
Expand _get_running_loop() inside get_event_loop()	5.851 +- 0.160	5.851	-20.10%																						
Use Tuple instead of two attributes	6.179 +- 0.099	6.179	-15.62%																						
Tuple + Remove _RunningLoop	6.026 +- 0.123	6.026	-17.71%																						
Tuple + return ternary + Remove _RunningLoop	6.060 +- 0.111	6.06	-17.25%																						
Combine all four optimizations	4.735 +- 0.111	4.735	-35.34%																						
Remove class _RunningLoop + Use Tuple instead of two attributes	6.241 +- 0.097	6.241	-14.78%		

Experimenting with different techniques to optimize get_event_loop and _get_running_loop.																	

After discuss with Yuri, decide not to expand _get_running_loop inside get_event_loop.
Combine tuple in _running_loop and Remove _RunningLoop (just use threading.local) can achieve the best improvement: 17.71% faster.
msg301343 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-09-05 17:28
Can you please provide the code of your benchmark?
msg301344 - (view) Author: Jimmy Lai (jimmylai) * Date: 2017-09-05 17:36
Benchmark script: Run 10 times to get mean and stdev


import asyncio
import time

async def async_get_loop():
    start_time = time.time()
    for _ in range(5000000):
        asyncio.get_event_loop()
    return time.time() - start_time

loop = asyncio.get_event_loop()
results = []
for _ in range(10):
    start_time = time.time()
    result = loop.run_until_complete(async_get_loop())
    results.append(result)

import statistics
print("elapse time: %.3lf +- %.3lf secs" % (statistics.mean(results), statistics.stdev(results)))
msg301348 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-09-05 18:14
I suggest to use my perf module to run benchmark, especially if the tested function takes less than 1 ms, which is the case here.

Attached benchmark script calls asyncio.get_event_loop(). Result on the master branch with PR 3347:

haypo@selma$ ./python ~/bench_asyncio.py --inherit=PYTHONPATH -o patch.json
haypo@selma$ ./python ~/bench_asyncio.py --inherit=PYTHONPATH -o ref.json
haypo@selma$ ./python -m perf compare_to ref.json patch.json 

Mean +- std dev: [ref] 881 ns +- 42 ns -> [patch] 859 ns +- 14 ns: 1.03x faster (-3%)

I'm not convinced that the PR is worth it. 3% is not interesting on a micro benchmark.

Or is there an issue in my benchmark?
msg301356 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2017-09-05 18:27
> I'm not convinced that the PR is worth it. 3% is not interesting on a micro benchmark.

I found a small issue in the PR (left a comment in the PR).

I think using a tuple is still a good idea (even if the speedup is tiny) because logically, both attributes on that threading.local() object are always set and read at the same time.  Essentially, it's a pair of (loop, pid), so using a tuple here makes the code easier to reason about.
msg301357 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-09-05 18:28
If the motivation is correctness and not performance, please adjust the issue and PR description :-)
msg301358 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2017-09-05 18:29
> Or is there an issue in my benchmark?

Yes.  The correct benchmark would be to measure `get_event_loop` performance from *within* a running event loop.
msg301359 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-09-05 18:58
According to Jimmy, asyncio.get_event_loop() behaves differently if it's called while an event loop is running. So my first benchmark was wrong.

Attached bench_get_event_loop.py measures asyncio.get_event_loop() performance when an event loop is running. I get a different result: 

haypo@selma$ ./python -m perf compare_to ref.json patch.json 
Mean +- std dev: [ref] 555 ns +- 11 ns -> [patch] 498 ns +- 11 ns: 1.11x faster (-10%)

Ok, now it's 10% faster :-)
msg301416 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2017-09-06 00:37
New changeset 80bbe6a7b67f33d0d0976bb8e3e5ba26b6b0e626 by Yury Selivanov (jimmylai) in branch 'master':
bpo-31350: Optimize get_event_loop and _get_running_loop (#3347)
https://github.com/python/cpython/commit/80bbe6a7b67f33d0d0976bb8e3e5ba26b6b0e626
msg301433 - (view) Author: Mariatta (Mariatta) * (Python committer) Date: 2017-09-06 03:05
New changeset ff125e1aa9ee4eb928de79320a0e7c1b0c0f58f4 by Mariatta (Miss Islington (bot)) in branch '3.6':
bpo-31350: Optimize get_event_loop and _get_running_loop (GH-3347) (GH-3373)
https://github.com/python/cpython/commit/ff125e1aa9ee4eb928de79320a0e7c1b0c0f58f4
msg301434 - (view) Author: Mariatta (Mariatta) * (Python committer) Date: 2017-09-06 03:06
This has been backported.
Thanks all :)
History
Date User Action Args
2017-09-06 03:06:58Mariattasetstatus: open -> closed
resolution: fixed
messages: + msg301434

stage: resolved
2017-09-06 03:05:37Mariattasetnosy: + Mariatta
messages: + msg301433
2017-09-06 00:37:09python-devsetpull_requests: + pull_request3384
2017-09-06 00:37:02yselivanovsetmessages: + msg301416
2017-09-05 18:58:35vstinnersetmessages: + msg301359
2017-09-05 18:52:53vstinnersetfiles: + bench_get_event_loop.py
2017-09-05 18:29:37yselivanovsetmessages: + msg301358
2017-09-05 18:28:26vstinnersetmessages: + msg301357
2017-09-05 18:27:25yselivanovsetmessages: + msg301356
2017-09-05 18:14:51vstinnersetfiles: + bench_asyncio.py

messages: + msg301348
2017-09-05 17:36:21jimmylaisetmessages: + msg301344
2017-09-05 17:29:50vstinnersettitle: Optimize get_event_loop and _get_running_loop -> asyncio: Optimize get_event_loop and _get_running_loop
2017-09-05 17:28:53vstinnersetnosy: + vstinner
messages: + msg301343
2017-09-05 17:20:15jimmylaisettype: performance
2017-09-05 17:19:35jimmylaicreate