Memory leak in ThreadPoolExecutor + run_in_executor #82611
I ran into a memory leak caused by using run_in_executor + ThreadPoolExecutor while running some stability tests with custom web services. In my case it leaked about 1 MB per 1k requests. I extracted the root cause and reduced it to a minimal script containing both parts plus a NOP function to "run". The script can now easily eat up to 1 GB of memory in less than 1 minute. It uses the external psutil library to report the allocated memory, but that part can be commented out and the leak remains. The script is attached, along with a Dockerfile/Makefile for reproducibility. I've also reproduced it in my own conda-based 3.7 environment as well as on the master branch of cpython. |
Well, that's a common issue when using asyncio: you forgot await.

    async def main(_loop):
        while True:
            with futures.ThreadPoolExecutor() as pool:
                _loop.run_in_executor(pool, nop)
            sys.stdout.write(f'\r{get_mem():0.3f}MB')

It should be: "await _loop.run_in_executor(pool, nop)" ;-) Sadly, the PYTHONASYNCIODEBUG=1 env var doesn't complain about this bug. See: https://docs.python.org/dev/library/asyncio-dev.html#debug-mode |
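For reference, a minimal sketch of the corrected loop might look like the following. It assumes a bounded iteration count instead of `while True`, and it drops the psutil-based `get_mem()` reporting from the original script; `nop` here is only a stand-in for the reporter's no-op function.

```python
import asyncio
from concurrent import futures


def nop():
    """Stand-in for the no-op function from the report."""
    return None


async def main(iterations):
    loop = asyncio.get_running_loop()
    completed = 0
    for _ in range(iterations):
        with futures.ThreadPoolExecutor(max_workers=1) as pool:
            # The await suspends main() until nop() has finished in the pool,
            # so the future is resolved before the next pool is created.
            await loop.run_in_executor(pool, nop)
        completed += 1
    return completed


completed = asyncio.run(main(20))
print(f"completed {completed} iterations")
```

The only functional difference from the quoted snippet is the `await`; the bounded loop is there so the example terminates.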
You MUST await a future returned from Yuri, what should we do with the issue? I see the second similar report in the last half of the year.
|
Victor answered the first :) |
A Task emits a warning when it's not awaited. Can a Task be used instead of a Future in run_in_executor()? |
I don't think a task is required here. The problem is that run_in_executor is a plain function that returns an asyncio future, which in turn is a wrapper around a concurrent future object. If we convert run_in_executor() into an async function, we'll get a warning about an unawaited coroutine even without asyncio debug mode. |
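To illustrate the idea, here is a rough sketch rather than a proposed patch. `run_in_executor_async` is a hypothetical helper written for this example, not an asyncio API. Because it is a coroutine function, dropping its result without awaiting makes the interpreter emit the usual "was never awaited" RuntimeWarning.

```python
import asyncio
import gc
import warnings
from concurrent.futures import ThreadPoolExecutor


async def run_in_executor_async(executor, func, *args):
    """Hypothetical coroutine wrapper around loop.run_in_executor()."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(executor, func, *args)


async def demo():
    with ThreadPoolExecutor(max_workers=1) as pool:
        with warnings.catch_warnings(record=True) as caught:
            warnings.simplefilter("always")
            # Forgotten await: the coroutine object is created and discarded.
            run_in_executor_async(pool, int, "7")
            gc.collect()  # make sure the discarded coroutine is finalized
        # Correct usage: awaiting returns the function's result.
        value = await run_in_executor_async(pool, int, "7")
    return value, [str(w.message) for w in caught]


value, messages = asyncio.run(demo())
print(value, messages)
```

One behavioral difference worth keeping in mind: with a coroutine, a forgotten await means the function is never submitted to the executor at all, whereas the current run_in_executor() submits it immediately.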
That sounds like a good solution :-) |
The change is slightly not backward compatible but
|
Oh, I see that in the initial code with the leak (it was a heavy ThreadPoolExecutor + xgboost thing) there was an await, but I must have lost it somewhere while reducing it to the minimal example and ended up going in the wrong direction. Glad to see that it started a discussion on how to keep others from falling into this silent trap. |
Yeah, that's my main problem with converting If we convert it to a coroutine a lot of code will break, which might be OK if it's really necessary. Is it though? Can we return a special Future subclass that complains if it's not awaited? Would that fix the problem? |
I thought about returning a special subclass. What about returning a proxy object with the future instance embedded? The object raises DeprecationWarning for everything except __repr__, __del__ and __await__; __getattr__ redirects to getattr(self._fut, name) for all other attribute access. It is a more complex solution but definitely 100% backward compatible; plus, with this solution we can prepare people for the eventual removal of the deprecated code. |
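A rough sketch of such a proxy, under stated assumptions: the class name `AwaitOnlyFutureProxy` is made up for this example, the warning is issued with warnings.warn() rather than raised as an exception, and `__del__` is omitted for brevity.

```python
import asyncio
import warnings
from concurrent.futures import ThreadPoolExecutor


class AwaitOnlyFutureProxy:
    """Hypothetical proxy: awaiting is silent, other attribute access warns."""

    def __init__(self, fut):
        self._fut = fut

    def __repr__(self):
        return f"<AwaitOnlyFutureProxy {self._fut!r}>"

    def __await__(self):
        # Delegate to the wrapped asyncio future.
        return self._fut.__await__()

    def __getattr__(self, name):
        # Only called for names that are not found on the proxy itself.
        if name == "_fut":
            raise AttributeError(name)
        warnings.warn(
            f"using {name!r} on this result is deprecated; await it instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return getattr(self._fut, name)


async def demo():
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=1) as pool:
        proxy = AwaitOnlyFutureProxy(loop.run_in_executor(pool, pow, 2, 5))
        with warnings.catch_warnings(record=True) as caught:
            warnings.simplefilter("always")
            awaited = await proxy   # silent
            done = proxy.done()     # warns, then forwards to the future
    return awaited, done, [w.category for w in caught]


awaited, done, categories = asyncio.run(demo())
print(awaited, done, categories)
```

Existing code that calls methods such as done() or add_done_callback() keeps working through `__getattr__`, which is what makes the approach backward compatible. Code that relies on isinstance checks against asyncio.Future would still notice the difference.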
Yeah. Do you think it's worth bothering with this old low-level API instead of making a new high-level one? I don't, but if you do then feel free to change it. |
The API exists, people use it and get the memory leak. |
I don't understand. What happens if we don't await the future that run_in_executor returns? Does it get GCed eventually? Why is memory leaking? |
Any solutions? I'm having this too in python 2.7.18... |
This issue is about asyncio which was added to Python 3.4. Moreover, Python 2.7 is no longer supported: you should upgrade to Python 3.11. |
Duplicate of #85865 |