Author gvanrossum
Recipients Brunsgaard, gvanrossum, vstinner, yselivanov
Date 2016-02-01.16:35:23
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <CAP7+vJ+iCHLVJBbHSEpqiy9ipvmHQ=GDY83WeLEdA5k2=Gmcmw@mail.gmail.com>
In-reply-to <1454342780.58.0.210365622359.issue26259@psf.upfronthosting.co.za>
Content
I think this is expected behavior when you do a lot of get() calls
that are cancelled by a timeout. Each call leaves a cancelled Future
in the deque self._getters; these are removed only when _wakeup_next()
is called (by put or put_nowait).

Why is your app doing this?

I suppose we could band-aid this by removing cancelled Futures when
next calling get(), but that won't help if you start a load of get()
calls and then cancel them all.

Note that I don't want to complicate this code -- in the past it has
been very difficult to prove it was correct under all circumstances,
and now I am sure it is correct (barring memory leaks, as you've
shown). I want to make sure any fix for this issue doesn't complicate
the code, making it harder to understand or prove its correctness.

On Mon, Feb 1, 2016 at 8:06 AM, Jonas Brunsgaard <report@bugs.python.org> wrote:
>
> New submission from Jonas Brunsgaard:
>
> When making repeated calls to queue.get, memory is building up and is not freed until queue.push is called.
>
> I wrote this little program to show my findings. The program will perform a lot of calls to queue.get and once every 60 seconds a queue.push is performed. Every 15 seconds the memory usage of dictionaries is printet to the console. You can find the output below the program
>
> ```
> import asyncio
> from pympler import muppy
> from pympler import summary
>
>
> q = asyncio.Queue()
> loop = asyncio.get_event_loop()
> closing = False
>
>
> async def get_with_timeout():
>     while not closing:
>         try:
>             task = asyncio.ensure_future(q.get())
>             await asyncio.wait_for(task, 0.2)
>         except asyncio.TimeoutError:
>             pass
>
> def mem_profiling():
>     if not closing:
>         types_ = muppy.filter(muppy.get_objects(), Type=dict)
>         summary.print_(summary.summarize(types_))
>         loop.call_later(15, mem_profiling)
>
> def put():
>     q.put_nowait(None)
>     loop.call_later(60, put)
>
> put()
> tasks = [asyncio.ensure_future(get_with_timeout()) for _ in range(10000)]
> mem_profiling()
>
> try:
>     loop.run_forever()
> except KeyboardInterrupt:
>     closing = True
>     loop.run_until_complete(
>         asyncio.ensure_future(asyncio.wait(tasks)))
> finally:
>     loop.close()
> ```
>
> Output:
>
>                                    types |   # objects |   total size
> ======================================== | =========== | ============
>                             <class 'dict |       11886 |      3.20 MB
>          <class 'collections.OrderedDict |          11 |     16.84 KB
>   <class 'email._encoded_words._QByteMap |           1 |    288     B
>                                    types |   # objects |   total size
> ======================================== | =========== | ============
>                             <class 'dict |      224835 |     62.44 MB
>          <class 'collections.OrderedDict |          11 |     16.84 KB
>   <class 'email._encoded_words._QByteMap |           1 |    288     B
>                                    types |   # objects |   total size
> ======================================== | =========== | ============
>                             <class 'dict |      401487 |    111.00 MB
>          <class 'collections.OrderedDict |          11 |     16.84 KB
>   <class 'email._encoded_words._QByteMap |           1 |    288     B
>                                    types |   # objects |   total size
> ======================================== | =========== | ============
>                             <class 'dict |      576621 |    158.86 MB
>          <class 'collections.OrderedDict |          11 |     16.84 KB
>   <class 'email._encoded_words._QByteMap |           1 |    288     B
>                                    types |   # objects |   total size
> ======================================== | =========== | ============
>                             <class 'dict |      219745 |     58.99 MB
>          <class 'collections.OrderedDict |          11 |     16.84 KB
>   <class 'email._encoded_words._QByteMap |           1 |    288     B
>                                    types |   # objects |   total size
> ======================================== | =========== | ============
>                             <class 'dict |      390061 |    105.72 MB
>          <class 'collections.OrderedDict |          11 |     16.84 KB
>   <class 'email._encoded_words._QByteMap |           1 |    288     B
>                                    types |   # objects |   total size
> ======================================== | =========== | ============
>                             <class 'dict |      560623 |    152.65 MB
>          <class 'collections.OrderedDict |          11 |     16.84 KB
>   <class 'email._encoded_words._QByteMap |           1 |    288     B
>                                    types |   # objects |   total size
> ======================================== | =========== | ============
>                             <class 'dict |      220372 |     59.11 MB
>          <class 'collections.OrderedDict |          11 |     16.84 KB
>   <class 'email._encoded_words._QByteMap |           1 |    288     B
>                                    types |   # objects |   total size
> ======================================== | =========== | ============
>
> What we see is that memory builds up ~3mb/s, and when the push method is called the memory usage returns to normal.
>
> Is this the expected behavior or is this a bug? If it is expected I think we should update the documentation, to let people know about this behavior.
>
> --
> Jonas Brunsgaard
>
> ----------
> components: asyncio
> files: poc.py
> messages: 259327
> nosy: Jonas Brunsgaard, gvanrossum, haypo, yselivanov
> priority: normal
> severity: normal
> status: open
> title: Memleak when repeated calls to asyncio.queue.Queue.get is performed, without push to queue.
> type: resource usage
> versions: Python 3.5
> Added file: http://bugs.python.org/file41772/poc.py
>
> _______________________________________
> Python tracker <report@bugs.python.org>
> <http://bugs.python.org/issue26259>
> _______________________________________
History
Date User Action Args
2016-02-01 16:35:23gvanrossumsetrecipients: + gvanrossum, vstinner, yselivanov, Brunsgaard
2016-02-01 16:35:23gvanrossumlinkissue26259 messages
2016-02-01 16:35:23gvanrossumcreate