This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: memory leak: no transport for pipes by create_subprocess_exec/shell
Type: Stage: resolved
Components: asyncio Versions: Python 3.4
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: gvanrossum, koobs, python-dev, vstinner, wabu, yselivanov
Priority: normal Keywords: patch

Created on 2014-10-21 12:33 by wabu, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
fix22685.patch gvanrossum, 2014-10-21 20:16
set_transport.patch vstinner, 2014-10-21 20:58 review
test_pause_reading.py vstinner, 2014-10-21 21:07
set_transport-2.patch vstinner, 2014-11-05 15:24 review
koobs-freebsd9-python3x-build-2405.log koobs, 2014-11-26 04:08
koobs-freebsd10-python34-build-598.log koobs, 2014-11-26 04:08
Messages (28)
msg229763 - (view) Author: wabu (wabu) Date: 2014-10-21 12:33
using `p = create_subprocess_exec(..., stdout=subprocess.PIPE, limit=...)`, 
p.stdout has not transport set, so the underlying protocol is unable to pause the reading of the transport, resulting in high memory usage when slowly consuming input from p.stdout, even if the limit parameter is passed. 

A workaround is to set the transport manually after creating the subprocess:
`p.stdout.set_transport(p._transport.get_pipe_transport(1))`, but this should happen inside the create_subprocess call.
msg229767 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-10-21 16:21
> p = create_subprocess_exec(..., stdout=subprocess.PIPE, limit=...)

I don't see "yield from". Do you wait until the coroutine completes?

See the documentation:
https://docs.python.org/dev/library/asyncio-subprocess.html#subprocess-using-streams

Please write a more complete example to demontrate your issue.
msg229778 - (view) Author: wabu (wabu) Date: 2014-10-21 19:55
Sorry for the confusion, yes i do the yield from. The stdout stream for the process is actually producing data as it should. The subprocess produces a high amount of data (pbzip2), but is only consumed slowly. 

Normally when the buffer limit is reached for a stream reader, it calls pause_reading on the transport inside the feed_data method (see https://code.google.com/p/tulip/source/browse/asyncio/streams.py#365),
but here this is not happening, as the returned reader has no transport set (p.stdout._transport == None). So it fills up all the memory.
msg229779 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2014-10-21 20:07
Hm... It does look like there's nothing that tells stdout (which is a
StreamReader) about its transport. Wabu, could you experiment with a change
to asyncio/subprocess.py where SubprocessStreamProtocol.connection_made()
calls self.stdout.set_transport(transport) right after creating self.stdout?

On Tue, Oct 21, 2014 at 12:55 PM, wabu <report@bugs.python.org> wrote:

>
> wabu added the comment:
>
> Sorry for the confusion, yes i do the yield from. The stdout stream for
> the process is actually producing data as it should. The subprocess
> produces a high amount of data (pbzip2), but is only consumed slowly.
>
> Normally when the buffer limit is reached for a stream reader, it calls
> pause_reading on the transport inside the feed_data method (see
> https://code.google.com/p/tulip/source/browse/asyncio/streams.py#365),
> but here this is not happening, as the returned reader has no transport
> set (p.stdout._transport == None). So it fills up all the memory.
>
> ----------
>
> _______________________________________
> Python tracker <report@bugs.python.org>
> <http://bugs.python.org/issue22685>
> _______________________________________
>
msg229781 - (view) Author: wabu (wabu) Date: 2014-10-21 20:12
Here's a more complete example

@coroutine
put_data(filename, queue, chunksize=16000):
    pbzip2 = yield from asyncio.create_subprocess_exec(
                 'pbzip2', '-cd', filename, 
                 stdout=asyncio.subprocess.PIPE, limit=self.chunksize*2)

    while not pbzip2.stdout.at_eof():
        data = yield from pbzip2.stdout.read(chunksize)
        yield from queue.put(data)

adding the workaround after createing the stream fixes the issue:
pbzip2.stdout.set_transport(pbzip2._transport.get_pipe_transport(1))
msg229782 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2014-10-21 20:16
Can you confirm that this patch fixes the problem (without you needing the workaround in your own code)?
msg229784 - (view) Author: wabu (wabu) Date: 2014-10-21 20:39
thanks a lot, the fix works!

On 21.10.2014 22:16, Guido van Rossum wrote:
> 
> Guido van Rossum added the comment:
> 
> Can you confirm that this patch fixes the problem (without you needing the workaround in your own code)?
> 
> ----------
> keywords: +patch
> Added file: http://bugs.python.org/file36989/fix22685.patch
> 
> _______________________________________
> Python tracker <report@bugs.python.org>
> <http://bugs.python.org/issue22685>
> _______________________________________
>
msg229786 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2014-10-21 20:41
Victor, do you think this needs a unittest? It seems kind of difficult to
test for whether memory fills up (the machine may get wedged if it does :-).

On Tue, Oct 21, 2014 at 1:39 PM, wabu <report@bugs.python.org> wrote:

>
> wabu added the comment:
>
> thanks a lot, the fix works!
>
> On 21.10.2014 22:16, Guido van Rossum wrote:
> >
> > Guido van Rossum added the comment:
> >
> > Can you confirm that this patch fixes the problem (without you needing
> the workaround in your own code)?
> >
> > ----------
> > keywords: +patch
> > Added file: http://bugs.python.org/file36989/fix22685.patch
> >
> > _______________________________________
> > Python tracker <report@bugs.python.org>
> > <http://bugs.python.org/issue22685>
> > _______________________________________
> >
>
> ----------
>
> _______________________________________
> Python tracker <report@bugs.python.org>
> <http://bugs.python.org/issue22685>
> _______________________________________
>
msg229787 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-10-21 20:58
Oh ok, now I understood the issue :-)

Attached patch should fix it. It implements Guido's suggestion. There is not unit test yet.
msg229789 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-10-21 21:07
test_pause_reading.py: manual test for pause reading. It should pause reading stdout pipe 4 times on UNIX.
msg229790 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-10-21 21:10
Ok, I just missed completly the Guido posted a patch before me... Maybe we wrote the patch at the same time, I didn't refresh my browser :-)

> Victor, do you think this needs a unittest? It seems kind of difficult to
test for whether memory fills up (the machine may get wedged if it does :-).

Since the pause/resume thing is very tricky, it would be much better to have a unit test. We can probably mock a lot of things, we just need to check that the stream reader calls transport.pause_reading() when feed_data() is called with "too much" data.
msg229791 - (view) Author: wabu (wabu) Date: 2014-10-21 21:14
On 21.10.2014 22:41, Guido van Rossum wrote:
> Guido van Rossum added the comment:
> 
> Victor, do you think this needs a unittest? It seems kind of difficult to
> test for whether memory fills up (the machine may get wedged if it does :-).

You could setup a the subprocess with
asyncio.async(asyncio.create_subprocess_exec(...)) and then let the
asyncio loop run for a limited time with
loop.run_until_complete(asyncio.sleep(.1)), watching carefully for
higher memory usage after each sleep.

But still it's difficult to create a reliable unit-test with this ...
msg229793 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2014-10-21 21:35
Victor, your fix is identical to mine except for the variable names. I like your version better. :-)

So now it's just about the unittest.
msg230685 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-11-05 15:24
Here is a patch with a simple unit test.

I chose to modify the pause_reading method of the transport instead of mocking "everything" to test the real code (have a better code coverage).
msg231662 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-11-25 14:37
> Here is a patch with a simple unit test.

Can someone review it? If not, I will commit it without review.
msg231666 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2014-11-25 15:37
Sorry, looks good.

On Tue, Nov 25, 2014 at 6:37 AM, STINNER Victor <report@bugs.python.org>
wrote:

>
> STINNER Victor added the comment:
>
> > Here is a patch with a simple unit test.
>
> Can someone review it? If not, I will commit it without review.
>
> ----------
>
> _______________________________________
> Python tracker <report@bugs.python.org>
> <http://bugs.python.org/issue22685>
> _______________________________________
>
msg231667 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2014-11-25 16:25
New changeset f75d40c02f0a by Victor Stinner in branch '3.4':
Closes #22685, asyncio: Set the transport of stdout and stderr StreamReader
https://hg.python.org/cpython/rev/f75d40c02f0a

New changeset 7da2288183d1 by Victor Stinner in branch 'default':
(Merge 3.4) Closes #22685, asyncio: Set the transport of stdout and stderr
https://hg.python.org/cpython/rev/7da2288183d1
msg231668 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-11-25 16:25
Fix also pushed to Python 3.4, 3.5 and to Tulip. Thanks for the report wabu.

Tulip commit:

changeset:   1350:c3a9d355eb34
user:        Victor Stinner <victor.stinner@gmail.com>
date:        Tue Nov 25 17:17:13 2014 +0100
files:       asyncio/subprocess.py tests/test_subprocess.py
description:
Python issue #22685: Set the transport of stdout and stderr StreamReader
objects in the SubprocessStreamProtocol. It allows to pause the transport to
not buffer too much stdout or stderr data.
msg231678 - (view) Author: wabu (wabu) Date: 2014-11-25 18:47
thanks for the fixes 'n' integration
msg231685 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-11-25 21:01
You can workaround the issue by setting manually the transport on the StreamReader objects. Tell me if you need help to write the workaround if you cannot wait for a release containing the fix.
msg231691 - (view) Author: Kubilay Kocak (koobs) (Python triager) Date: 2014-11-26 04:08
Buildbot failures observed on koobs-freebsd9 and koobs-freebsd10 for 3.x and 3.4, respectively.

Both logs are attached, inlined failure below:

======================================================================
FAIL: test_pause_reading (test.test_asyncio.test_subprocess.SubprocessSafeWatcherTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/home/buildbot/python/3.4.koobs-freebsd10/build/Lib/test/test_asyncio/test_subprocess.py", line 194, in test_pause_reading
    self.assertTrue(called)
AssertionError: False is not true

======================================================================
msg231699 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-11-26 08:36
> Buildbot failures observed on koobs-freebsd9 and koobs-freebsd10 for 3.x and 3.4, respectively.

I'm unable to reproduce the failure with "./python -m test -F test_asyncio" or "./python -m test -F -m test_pause_reading test_asyncio" on my FreeBSD 9.1 VM.
msg231821 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2014-11-28 17:03
New changeset 737355f61ba2 by Victor Stinner in branch '3.4':
Issue #22685: Debug test_pause_reading() on FreeBSD
https://hg.python.org/cpython/rev/737355f61ba2
msg231832 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2014-11-28 21:40
New changeset 0dd91298eb17 by Victor Stinner in branch 'default':
Issue #22685, asyncio: mock also resume_reading in test_pause_reading()
https://hg.python.org/cpython/rev/0dd91298eb17
msg231833 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2014-11-28 21:42
New changeset 276515d2ceed by Victor Stinner in branch 'default':
Issue #22685, asyncio: resume_reading() must also be called in test_pause_reading()
https://hg.python.org/cpython/rev/276515d2ceed
msg232017 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-12-02 16:49
> Buildbot failures observed on koobs-freebsd9 and koobs-freebsd10 for 3.x and 3.4, respectively.

It looks like test_asyncio pass on the last 5 builds of the following buildbots, and so I consider the issue as closed.

http://buildbot.python.org/all/builders/AMD64 FreeBSD 9.x 3.x
http://buildbot.python.org/all/builders/AMD64%20FreeBSD%2010.0%203.x
msg232156 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2014-12-04 22:09
New changeset 2c18dd99829c by Victor Stinner in branch '3.4':
Issue #22685: Fix test_pause_reading() of asyncio test_subprocess
https://hg.python.org/cpython/rev/2c18dd99829c
msg234098 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2015-01-15 22:08
New changeset 992ce0dcfb29 by Victor Stinner in branch '3.4':
Issue #22685: Fix test_pause_reading() of asyncio/test_subprocess
https://hg.python.org/cpython/rev/992ce0dcfb29
History
Date User Action Args
2022-04-11 14:58:09adminsetgithub: 66875
2015-01-15 22:08:33python-devsetmessages: + msg234098
2014-12-04 22:09:54python-devsetmessages: + msg232156
2014-12-04 21:11:09vstinnersetstatus: open -> closed
2014-12-02 16:49:06vstinnersetmessages: + msg232017
2014-11-28 21:42:18python-devsetmessages: + msg231833
2014-11-28 21:40:12python-devsetmessages: + msg231832
2014-11-28 17:03:11python-devsetmessages: + msg231821
2014-11-26 08:36:41vstinnersetmessages: + msg231699
2014-11-26 04:08:31koobssetfiles: + koobs-freebsd10-python34-build-598.log
2014-11-26 04:08:07koobssetstatus: closed -> open
files: + koobs-freebsd9-python3x-build-2405.log

nosy: + koobs
messages: + msg231691
2014-11-25 21:01:55vstinnersetmessages: + msg231685
2014-11-25 18:47:18wabusetmessages: + msg231678
2014-11-25 16:25:56vstinnersetmessages: + msg231668
2014-11-25 16:25:13python-devsetstatus: open -> closed

nosy: + python-dev
messages: + msg231667

resolution: fixed
stage: resolved
2014-11-25 15:37:56gvanrossumsetmessages: + msg231666
2014-11-25 14:37:46vstinnersetmessages: + msg231662
2014-11-05 15:24:22vstinnersetfiles: + set_transport-2.patch

messages: + msg230685
2014-10-21 21:35:12gvanrossumsetmessages: + msg229793
2014-10-21 21:14:43wabusetmessages: + msg229791
2014-10-21 21:10:47vstinnersetmessages: + msg229790
2014-10-21 21:07:46vstinnersetfiles: + test_pause_reading.py

messages: + msg229789
2014-10-21 20:58:09vstinnersetfiles: + set_transport.patch

messages: + msg229787
2014-10-21 20:41:59gvanrossumsetmessages: + msg229786
2014-10-21 20:39:24wabusetmessages: + msg229784
2014-10-21 20:16:56gvanrossumsetfiles: + fix22685.patch
keywords: + patch
messages: + msg229782
2014-10-21 20:12:10wabusetmessages: + msg229781
2014-10-21 20:07:39gvanrossumsetmessages: + msg229779
2014-10-21 19:55:56wabusetmessages: + msg229778
2014-10-21 16:21:20vstinnersetmessages: + msg229767
2014-10-21 12:33:41wabucreate