msg276514 - (view) |
Author: Xavion (Xavion) |
Date: 2016-09-15 04:31 |
Each time I run a shell command via the 'subprocess' module, I notice that the memory footprint of my program increases by roughly 4 KiB.
I've tested the problem with two different slices of code; the result is the same in either case (long after the function finishes).
Code slice 1:
check_output("true")
Code slice 2:
pTest = Popen("true", stdout=PIPE, stderr=PIPE)
pTest.wait()
pTest.stdout.close()
pTest.stderr.close()
del pTest
gc.collect()
I'm using Python v3.5.2-1 on Arch Linux; it was installed via the [extra] repository. Let me know if you need any further information.
|
msg276515 - (view) |
Author: Antti Haapala (ztane) * |
Date: 2016-09-15 05:11 |
3.5.1+ ubuntu; I run the Popen case in while True, and watch `top` - not a single digit changes in the memory usage (the last digit being the kilobytes). That the memory footprint increases *once* by 4KiB is nothing; please run this in a loop.
|
msg276517 - (view) |
Author: Xavion (Xavion) |
Date: 2016-09-15 05:54 |
I wouldn't have reported this if it was only happening *once*. I already have it in a loop; a new shell command is fired every second.
The memory footprint increases by roughly 4 KiB *each* time. I monitor it via the following Bash script:
while true; do
ps -C "python3 ./Program.pyw" -o pid=,%mem=,rss= >> ./Output.log
sleep 1
done &
I have attached the logfile for your convenience. Let me know if you'd like me to run any other tests.
|
msg276523 - (view) |
Author: Xavion (Xavion) |
Date: 2016-09-15 07:33 |
It's easier to reproduce the issue if you use a timer (rather than a loop). The newly attached logfile was generated with the following code fragment.
def fTest() :
check_output("true")
threading.Timer(1, fTest, ()).start()
|
msg276539 - (view) |
Author: Antti Haapala (ztane) * |
Date: 2016-09-15 09:24 |
Ahhah, the title should say: subprocess module leaks 4kiB memory **per thread**.
|
msg276775 - (view) |
Author: Xavion (Xavion) |
Date: 2016-09-17 07:01 |
Okay, I've modified the title to match what I've written below.
I've just run some further tests on this problem. The attached archive contains code samples and the output generated (both with and without garbage collection).
As you can see, the memory stays constant in the first case. In the second, the presence of the loop is probably what throws a spanner in the works. Garbage collection seems to make the outcome slightly worse (in the second case).
The situation isn't as bad as I first reported, but the memory does nonetheless keep increasing in the second case (which it probably shouldn't).
|
msg276794 - (view) |
Author: R. David Murray (r.david.murray) * |
Date: 2016-09-17 12:56 |
Could you post files instead of a zip, please? It will be easier to review.
|
msg276850 - (view) |
Author: Xavion (Xavion) |
Date: 2016-09-17 21:14 |
I put them into an archive so that the folder hierarchy would be preserved. Doing it that way makes it faster for you guys to run the tests at your end.
Nonetheless, I will post the seven (7) files individually as well. It doesn't look like I can upload more than one at a time, so get ready for a few emails!
|
msg276950 - (view) |
Author: R. David Murray (r.david.murray) * |
Date: 2016-09-19 06:56 |
I can't reproduce this with either python3.4.3 or 3.5 or 3.6 tip running it on gentoo linux. For me it bumps up initially but then remains constant even if I let it run for many more probes than in your example.
I'm not sure what to suggest to you for further debugging this. It is surprising that Arch would have a different behavior than Gentoo in this context. If we are lucky maybe someone else will be able to reproduce it.
|
msg276986 - (view) |
Author: Xavion (Xavion) |
Date: 2016-09-19 21:57 |
Wow, that is surprising (given how simple it is)! Did you try both tests? Remember that only the second one produces the bug here.
Let's leave this sit for a while. If no-one else can reproduce it on their OSs/distributions, I'll seek advice from the Arch community.
|
msg276987 - (view) |
Author: R. David Murray (r.david.murray) * |
Date: 2016-09-19 22:10 |
I only ran the second one. I didn't bother with the first one :)
|
msg276989 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2016-09-19 22:54 |
I'm unable to reproduce any memory leak on subprocess itself:
---
import tracemalloc; tracemalloc.start()
import subprocess, gc
def func(loops) :
for x in range(loops):
proc = subprocess.Popen(['true'])
with proc:
proc.wait()
# warmup
func(10)
gc.collect();gc.collect();gc.collect()
print(tracemalloc.get_traced_memory()[1])
func(100)
gc.collect();gc.collect();gc.collect()
print(tracemalloc.get_traced_memory()[1])
gc.collect();gc.collect();gc.collect()
print(tracemalloc.get_traced_memory()[1])
func(100)
gc.collect();gc.collect();gc.collect()
print(tracemalloc.get_traced_memory()[1])
---
Output on Fedora 24 (Linux) and Python 3.5:
---
996450
996450
996450
996450
---
|
msg276990 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2016-09-19 22:58 |
No memory leak if subprocess is spawned in a thread neither:
---
import tracemalloc; tracemalloc.start()
import subprocess, threading, time, gc
def spawn(event) :
subprocess.check_output("true")
gc.collect(), gc.collect(), gc.collect()
event.set()
def func(loops):
event = threading.Event()
for x in range(loops):
event.clear()
timer = threading.Timer(0, spawn, (event,))
timer.start()
event.wait()
func(100)
gc.collect();gc.collect();gc.collect();gc.collect()
a = tracemalloc.get_traced_memory()[1]
print("first", a, "B")
loops = 1000
func(loops)
gc.collect();gc.collect();gc.collect();gc.collect()
b = tracemalloc.get_traced_memory()[1]
print("after", loops, "loops, mem:", b, "B")
d = (b-a) / loops
print("diff: %.1f B/loop" % d)
loops = 1000
func(loops)
gc.collect();gc.collect();gc.collect();gc.collect()
c = tracemalloc.get_traced_memory()[1]
print("after", loops, "loops, mem:", c, "B")
d = (c-b) / loops
print("diff2: %.1f B/loop" % d)
---
Output:
---
first 1013738 B
after 1000 loops, mem: 1014266 B
diff: 0.5 B/loop
after 1000 loops, mem: 1014318 B
diff2: 0.1 B/loop
---
Sorry, 0.5 byte/loop is not a memory leak :-)
|
msg276994 - (view) |
Author: Xavion (Xavion) |
Date: 2016-09-20 01:06 |
What about when you test it using the files I provided? I didn't want you guys to have to write your own code.
Note that I was monitoring the memory externally (via good old 'ps'). This could make a difference to the outcome.
|
msg277009 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2016-09-20 05:25 |
If tracemalloc doesn't show any leak but the RSS memory increases, it can
be memory fragmentation or memory alloctions not traced by tracemalloc.
|
msg277010 - (view) |
Author: Antti Haapala (ztane) * |
Date: 2016-09-20 05:48 |
The title of the issue is still wrong. As I noted before the problem is not with subprocess leaking 4K memory *always*. The issue comes from the fact that subprocess seems to leak 4K memory per individual thread. The test code to use is thus
def test():
check_output("true")
threading.Timer(1, test, ()).start()
test()
which will invoke subprocess always in a new thread. Using subprocess in a loop, or using the timer as above without subprocess will not increase memory usage.
I have changed the title accordingly
|
msg277012 - (view) |
Author: Xavion (Xavion) |
Date: 2016-09-20 06:12 |
haypo: So, what is the result when you run "Test-2.py" and monitor the memory usage with "Test.sh"?
ztane: The code you've provided is the same as "Test-1.py". You need to run "Test-2.py" in order to see the bug!
|
msg277015 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2016-09-20 07:43 |
Test-2.py has issues:
* it doesn't call Timer.join()
* it uses a weak synchronization between the main thread and the Timer thread: see msg276990 for an example using Event
If you use a better synchronization code, call timer.join() and call gc.collect(), the memory usage is very stable even after creating more than 100 000 threads + subprocesses.
Xavion: "As I noted before the problem is not with subprocess leaking 4K memory *always*. The issue comes from the fact that subprocess seems to leak 4K memory per individual thread."
I'm unable to reproduce a *leak*.
Test-3.py output:
----
497 thread+subprocess
VmRSS: 9584 kB
986 thread+subprocess
VmRSS: 9596 kB <== after the warmup, the usage seems stable
1490 thread+subprocess
VmRSS: 9596 kB
(...)
10361 thread+subprocess
VmRSS: 9596 kB
(...)
30282 thread+subprocess
VmRSS: 9596 kB
30695 thread+subprocess
VmRSS: 9672 kB
31160 thread+subprocess
VmRSS: 9684 kB <=== memory usage decreases :-)
(...)
60768 thread+subprocess
VmRSS: 9684 kB
^C
---
If you really want to say that something is wrong: I don't understand why we must call gc.collect() to keep the memory usage stable. But I guess that the GC is not always called for performance.
Without the GC it's not that bad:
---
1083 thread+subprocess
VmRSS: 9764 kB
2097 thread+subprocess
VmRSS: 9888 kB
3136 thread+subprocess
VmRSS: 9888 kB
(...)
11750 thread+subprocess
VmRSS: 9888 kB
12668 thread+subprocess
VmRSS: 9940 kB
13705 thread+subprocess
VmRSS: 9940 kB
(...)
70948 thread+subprocess
VmRSS: 9940 kB
^C
---
There is no such "4k leak per function call".
I close the issue. It's a bug in your code, not in Python.
|
msg277077 - (view) |
Author: Xavion (Xavion) |
Date: 2016-09-20 22:40 |
Firstly, you've misquoted me. The quote you attributed to me in your latest post was actually made by 'ztane'.
Secondly, your extra thread/event code makes no difference here. I will attach the memory usage logs in subsequent posts.
For consistency, I have removed all of the collateral stuff from your "Test-3.py" script and reattached it here as "Test-3a.py".
|
|
Date |
User |
Action |
Args |
2022-04-11 14:58:36 | admin | set | github: 72352 |
2016-09-20 22:41:54 | Xavion | set | files:
+ Test-3a-gc.log |
2016-09-20 22:41:14 | Xavion | set | files:
+ Test-3a-no-gc.log |
2016-09-20 22:40:14 | Xavion | set | status: closed -> open files:
+ Test-3a.py title: The 'subprocess' module leaks 4 kiB memory for each thread -> The 'subprocess' module leaks memory when called in certain ways messages:
+ msg277077
resolution: not a bug -> |
2016-09-20 07:43:53 | vstinner | set | status: open -> closed files:
+ Test-3.py resolution: not a bug messages:
+ msg277015
|
2016-09-20 06:12:23 | Xavion | set | messages:
+ msg277012 |
2016-09-20 05:48:15 | ztane | set | messages:
+ msg277010 title: The 'subprocess' module leaks memory when called in certain ways -> The 'subprocess' module leaks 4 kiB memory for each thread |
2016-09-20 05:25:39 | vstinner | set | messages:
+ msg277009 |
2016-09-20 01:06:56 | Xavion | set | messages:
+ msg276994 |
2016-09-19 22:58:35 | vstinner | set | messages:
+ msg276990 |
2016-09-19 22:54:14 | vstinner | set | messages:
+ msg276989 |
2016-09-19 22:20:37 | vstinner | set | nosy:
+ vstinner
|
2016-09-19 22:10:04 | r.david.murray | set | messages:
+ msg276987 |
2016-09-19 21:57:15 | Xavion | set | messages:
+ msg276986 |
2016-09-19 06:56:30 | r.david.murray | set | messages:
+ msg276950 |
2016-09-17 21:20:00 | Xavion | set | files:
+ Test-2-gc.log |
2016-09-17 21:19:30 | Xavion | set | files:
+ Test-2-no-gc.log |
2016-09-17 21:19:10 | Xavion | set | files:
+ Test-1-gc.log |
2016-09-17 21:18:49 | Xavion | set | files:
+ Test-1-no-gc.log |
2016-09-17 21:17:19 | Xavion | set | files:
+ Test-2.py |
2016-09-17 21:16:46 | Xavion | set | files:
+ Test-1.py |
2016-09-17 21:16:12 | Xavion | set | files:
+ Test.sh |
2016-09-17 21:14:53 | Xavion | set | files:
+ Memory-Leak-Test.zip
messages:
+ msg276850 |
2016-09-17 20:51:40 | Xavion | set | files:
- Memory-Leak-Test.zip |
2016-09-17 12:56:44 | r.david.murray | set | nosy:
+ r.david.murray messages:
+ msg276794
|
2016-09-17 07:01:52 | Xavion | set | files:
+ Memory-Leak-Test.zip
messages:
+ msg276775 title: The 'subprocess' module leaks roughly 4 KiB of memory per call -> The 'subprocess' module leaks memory when called in certain ways |
2016-09-17 06:22:27 | Xavion | set | files:
- Output.log |
2016-09-15 20:30:57 | ppperry | set | type: resource usage |
2016-09-15 09:24:16 | ztane | set | messages:
+ msg276539 |
2016-09-15 08:42:09 | The Compiler | set | nosy:
+ The Compiler
|
2016-09-15 07:33:41 | Xavion | set | files:
+ Output.log
messages:
+ msg276523 |
2016-09-15 07:29:58 | Xavion | set | files:
- Output.log |
2016-09-15 05:54:28 | Xavion | set | files:
+ Output.log
messages:
+ msg276517 |
2016-09-15 05:11:55 | ztane | set | nosy:
+ ztane messages:
+ msg276515
|
2016-09-15 04:31:15 | Xavion | create | |