Title: subprocess.Popen() is extremely slow
Type: Stage:
Components: Library (Lib) Versions: Python 3.5
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Alexander.Pyhalov, josh.r, movement, nickhendo, pitrou, vstinner
Priority: normal Keywords:

Created on 2019-08-07 20:36 by Alexander.Pyhalov, last changed 2019-09-09 07:26 by vstinner.

File name Uploaded Description Edit Alexander.Pyhalov, 2019-08-17 02:47
Messages (10)
msg349197 - (view) Author: Alexander Pyhalov (Alexander.Pyhalov) Date: 2019-08-07 20:36
We've moved illumos-gate wsdiff python tool from Python 2 to Python 3.
The tool does the following - for each file from old and new proto area compares file attributes to find differences in binary otput (spawning elfdump, dump and other utilities). 
Performance has degraded in two times when commands.getstatusoutput() was replaced by subprocess.getstatusoutput().
os.popen() now is Popen() wrapper, so it also has poor performance.
Even naive popen() implementation using os.system() and os.mkstemp() behaved much more efficiently (about two times faster e.g. 20 minuts vs 40 minutes for single tool pass).
msg349207 - (view) Author: Josh Rosenberg (josh.r) * (Python triager) Date: 2019-08-08 01:26
Have you tried switching to using Popen itself (or run, which keeps it to one layer of convenience wrapping)?

subprocess.getstatusoutput is three layers of wrapping (it calls check_output, which in turn calls run, which in turn calls Popen), and (unlike Python 2) has to decode all the output. run would avoid two layers of wrapping, can optionally receive the raw bytes instead of decoding to str, and avoids needing to wrap the whole thing in a shell (which system, older popen, and getstatusoutput all do).

Beyond that, it looks like when 3.8 releases, Popen should get *much* faster if the call meets certain conditions, see for details. If you can make your use cases conform to those requirements (e.g. by using shutil.which to determine the absolute paths to your utilities instead of relying on PATH lookup), the speed up should eliminate (or more than eliminate) the performance regression you're seeing (per #35537, on macOS, which got the least benefit, it was still a 1.7x improvement; on other OSes, the multiplier hits 61x or 106x).
msg349211 - (view) Author: Alexander Pyhalov (Alexander.Pyhalov) Date: 2019-08-08 04:18
I've tried to rewrite subporcess.getstatusoutput() usage with subprocess.Popen() and switch to shell=False, it didn't help, so I doubti it getstatusoutput() overhead, it's Popen() issue.
msg349857 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2019-08-16 12:33
Can you post a simple script showcasing the performance degradation?
msg349893 - (view) Author: Alexander Pyhalov (Alexander.Pyhalov) Date: 2019-08-17 02:47
I couldn't reproduce entire test, as wsdiff script is rather large, but here's the simplified version. If Popen is used more often, difference is much more significant.

# Using workaround
$ python3.5 ~/tmp/ 1 
# Without workaround
$ python3.5 ~/tmp/
# Using python2.7
$ python2.7 ~/tmp/
msg349894 - (view) Author: Alexander Pyhalov (Alexander.Pyhalov) Date: 2019-08-17 02:53
Even if I use 
                        import subprocess
                        process = subprocess.Popen(cmd,
difference in times are the same.
msg349897 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2019-08-17 09:18
I don't seen any significant difference here (Ubuntu 18.04):

$ time python2 

real	0m10,362s
user	0m6,565s
sys	0m4,372s

$ time python3 

real	0m11,781s
user	0m7,356s
sys	0m5,239s
msg350064 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-08-21 11:06
This issue lacks a lot of information:

* What is your operating system (name and version)? On Linux, what is your Linux kernel version?
* Which Python version did you try?
* Which command are you running?
* Do you use a shell?
* Do you use bytes (default) or Unicode (universal_newlines=True or text=True)?
* Can you provide a minimum reproducer? I don't know how to use msg349894: cmd is not defined.

Attached uses 2 commands, appararently both use a shell:

* "find /usr/bin -type f 2>/dev/null"
* "objdump '%s'"

"objdump '%s'" is unsafe and can lead to shell injection: try to avoid the usage of a shell. Use subprocess.Popen directly, or an helper which doesn't use shell=True.
msg351343 - (view) Author: Nick Henderson (nickhendo) Date: 2019-09-09 03:20
We have recently bumped into a similar problem. Using FreeBSD, subprocess calls were taking more than 10 times the usual time to execute after migrating to python3.6. After some digging, the default for 'close_fds' was changed to 'True'. On linux, this actually made things faster, but for unix, much slower. Passing 'close_fds=False' solved this for us.
msg351352 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-09-09 07:26
I create bpo-38061 "FreeBSD: Optimize subprocess.Popen(close_fds=True) using closefrom()".
Date User Action Args
2019-09-09 07:26:12vstinnersetmessages: + msg351352
2019-09-09 03:20:52nickhendosetnosy: + nickhendo
messages: + msg351343
2019-08-21 11:06:13vstinnersetnosy: + vstinner
messages: + msg350064
2019-08-17 09:18:39pitrousetmessages: + msg349897
2019-08-17 02:53:20Alexander.Pyhalovsetmessages: + msg349894
2019-08-17 02:47:34Alexander.Pyhalovsetfiles: +

messages: + msg349893
2019-08-16 12:33:05pitrousetnosy: + pitrou
messages: + msg349857
2019-08-15 18:48:06movementsetnosy: + movement
2019-08-08 04:18:34Alexander.Pyhalovsetmessages: + msg349211
2019-08-08 01:26:09josh.rsetnosy: + josh.r
messages: + msg349207
2019-08-07 20:36:00Alexander.Pyhalovcreate