msg75173 - (view) |
Author: Skip Montanaro (skip.montanaro) *  |
Date: 2008-10-24 17:54 |
I noticed a colleague at work today checked in a change to his code to
switch back from subprocess.Popen to os.popen. I asked him about it and he
reported that subprocess.Popen was about 10x slower than os.popen. I asked
him for a simple test case, which is attached. Here are my results with
Python 2.4 through 2.7 (aka CVS HEAD):
tmp% python2.4 popentest.py
time with os.popen : 0.09
time with subprocess.Popen : 2.27
tmp% python2.5 popentest.py
time with os.popen : 0.03
time with subprocess.Popen : 1.52
tmp% python2.6 popentest.py
time with os.popen : 0.038824
time with subprocess.Popen : 1.517056
tmp% python2.7 popentest.py
time with os.popen : 0.033746
time with subprocess.Popen : 1.512331
These times are on my Mac laptop, all writing to the local disk. It seems
there was a bit of improvement in the 2.5 release but that it is still
miserably slow when compared with os.popen.
His original test used time.clock() as the timer. I changed to time.time()
but got essentially the same result.
Skip
|
msg75177 - (view) |
Author: Winfried Plappert (wplappert) |
Date: 2008-10-24 18:55 |
Hi Skip,
I find different measurements om Windows/XP: I copied the script and ran
it under Python 2.5.2 and Python 2.6 (as downloaded from
http://python.org/ftp/python/2.6/python-2.6.msi):
Plappert@action-time /cygdrive/e/tmp $ python ./timing.py # 2.5.2
time with os.popen : 0.296999931335
time with subprocess.Popen : 0.203000068665
in other words: identical or subprocess.Popen even faster.
and it is getting even better with Python2.6:
Plappert@action-time /cygdrive/e/tmp $ python26 timing.py # 2.6
time with os.popen : 0.266000032425
time with subprocess.Popen : 0.18700003624
So could that be a Mac OS issue?
|
msg75178 - (view) |
Author: Skip Montanaro (skip.montanaro) *  |
Date: 2008-10-24 19:16 |
Good question. I don't think it's MacOSX-specific. The original problem
raised its ugly head on Solaris 10. I don't have quite as many versions
of Python available there, but the relative performance is still bad,
roughly 0.05 for os.popen and 0.82 for subprocess.Popen. Not as bad as on
the Mac, but certainly not as good as what you saw.
|
msg75179 - (view) |
Author: Sameer (sameerd) |
Date: 2008-10-24 19:25 |
The subprocess module does different things depending on whether the
systems are Windows or Posix. This could explain the bad performance on
Solaris and the Mac and the good performance on Windows.
If this is correct, then we should see poor performance on Linux also.
Also, I don't have a box to test that out.
|
msg75180 - (view) |
Author: Skip Montanaro (skip.montanaro) *  |
Date: 2008-10-24 19:35 |
Good suggestion Sameer. I tried it out with Python 2.5 on a Linux host
here and saw essentially identical results for the two alternatives (~
0.08s).
S
|
msg75181 - (view) |
Author: David W. Lambert (LambertDW) |
Date: 2008-10-24 19:58 |
cygwin Python 2.5.1 (similar)
time with os.popen : 0.439999818802
time with subprocess.Popen : 0.361000061035
linux python 2.4.2 (similar)
time with os.popen : 0.0344939231873
time with subprocess.Popen : 0.0354421138763
linux python3K (both awful)
time with os.popen : 0.715471029282
time with subprocess.Popen : 1.58627915382
|
msg75182 - (view) |
Author: Skip Montanaro (skip.montanaro) *  |
Date: 2008-10-24 20:44 |
I don't expect Python3 to be all that great io performance-wise yet.
Still, for me on the Mac os.popen beats subprocess.Popen pretty handily:
% python3.0 popentest.py
time with os.popen : 0.874988
time with subprocess.Popen : 2.112976
|
msg75184 - (view) |
Author: Winfried Plappert (wplappert) |
Date: 2008-10-24 21:52 |
Here are my figures from a different processor on Linux (Ubuntu):
wplapper@lin-wpl:~/python$ python2.5 popentest.py
time with os.popen : 0.0478880405426
time with subprocess.Popen : 0.0596849918365
wplapper@lin-wpl:~/python$ python2.6 popentest.py
time with os.popen : 0.0481159687042
time with subprocess.Popen : 0.0592260360718
uname -a: ... 2.6.24-19-generic #1 SMP Wed Aug 20 22:56:21 UTC 2008 i686
GNU/Linux
|
msg75190 - (view) |
Author: Winfried Plappert (wplappert) |
Date: 2008-10-24 22:17 |
Yes, I can confirm that the performance is lousy on Solaris.
Solaris-9/Python 2.5.1:
time with os.popen : 0.124045133591
time with subprocess.Popen : 1.60335588455
Solaris-9/Python 2.6:
time with os.popen : 0.115752220154
time with subprocess.Popen : 1.61768198013
Solaris-10/Python 2.4.4:
time with os.popen : 0.124130010605
time with subprocess.Popen : 1.45624995232
Solaris-10/Python 2.6:
time with os.popen : 0.113790988922
time with subprocess.Popen : 1.42739701271
All machines are idendital in processor speed: v240.
|
msg75204 - (view) |
Author: Winfried Plappert (wplappert) |
Date: 2008-10-25 08:38 |
Hi is the dramatic difference on Solaris-10 / Python2.6:
I dtraced the popentest.py and counted syscalls:
with os_popen: read = 243
with process:Popen read = 589018
That explains a lot!
The rest of the system calls are similir in count and appearance.
I have not looked into the C-source (yet).
|
msg75205 - (view) |
Author: Winfried Plappert (wplappert) |
Date: 2008-10-25 08:39 |
s/Hi is/Hi, here is/ :)
|
msg75208 - (view) |
Author: Winfried Plappert (wplappert) |
Date: 2008-10-25 13:23 |
The created testfile size is 588890 bytes, which implies that
subprocess.Popen reads the file in completely unbuffered mode, one byte
at a time. If I modify the popentest.py programme by specifying a
bufsize of 1_000_000, the execution time drops quite a bit:
time with os.popen 0.069 sec
time with subprocess.Popen 0.118 sec
This Solaris 9/Python 2.6.
I think I have got it: Lib/subprocess.py should have a default bufsize
of -1, not 0! I tested it by modifying the subprocess.Popen call and
here is the result:
time with os.popen 0.086 sec
time with subprocess.Popen 0.080 sec
See also Modules/posixmodule.c.
|
msg75209 - (view) |
Author: Skip Montanaro (skip.montanaro) *  |
Date: 2008-10-25 13:50 |
Using a nonzero bufsize parameter makes all the difference in the world:
Using the default (bufsize=0 ==> unbuffered):
% python popentest.py
time with os.popen : 0.035032
time with subprocess.Popen : 1.496455
Creating the Popen object with bufsize=8192:
% python popentest.py
time with os.popen : 0.034642
time with subprocess.Popen : 0.041376
Creating the Popen object with bufsize=1 (==> line buffered):
% python popentest.py
time with os.popen : 0.034658
time with subprocess.Popen : 0.04211
Maybe the default for that parameter shouldn't be zero?
Skip
|
msg75211 - (view) |
Author: Skip Montanaro (skip.montanaro) *  |
Date: 2008-10-25 14:11 |
>> Using a nonzero bufsize parameter makes all the difference in the
>> world:
...
In fact, looking at posix_popen in posixmodule.c it appears the default
value for bufsize there is -1, implying that I/O is fully buffered.
Even if the default bufsize value for subprocess.Popen is not changed its
documentation should be corrected to reflect the different default values
for bufsize between os.popen and subprocess.Popen:
Replacing os.popen*
-------------------
pipe = os.popen(cmd, mode='r', [bufsize])
==>
pipe = Popen(cmd, shell=True, bufsize=bufsize, stdout=PIPE).stdout
If you use the default bufsize with os.popen you must set bufsize to -1
when creating a Popen object. Otherwise your I/O will be unbuffered and
you will be disappointed at the I/O performance.
|
msg75218 - (view) |
Author: Skip Montanaro (skip.montanaro) *  |
Date: 2008-10-25 19:03 |
I've been thinking about it, and I think even though it would be a slight
change to the API, I agree with Winfried that the default value for
bufsize should be -1, not 0. In my own use of os.popen and friends,
almost all the time I use them to process the output of a traditional Unix
pipeline, which means the output will be plain text, line buffered. In
those cases it makes absolutely no sense to read from the pipe one
character at a time.
S
|
msg75219 - (view) |
Author: Winfried Plappert (wplappert) |
Date: 2008-10-25 21:04 |
On the other hand, we will silently break all those applications which
are out there relying on the fact that a pipe is an unbuffered device.
You might consider it for Python 3.0, but I don't know if it is a good
idea for Python 2.x. The documentation of subprocess.Popen clear states
that the default is unbuffered, bufsize=0. I quote from
http://docs.python.org/library/subprocess.html:
bufsize, if given, has the same meaning as the corresponding argument to
the built-in open() function: 0 means unbuffered, 1 means line buffered,
any other positive value means use a buffer of (approximately) that
size. A negative bufsize means to use the system default, which usually
means fully buffered. The default value for bufsize is 0 (unbuffered).
|
msg76288 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2008-11-24 00:59 |
About Python3, os.popen() is more than two times faster (0.20 sec vs
0.50 sec) than subprocess.Popen()! It's amazing because popen() opens
the standard output as unicode file whereas Popen() creates a binary
file! Another funny thing: os.popen() calls subprocess.Popen() :-) The
difference is just this instruction:
stdout = io.TextIOWrapper(stdout)
|
msg76291 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2008-11-24 01:30 |
Summary of unchanged Python (2.4 .. 2.7):
* Mac: subprocess is 25 .. 50 times SLOWER than os.popen
* Solaris : subprocess is 13 times SLOWER than os.popen
* Windows XP : subprocess is 1.5 times FASTER than os.popen
* Linux : (results are very close)
With a different buffer size:
* Solaris : Popen(bufsize=-1) is FASTER than os.popen()
* Mac : Popen(bufsize=1) and Popen(bufsize=8192) are a little bit
slower than os.popen(), but much FASTER than Popen(bufsize=0)
Notes:
- PyFile_SetBufSize(bufsize) does nothing if bufsize < 0: keep system
default (buffer of BUFSIZE bytes)
- On Ubuntu Gutsy, system default (BUFSIZ) is 8192 bytes
|
msg76292 - (view) |
Author: Gregory P. Smith (gregory.p.smith) *  |
Date: 2008-11-24 01:47 |
If anything for 2.6 lets just highlight this in the docs and mention
that bufsize needs to be set to non-zero for good performance on many
platforms such as Mac OS X and Solaris.
We can consider changing the default for 2.7/3.1.
3.x having poor performance is pretty much another issue entirely of its
own..
|
msg76325 - (view) |
Author: Skip Montanaro (skip.montanaro) *  |
Date: 2008-11-24 15:11 |
Victor> About Python3, os.popen() is more than two times faster (0.20
Victor> sec vs 0.50 sec) than subprocess.Popen()! It's amazing because
Victor> popen() opens the standard output as unicode file whereas
Victor> Popen() creates a binary file! Another funny thing: os.popen()
Victor> calls subprocess.Popen() :-) The difference is just this
Victor> instruction:
Victor> stdout = io.TextIOWrapper(stdout)
This is a known issue. The default for bufsize in os.popen is -1 (fully
buffered? line buffered?). The default for bufsize in subprocess.Popen is 0
(unbuffered). I think it should have been changed but was voted down. I
think the best you can do at this point is document the change, perhaps in
the "Replacing os.popen" section.
Skip
|
msg76328 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2008-11-24 15:32 |
> Victor> About Python3, os.popen() is more than two times faster (...)
> Victor> The difference is just this instruction:
> Victor> stdout = io.TextIOWrapper(stdout)
>
> This is a known issue. The default for bufsize in os.popen is -1 (fully
> buffered? line buffered?). The default for bufsize in subprocess.Popen is
> 0 (unbuffered).
Wrong, it's not related to the buffer size.
With Python3 trunk on Linux, os.popen time is ~0.10 sec whereas
subprocess.Popen is ~0.25 sec. Change the buffer size of subprocess doesn't
help:
- (default) 0.25
- buffering = (-1): 0.25
- buffering = 1: 0.25
- buffering = 8192: 0.26
- buffering = 16384: 0.26
(it's a little big slower with a bigger buffer...)
You get the same speed (than os.popen) using TextIOWrapper() adapter:
[i for i in read_handle] => 0.25 sec
[i for i in io.TextIOWrapper(read_handle)] => 0.10 sec
WTF? Unicode is *FASTER* than raw bytes?
|
msg83723 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2009-03-18 00:21 |
I just wanna say that buffering can be a problem when writing, but not
when reading. If you read() from a buffered file, you still get the
available contents immediately, you don't have to wait for the buffer to
be full.
|
msg83725 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2009-03-18 01:18 |
The strange performance between bytes/text
(BufferedReader/TextIOWrapper) may be explained by the issue #5502.
|
msg98054 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2010-01-19 13:50 |
Can someone check if this still applies to Python 3.1/3.2?
By the way, I see no reason not to buffer at least the subprocess's stdout and stderr streams.
|
msg98062 - (view) |
Author: Skip Montanaro (skip.montanaro) *  |
Date: 2010-01-19 16:14 |
Looks good to me:
tmp% python3.1 popentest.py
time with os.popen : 0.035565
time with subprocess.Popen : 0.031796
tmp% python3.2 popentest.py
time with os.popen : 0.03501
time with subprocess.Popen : 0.031168
tmp% python3.1
Python 3.1.1+ (release31-maint:77485M, Jan 13 2010, 19:53:41)
[GCC 4.0.1 (Apple Inc. build 5490)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>>
tmp% python3.2
Python 3.2a0 (py3k:77484:77485, Jan 13 2010, 19:49:33)
[GCC 4.0.1 (Apple Inc. build 5490)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
|
msg98970 - (view) |
Author: Shashwat Anand (l0nwlf) |
Date: 2010-02-06 23:20 |
Tested it on mac OSX (Snow leopard)
Shashwat-Anands-MacBook-Pro:Desktop l0nwlf$ python2.5 popentest.py
time with os.popen : 0.0342061519623
time with subprocess.Popen : 0.0421631336212
Shashwat-Anands-MacBook-Pro:Desktop l0nwlf$ python2.6 --version
Python 2.6.1
Shashwat-Anands-MacBook-Pro:Desktop l0nwlf$ python2.6 popentest.py
time with os.popen : 0.0282561779022
time with subprocess.Popen : 0.0366630554199
Python 2.5.4s os.popen was faster than subprocess.Popen, the same being the case with Python 2.6.1 I do not think it is a mac issue.
|
msg115645 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2010-09-05 13:08 |
The subprocess doc now has a note about buffering and performance issues, closing.
|
|
Date |
User |
Action |
Args |
2022-04-11 14:56:40 | admin | set | github: 48444 |
2010-09-05 13:08:43 | pitrou | set | status: open -> closed resolution: out of date messages:
+ msg115645
|
2010-05-20 20:38:33 | skip.montanaro | set | nosy:
- skip.montanaro
|
2010-02-06 23:20:35 | l0nwlf | set | nosy:
+ l0nwlf messages:
+ msg98970
|
2010-01-19 16:14:53 | skip.montanaro | set | messages:
+ msg98062 |
2010-01-19 13:50:31 | pitrou | set | messages:
+ msg98054 versions:
- Python 2.6, Python 2.5, Python 2.4, Python 3.0 |
2010-01-19 09:59:20 | guettli | set | nosy:
+ guettli
|
2009-03-18 01:18:16 | vstinner | set | messages:
+ msg83725 |
2009-03-18 00:21:48 | pitrou | set | nosy:
+ pitrou messages:
+ msg83723
|
2008-11-24 15:32:24 | vstinner | set | messages:
+ msg76328 |
2008-11-24 15:11:56 | skip.montanaro | set | messages:
+ msg76325 |
2008-11-24 01:47:41 | gregory.p.smith | set | nosy:
+ gregory.p.smith messages:
+ msg76292 |
2008-11-24 01:30:33 | vstinner | set | messages:
+ msg76291 |
2008-11-24 00:59:16 | vstinner | set | nosy:
+ vstinner messages:
+ msg76288 |
2008-11-03 17:08:08 | vstinner | set | title: Miserable subprocess.Popen performance -> default subprocess.Popen buffer size |
2008-11-02 06:24:30 | bgh | set | nosy:
+ bgh |
2008-10-25 21:04:31 | wplappert | set | messages:
+ msg75219 |
2008-10-25 19:03:06 | skip.montanaro | set | messages:
+ msg75218 |
2008-10-25 18:26:10 | giampaolo.rodola | set | nosy:
+ giampaolo.rodola |
2008-10-25 14:11:33 | skip.montanaro | set | assignee: georg.brandl messages:
+ msg75211 nosy:
+ georg.brandl components:
+ Documentation, Library (Lib) versions:
+ Python 3.0, Python 2.7 |
2008-10-25 13:50:32 | skip.montanaro | set | messages:
+ msg75209 |
2008-10-25 13:23:38 | wplappert | set | messages:
+ msg75208 |
2008-10-25 08:39:41 | wplappert | set | messages:
+ msg75205 |
2008-10-25 08:38:40 | wplappert | set | messages:
+ msg75204 |
2008-10-24 22:37:31 | wplappert | set | type: performance versions:
+ Python 2.6, Python 2.5, Python 2.4 |
2008-10-24 22:17:13 | wplappert | set | messages:
+ msg75190 |
2008-10-24 21:52:22 | wplappert | set | messages:
+ msg75184 |
2008-10-24 20:44:33 | skip.montanaro | set | messages:
+ msg75182 |
2008-10-24 19:58:06 | LambertDW | set | nosy:
+ LambertDW messages:
+ msg75181 |
2008-10-24 19:35:03 | skip.montanaro | set | messages:
+ msg75180 |
2008-10-24 19:25:35 | sameerd | set | messages:
+ msg75179 |
2008-10-24 19:16:39 | skip.montanaro | set | messages:
+ msg75178 |
2008-10-24 18:55:25 | wplappert | set | nosy:
+ wplappert messages:
+ msg75177 |
2008-10-24 18:30:29 | sameerd | set | nosy:
+ sameerd |
2008-10-24 17:54:41 | skip.montanaro | create | |