classification
Title: subprocess.communicate() should use a memoryview
Type: performance Stage: resolved
Components: Versions: Python 3.4
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: gregory.p.smith, haypo, neologix, pitrou, python-dev, sbt, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2013-11-05 20:34 by haypo, last changed 2013-12-08 03:18 by gregory.p.smith. This issue is now closed.

Files
File name Uploaded Description Edit
subprocess_memoryview-2.patch haypo, 2013-11-20 13:58 review
test_sub.py neologix, 2013-11-30 09:59
Messages (15)
msg202240 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-11-05 20:34
The following code copies data, whereas the copy can be avoided using a memory view:

chunk = self._input[self._input_offset:self._input_offset + _PIPE_BUF]
self._input_offset += os.write(key.fd, chunk)

It should be replaced with:

input_view = memoryview(self._input)
...
chunk = input_view[self._input_offset:self._input_offset + _PIPE_BUF]
self._input_offset += os.write(key.fd, chunk)


This issue is a reminder for one of my comment of issue #18923.
msg203449 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-11-20 00:07
Attached subprocess_memoryview.patch uses a memoryview() to avoid memory copies.

The behaviour is undefined if two threads share the same Popen object and call the communicate() method at the same time. But I'm not sure that Popen is thread safe, it doesn't use any lock.
msg203450 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-11-20 00:18
Hi, why does your patch change listobject.c?
msg203451 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-11-20 00:19
> Hi, why does your patch change listobject.c?

Oops, the change is unrelated (#19578).
msg203462 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-11-20 07:28
What is a performance for small chunks?
msg203475 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-11-20 13:43
> What is a performance for small chunks?

I have no idea. Does it matter?
msg203478 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-11-20 13:54
I don't know. But a function call have a cost.
msg203479 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-11-20 13:58
New patch without the unwanted change in listobject.c.
msg203484 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2013-11-20 14:33
> Serhiy Storchaka added the comment:
>
> I don't know. But a function call have a cost.

I'm with Serhiy here.
Writing a performance optimization without benchmark is generally a
bad idea (I won't quote Knuth :-).
msg204792 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2013-11-30 09:59
Could someone with a dual-core machine try the attached simplistic
benchmark with and without Victor's patch?
I can see some user-time difference with 'time' on my single-core
machine, but I'm curious to see how this would affect things were both
the parent and the child subprocess can run concurrently.
msg204797 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-11-30 10:40
On a quad-core machine:

- without Victor's patch:

$ time ./python test_sub.py 
0.39263957000002847

real	0m0.521s
user	0m0.412s
sys	0m0.238s

- with Victor's patch:

$ time ./python test_sub.py 
0.3856174530001226

real	0m0.516s
user	0m0.404s
sys	0m0.247s
msg204800 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-11-30 11:09
Best of 10 runs.

Unpatched:  3.91057508099766
Patched:    3.86466505300632
msg204903 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2013-12-01 10:13
Impressive speedup :-)
msg205508 - (view) Author: Roundup Robot (python-dev) Date: 2013-12-08 03:15
New changeset 44948f5bdc12 by Gregory P. Smith in branch '3.3':
Fixes issue #19506: Use a memoryview to avoid a data copy when piping data
http://hg.python.org/cpython/rev/44948f5bdc12

New changeset 5379bba2fb21 by Gregory P. Smith in branch 'default':
Fixes issue #19506: Use a memoryview to avoid a data copy when piping data
http://hg.python.org/cpython/rev/5379bba2fb21
msg205509 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2013-12-08 03:17
fwiw, this patch made sense.  using test_sub.py on my dual-core amd64 box with an opt build is saw the times drop from ~0.90-0.97 to ~0.75-0.8.  more speedup than i was really anticipating for this use case.  it probably depends upon what the value of _PIPE_BUF is on your system.
History
Date User Action Args
2013-12-08 03:18:21gregory.p.smithsetstatus: open -> closed
resolution: fixed
stage: resolved
2013-12-08 03:17:52gregory.p.smithsetnosy: + gregory.p.smith
messages: + msg205509
2013-12-08 03:15:11python-devsetnosy: + python-dev
messages: + msg205508
2013-12-01 10:13:47neologixsetmessages: + msg204903
2013-11-30 11:09:34serhiy.storchakasetmessages: + msg204800
2013-11-30 10:40:53pitrousetmessages: + msg204797
2013-11-30 09:59:23neologixsetfiles: + test_sub.py

messages: + msg204792
2013-11-20 14:33:58neologixsetmessages: + msg203484
2013-11-20 13:58:36hayposetfiles: - subprocess_memoryview.patch
2013-11-20 13:58:29hayposetfiles: + subprocess_memoryview-2.patch

messages: + msg203479
2013-11-20 13:54:27serhiy.storchakasetmessages: + msg203478
2013-11-20 13:43:37hayposetmessages: + msg203475
2013-11-20 07:28:30serhiy.storchakasetnosy: + serhiy.storchaka
messages: + msg203462
2013-11-20 00:19:51hayposetmessages: + msg203451
2013-11-20 00:18:57pitrousetmessages: + msg203450
2013-11-20 00:07:13hayposetfiles: + subprocess_memoryview.patch

nosy: + pitrou, sbt
messages: + msg203449

keywords: + patch
2013-11-05 20:35:40hayposettype: performance
2013-11-05 20:34:46haypocreate