Author sbt
Recipients sbt
Date 2012-08-21.22:25:08
Message-id <1345587910.43.0.919180045188.issue15758@psf.upfronthosting.co.za>
Content
Piping significant amounts of data through a subprocess using Popen.communicate() is crazily slow on Windows.

The attached program just pushes data through mingw's cat.exe.
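
The attachment is not reproduced in this message, so here is a minimal sketch of that kind of benchmark, not the attached program itself. It assumes a child Python process that copies stdin to stdout as a stand-in for mingw's cat.exe; the names ECHO_CMD and time_communicate are hypothetical.

```python
import subprocess
import sys
import time

# Stand-in for mingw's cat.exe (an assumption): a child Python process
# that copies its stdin to its stdout unchanged.
ECHO_CMD = [
    sys.executable, "-c",
    "import shutil, sys; "
    "shutil.copyfileobj(sys.stdin.buffer, sys.stdout.buffer)",
]


def time_communicate(nbytes, cmd=ECHO_CMD):
    """Push nbytes through cmd with communicate(); return (seconds, output)."""
    data = b"x" * nbytes
    proc = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    start = time.perf_counter()
    out, _ = proc.communicate(data)
    elapsed = time.perf_counter() - start
    return elapsed, out


if __name__ == "__main__":
    for mb in (1, 2, 4, 8):
        elapsed, out = time_communicate(mb * 1024 * 1024)
        # The child must hand back exactly what it was given.
        assert len(out) == mb * 1024 * 1024
        print("amount = %d MB; time taken = %.2f secs; rate = %.2f MB/s"
              % (mb, elapsed, mb / elapsed))
```

Timings from this sketch will depend on the platform and on the child command, so they are not directly comparable with the figures below.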

Python 3.3:
amount = 1 MB; time taken = 0.07 secs; rate = 13.51 MB/s
amount = 2 MB; time taken = 0.31 secs; rate = 6.51 MB/s
amount = 4 MB; time taken = 1.30 secs; rate = 3.08 MB/s
amount = 8 MB; time taken = 5.43 secs; rate = 1.47 MB/s
amount = 16 MB; time taken = 21.64 secs; rate = 0.74 MB/s
amount = 32 MB; time taken = 87.36 secs; rate = 0.37 MB/s

Python 2.7:
amount = 1 MB; time taken = 0.02 secs; rate = 66.67 MB/s
amount = 2 MB; time taken = 0.03 secs; rate = 68.97 MB/s
amount = 4 MB; time taken = 0.05 secs; rate = 76.92 MB/s
amount = 8 MB; time taken = 0.10 secs; rate = 82.47 MB/s
amount = 16 MB; time taken = 0.27 secs; rate = 60.38 MB/s
amount = 32 MB; time taken = 0.88 secs; rate = 36.36 MB/s
amount = 64 MB; time taken = 3.20 secs; rate = 20.03 MB/s
amount = 128 MB; time taken = 12.36 secs; rate = 10.35 MB/s

For Python 3.3 this looks like O(n^2) complexity to me: each doubling of the amount roughly quadruples the time taken.  2.7 is better but still struggles for large amounts.
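
One way to check the O(n^2) reading is to estimate the exponent k in time ~ n**k from consecutive rows of the Python 3.3 table above. This is only a back-of-the-envelope sketch; growth_exponents is a hypothetical helper, and the timings are copied from that table.

```python
import math

# (amount in MB, seconds) pairs copied from the Python 3.3 table above.
timings = [(1, 0.07), (2, 0.31), (4, 1.30), (8, 5.43), (16, 21.64), (32, 87.36)]


def growth_exponents(rows):
    """Estimate k in time ~ n**k from each consecutive pair of rows."""
    return [
        math.log(t2 / t1) / math.log(n2 / n1)
        for (n1, t1), (n2, t2) in zip(rows, rows[1:])
    ]


for k in growth_exponents(timings):
    print("%.2f" % k)
```

Every estimate comes out close to 2, which is what quadratic growth would give.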

Changing Popen._readerthread() to read in chunks rather than using FileIO.readall() produces a huge speedup:

Python 3.3 with patch:
amount = 1 MB; time taken = 0.01 secs; rate = 76.92 MB/s
amount = 2 MB; time taken = 0.03 secs; rate = 76.92 MB/s
amount = 4 MB; time taken = 0.04 secs; rate = 111.10 MB/s
amount = 8 MB; time taken = 0.05 secs; rate = 148.14 MB/s
amount = 16 MB; time taken = 0.10 secs; rate = 156.85 MB/s
amount = 32 MB; time taken = 0.16 secs; rate = 198.75 MB/s
amount = 64 MB; time taken = 0.31 secs; rate = 205.78 MB/s
amount = 128 MB; time taken = 0.61 secs; rate = 209.82 MB/s
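
The patch itself is not shown in this message. A minimal sketch of the idea, assuming a reader thread that appends its result to a list and then closes the pipe, might look like the following. The chunk size and the names read_in_chunks and reader_thread are hypothetical, not taken from the patch.

```python
import io
import threading

CHUNK_SIZE = 32768  # hypothetical value, not taken from the patch


def read_in_chunks(fh, chunk_size=CHUNK_SIZE):
    """Read fh to EOF in fixed-size chunks and join them once at the end."""
    chunks = []
    while True:
        chunk = fh.read(chunk_size)
        if not chunk:  # an empty result signals EOF
            break
        chunks.append(chunk)
    # A single join avoids repeatedly copying the data read so far.
    return b"".join(chunks)


def reader_thread(fh, buffer):
    """Thread target: store everything read from fh in buffer, then close fh."""
    buffer.append(read_in_chunks(fh))
    fh.close()


if __name__ == "__main__":
    # io.BytesIO stands in for the subprocess pipe here.
    result = []
    t = threading.Thread(target=reader_thread,
                         args=(io.BytesIO(b"spam" * 100000), result))
    t.start()
    t.join()
    print(len(result[0]))
```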

Maybe FileIO.readall() should do something similar for files whose size cannot be determined by stat().
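
As a rough illustration of that suggestion (and not how FileIO.readall() is actually implemented), a read-to-EOF helper could trust the size from fstat() only for regular files and fall back to fixed-size chunks otherwise. The name read_all_fd and the default chunk size are assumptions.

```python
import os
import stat


def read_all_fd(fd, chunk_size=65536):
    """Read fd to EOF, sizing the first read from fstat() only when it is usable."""
    st = os.fstat(fd)
    if stat.S_ISREG(st.st_mode) and st.st_size > 0:
        # Regular file: the size is known, so request it in one go.
        want = st.st_size
    else:
        # Pipe or other stream: size unknown, so read fixed-size chunks.
        want = chunk_size
    chunks = []
    while True:
        chunk = os.read(fd, want)
        if not chunk:  # b"" signals EOF
            break
        chunks.append(chunk)
        want = chunk_size  # any remainder is picked up chunk by chunk
    return b"".join(chunks)
```

Collecting chunks in a list and joining once keeps the total copying proportional to the amount of data read.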

History
Date                 User  Action  Args
2012-08-21 22:25:10  sbt   set     recipients: + sbt
2012-08-21 22:25:10  sbt   set     messageid: <1345587910.43.0.919180045188.issue15758@psf.upfronthosting.co.za>
2012-08-21 22:25:09  sbt   link    issue15758 messages
2012-08-21 22:25:09  sbt   create