
Author doko
Recipients akuchling, doko
Date 2008-04-08 21:15:28
Message-id <1207689330.25.0.805272484455.issue2601@psf.upfronthosting.co.za>
Content
r61009 on the 2.5 branch

  - Bug #1389051, 1092502: fix excessively large memory allocations when
    calling .read() on a socket object wrapped with makefile(). 

causes a regression compared to 2.4.5 and 2.5.2:

When reading from a urllib2 file object, Python reads the data one byte
at a time, regardless of how much you ask for. Python versions up to
2.5.2 read the data in 8K chunks.

This has enough of a performance impact that it increases download time
for a large file over a gigabit LAN from 10 seconds to 34 minutes. (!)

Trivial/obvious example code:

  import urllib2

  f = urllib2.urlopen(
    "http://launchpadlibrarian.net/13214672/nexuiz-data_2.4.orig.tar.gz")
  while 1:
    chunk = f.read()
    if not chunk:  # read() returns "" at EOF; without this the loop spins forever
      break

... and then strace it to see the recv() calls chugging along, one byte
at a time.
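If strace isn't handy, a rough timing harness like the sketch below
(Python 2; the URL and the 8K request size are just the values from the
example above, and the throughput figure will vary with network
conditions) makes the slowdown visible directly:

  import time
  import urllib2

  URL = ("http://launchpadlibrarian.net/13214672/"
         "nexuiz-data_2.4.orig.tar.gz")

  start = time.time()
  f = urllib2.urlopen(URL)
  total = 0
  while 1:
    # Ask for 8K per call; with r61009 applied the underlying
    # socket still recv()'s one byte at a time, so this crawls.
    chunk = f.read(8192)
    if not chunk:
      break
    total += len(chunk)
  elapsed = time.time() - start
  print "%d bytes in %.1fs (%.1f KB/s)" % (
    total, elapsed, total / elapsed / 1024.0)

Run against an unpatched 2.5.2, the same loop completes in the roughly
10 seconds reported above.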