Title: [regression] reading from a urllib2 file descriptor happens byte-at-a-time
msg65219 - Author: Matthias Klose (doko) * (Python committer) Date: 2008-04-08 21:15
r61009 on the 2.5 branch

  - Bug #1389051, 1092502: fix excessively large memory allocations when
    calling .read() on a socket object wrapped with makefile(). 

causes a regression compared to 2.4.5 and 2.5.2:

When reading from a urllib2 file descriptor, Python reads the data a
byte at a time regardless of how much you ask for. Python versions up to
and including 2.5.2 read the data in 8K chunks.

This has enough of a performance impact that it increases download time
for a large file over a gigabit LAN from 10 seconds to 34 minutes. (!)

Trivial/obvious example code:

  f =
  while 1:
    chunk =

... and then strace it to see the recv()'s chugging along, one byte at a
time.
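
A rough way to see why the size of each low-level read matters, without
needing a network. This is only a sketch: it is written in Python 3
rather than the Python 2 urllib2 code discussed in this report, it does
not reproduce the socket code changed in r61009, and the names
CountingRaw and count_reads are hypothetical, invented for illustration.
It simulates an unbuffered source and counts how many low-level reads a
drain loop triggers when each request is 1 byte versus 8K.

```python
import io


class CountingRaw(io.RawIOBase):
    """Hypothetical unbuffered stream that counts low-level read calls."""

    def __init__(self, data):
        super().__init__()
        self._source = io.BytesIO(data)
        self.calls = 0

    def readable(self):
        return True

    def readinto(self, buf):
        # Each call here stands in for one recv() on a real socket.
        self.calls += 1
        chunk = self._source.read(len(buf))
        buf[:len(chunk)] = chunk
        return len(chunk)


def count_reads(data, chunk_size):
    """Drain the stream chunk_size bytes at a time; return (bytes, calls)."""
    raw = CountingRaw(data)
    total = 0
    while True:
        chunk = raw.read(chunk_size)
        if not chunk:
            break
        total += len(chunk)
    return total, raw.calls


if __name__ == "__main__":
    payload = b"x" * (64 * 1024)
    print(count_reads(payload, 1))     # one byte per low-level read
    print(count_reads(payload, 8192))  # 8K per low-level read
```

For a 64 KiB payload the byte-at-a-time loop makes 65537 low-level calls
and the 8K loop makes 9 (in both cases the last call is the one that
detects end of data). On a real socket each of those calls is a system
call, which is where the slowdown described above would come from.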
msg65488 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-04-14 22:11
See #2632 for more discussion of what is probably the same issue.
msg65503 - Author: Neal Norwitz (nnorwitz) * (Python committer) Date: 2008-04-15 06:11
Bumping the priority.  I'd like to see this fixed before the next
release.  What version(s) does this problem apply to: 2.5, 2.6, 3.0?
msg65504 - Author: Ralf Schmitt (schmir) Date: 2008-04-15 06:21
quoting "Applied to 2.6 trunk in
rev. 61008 and to 2.5-maint in rev. 61009."

I don't know about py3k...
msg65517 - Author: A.M. Kuchling (akuchling) * (Python committer) Date: 2008-04-15 13:15
It was applied to 2.5-maint after 2.5.2 was released, BTW, so the change
isn't in any stable released version, only the 2.6 alphas.
msg65538 - Author: Neal Norwitz (nnorwitz) * (Python committer) Date: 2008-04-16 01:23
So if the fix was applied to 2.5 branch and 2.6 (3.0 should have
picked up from 2.6 automatically), can we close this bug?
msg65539 - Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2008-04-16 02:18
I don't think the fix was acceptable.  Now Python spins, consuming all
CPU, trying to read trivial amounts of data one byte at a time...

See the discussion at the end of as
well as a recent python-dev thread:
msg65540 - Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2008-04-16 02:21
Or else I'm missing something here in the maze of three bugs talking
about the same issue.

Which revisions fixed the introduced performance issue?
msg65545 - Author: Ralf Schmitt (schmir) Date: 2008-04-16 06:02
amk and I are talking about the commit that introduced this bug (which
was meant as a fix for another bug).
Neal seems to think that this commit is the fix for this bug itself.
And Gregory, you are now confused :)

Hope it's clear now.
msg65990 - Author: Mark Hammond (mhammond) * (Python committer) Date: 2008-04-30 05:55
For those trying to follow along at home: best I can tell we have 3
other issues on this: #1092502 and #1389051 are dupes of an initial bug,
but the fix for those bugs caused regressions reported in this bug and
in #2632.  To try and reduce confusion I'm closing this as a dupe of
#2632 which has a patch for review.
