This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author nadeem.vawda
Recipients benjamin.peterson, nadeem.vawda, pitrou, stutzbach
Date 2011-10-12.16:01:59
SpamBayes Score 0.00011762472
Marked as misclassified No
Message-id <1318435320.7.0.151419273993.issue13159@psf.upfronthosting.co.za>
In-reply-to
Content
As mentioned in issue 6715, the buffer growth strategy used by _io.FileIO
has quadratic running time if each reallocation is O(n). The code in
question is new_buffersize() from Modules/_io/fileio.c:

    if (currentsize > SMALLCHUNK) {
        /* Keep doubling until we reach BIGCHUNK;
           then keep adding BIGCHUNK. */
        if (currentsize <= BIGCHUNK)
            return currentsize + currentsize;
        else
            return currentsize + BIGCHUNK;
    }
    return currentsize + SMALLCHUNK;

Is there a reason for this? One possible improvement would be to instead
use the same formula as list.resize() in Objects/listobject.c:

    new_allocated = (newsize >> 3) + (newsize < 9 ? 3 : 6);

which has amortized O(n) running time behaviour.

Your thoughts?
History
Date User Action Args
2011-10-12 16:02:00nadeem.vawdasetrecipients: + nadeem.vawda, pitrou, benjamin.peterson, stutzbach
2011-10-12 16:02:00nadeem.vawdasetmessageid: <1318435320.7.0.151419273993.issue13159@psf.upfronthosting.co.za>
2011-10-12 16:02:00nadeem.vawdalinkissue13159 messages
2011-10-12 16:01:59nadeem.vawdacreate