
Author pitrou
Recipients alexandre.vassalotti, pitrou
Date 2008-05-08.03:12:45
Message-id <1210216367.0.0.425286483865.issue2523@psf.upfronthosting.co.za>
In-reply-to
Content
Hi Alexandre,

I first tried to use a (non-preallocated) bytearray object and, after
trying several optimization schemes, I found out that the best one
worked just as well with an immutable bytes object :) I also found out
that the bytes <-> bytearray conversion costs can be noticeable in some
benchmarks.

The internal buffer is rarely reallocated because the current offset
inside it is remembered instead; also, when reading bytes from the
underlying unbuffered stream, a list of bytes objects is accumulated and
then joined at the end.
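
A minimal sketch of the scheme described above, for illustration only. It is
not the actual patch attached to this issue: the class name
SketchBufferedReader, the attribute names and the buffer_size default are all
hypothetical, and io.BytesIO merely stands in for the unbuffered raw stream.

```python
import io


class SketchBufferedReader:
    """Hypothetical illustration of the offset + list-join scheme."""

    def __init__(self, raw, buffer_size=8192):
        self.raw = raw
        self.buffer_size = buffer_size
        self._buf = b""  # immutable bytes buffer
        self._pos = 0    # current offset inside _buf

    def read(self, n):
        avail = len(self._buf) - self._pos
        if n <= avail:
            # Fast path: only the offset moves, the buffer is left alone.
            start = self._pos
            self._pos += n
            return self._buf[start:self._pos]
        # Slow path: accumulate bytes objects in a list, join once at the end.
        chunks = [self._buf[self._pos:]]
        while avail < n:
            chunk = self.raw.read(max(self.buffer_size, n - avail))
            if not chunk:  # end of stream (or no data available)
                break
            chunks.append(chunk)
            avail += len(chunk)
        data = b"".join(chunks)
        self._buf = data
        self._pos = min(n, len(data))
        return data[:self._pos]


reader = SketchBufferedReader(io.BytesIO(b"abcdefghij"), buffer_size=4)
first = reader.read(3)   # fills the buffer from the raw stream
second = reader.read(1)  # served from the buffer by moving the offset
```

In this sketch the buffer is replaced only when a read cannot be served from
what is already buffered; every other read just advances the offset.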

I think a preallocated bytearray would not make a lot of sense since we
can't readinto() an arbitrary position, so we still have a memory copy
from the bytes object returned by raw.read() to the bytearray buffer,
and then when returning the result to the user as a bytes object we have
another memory copy. In other words, each byte read is copied two more
times.
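
The two extra copies can be made concrete with a small sketch. This only
illustrates the argument above under stated assumptions: the names buf and
filled and the buffer size are hypothetical, and io.BytesIO stands in for the
raw stream.

```python
import io

raw = io.BytesIO(b"hello world")  # stand-in for the unbuffered raw stream
buf = bytearray(16)               # hypothetical preallocated buffer
filled = 0                        # number of valid bytes in buf

# raw.read() allocates and returns a new bytes object.
chunk = raw.read(5)

# Extra copy 1: bytes object -> preallocated bytearray.
buf[filled:filled + len(chunk)] = chunk
filled += len(chunk)

# Extra copy 2: bytearray -> bytes object handed back to the caller.
# (The memoryview slice itself does not copy; bytes() does.)
result = bytes(memoryview(buf)[:filled])
```

With a plain bytes buffer, the object returned by raw.read() can be kept and
sliced directly, so the first of these copies does not occur.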

Of course, if this code were rewritten in C, different compromises would
be possible.

cheers

Antoine.