Author nadeem.vawda
Recipients christian.heimes, eric.araujo, nadeem.vawda, pitrou, serhiy.storchaka
Date 2012-11-05.01:25:13
Message-id <1352078716.48.0.00642826117575.issue15955@psf.upfronthosting.co.za>
Content
I agree that being able to limit output size is useful and desirable, but
I'm not keen on copying the max_length/unconsumed_tail approach used by
zlib's decompressor class. It feels awkward to use, and it complicates
the implementation of the existing decompress() method, which is already
unwieldy enough.
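
For reference, the existing zlib approach looks roughly like the sketch
below. It uses only the documented zlib API (decompressobj(), the
max_length argument to decompress(), unconsumed_tail, and flush()); the
helper name decompress_limited and the chunk_size parameter are made up
for illustration.

```python
import zlib

def decompress_limited(data, chunk_size=1024):
    """Yield decompressed pieces; each decompress() call is capped at chunk_size.

    Hypothetical helper, not part of zlib.
    """
    d = zlib.decompressobj()
    buf = data
    while buf:
        # max_length limits how much output this call may return.
        chunk = d.decompress(buf, chunk_size)
        # Input that could not be processed yet must be fed back in by the caller.
        buf = d.unconsumed_tail
        if chunk:
            yield chunk
    # Anything still pending inside the decompressor comes out here.
    tail = d.flush()
    if tail:
        yield tail
```

The caller has to carry unconsumed_tail from one call to the next, which
is the bookkeeping that makes this interface awkward.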

As an alternative, I propose a thin wrapper around the underlying C API:

    def decompress_into(self, src, dst, src_start=0, dst_start=0): ...

This would store decompressed data in a caller-provided bytearray, and
return a pair of integers indicating the end points of the consumed and
produced data in the respective buffers.
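
To make the calling convention concrete, here is a rough pure-Python
emulation built on zlib.decompressobj(). It is only a sketch of the
semantics described above: the class name IntoDecompressor, the eof
property, and the decompress_all helper are assumptions for illustration,
and unlike the proposed C implementation it allocates an intermediate
bytes object on every call. It also ignores any data following the end of
the compressed stream.

```python
import zlib

class IntoDecompressor:
    """Hypothetical emulation of the proposed decompress_into() method."""

    def __init__(self):
        self._d = zlib.decompressobj()

    @property
    def eof(self):
        return self._d.eof

    def decompress_into(self, src, dst, src_start=0, dst_start=0):
        room = len(dst) - dst_start
        if room <= 0:
            return src_start, dst_start
        # Cap the output at the free space left in dst.
        out = self._d.decompress(memoryview(src)[src_start:], room)
        # Whatever zlib did not consume stays in src for the next call.
        src_end = len(src) - len(self._d.unconsumed_tail)
        dst_end = dst_start + len(out)
        dst[dst_start:dst_end] = out  # same length, so dst keeps its size
        return src_end, dst_end

def decompress_all(src, bufsize=64):
    """Decompress src through a fixed-size, reusable output buffer."""
    dec = IntoDecompressor()
    dst = bytearray(bufsize)
    out = bytearray()
    pos = 0
    while not dec.eof:
        new_pos, end = dec.decompress_into(src, dst, src_start=pos)
        if new_pos == pos and end == 0 and not dec.eof:
            raise EOFError("compressed data ended before end of stream")
        pos = new_pos
        out += dst[:end]
    return bytes(out)
```

Here the output size is bounded by the size of dst, and the caller tracks
progress with two integers instead of a separate unconsumed_tail buffer.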

The implementation should be extremely simple - it does not need to do
any memory allocation or reference management.

I think it could also be useful for optimizing the implementation of
BZ2File and LZMAFile. I plan to write a prototype and run some benchmarks
in the next few weeks.

(Aside: if implemented for zlib, I think this could also be a nicer
 solution to the problem raised in issue 5804.)