Author nadeem.vawda
Recipients christian.heimes, eric.araujo, nadeem.vawda, pitrou, serhiy.storchaka
Date 2012-12-02.21:52:49
Message-id <1354485169.64.0.67284748913.issue15955@psf.upfronthosting.co.za>
In-reply-to
Content
I've tried reimplementing LZMAFile in terms of the decompress_into()
method, and it has ended up not being any faster than the existing
implementation. (It is _slightly_ faster for readinto() with a large
buffer size, but in all other cases it is either equally fast or
significantly slower.)

In addition, decompress_into() is more complicated to work with than I
had expected, so I withdraw my objection to the approach based on
max_length/unconsumed_tail.


> unconsumed_tail should be private hidden attribute, which automatically prepends any consumed data.

I don't think this is a good idea. In order to have predictable memory
usage, the caller will need to ensure that the current input is fully
decompressed before passing in the next block of compressed data. This
can be done more simply with the interface used by zlib. Compare:

    # Using a hidden unconsumed_tail (the proposed interface)
    while not d.eof:
        output = d.decompress(b'', 8192)
        if not output:
            compressed = f.read(8192)
            if not compressed:
                raise ValueError('End-of-stream marker not found')
            output = d.decompress(compressed, 8192)
        # <process output>

with:

    # Using zlib's interface
    while not d.eof:
        compressed = d.unconsumed_tail or f.read(8192)
        if not compressed:
            raise ValueError('End-of-stream marker not found')
        output = d.decompress(compressed, 8192)
        # <process output>
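
For reference, the zlib-style loop can be tried as-is against the existing
zlib module. This is only a sketch: the helper name decompress_stream, the
io.BytesIO stand-in for f, and the sample payload are assumptions made for
illustration, and the 8192-byte sizes are simply the ones used above.

```python
import io
import zlib


def decompress_stream(f, chunk_size=8192, max_length=8192):
    """Yield decompressed blocks of at most max_length bytes read from f."""
    d = zlib.decompressobj()
    while not d.eof:
        # Finish decompressing the input we already hold before reading more.
        compressed = d.unconsumed_tail or f.read(chunk_size)
        if not compressed:
            raise ValueError('End-of-stream marker not found')
        output = d.decompress(compressed, max_length)
        if output:
            yield output


# Illustrative data: a highly compressible payload in an in-memory file.
payload = b'spam and eggs ' * 100000
f = io.BytesIO(zlib.compress(payload))
blocks = list(decompress_stream(f))
```

Each block is bounded by max_length, so memory use stays predictable even
though a single 8192-byte read may expand to far more than 8192 bytes of
output.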


A related but orthogonal proposal: we might want to make unconsumed_tail
a memoryview (provided the input data is known to be immutable), to avoid
creating an unnecessary copy of the data.
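
To illustrate the copy that a memoryview would avoid, and why immutability
matters, here is a small sketch. It does not touch the decompressor at all;
the variable names and the "consumed" offset are made up for the example.

```python
# Stand-in for a block of compressed input and a hypothetical offset
# marking how much of it the decompressor has already consumed.
data = bytes(range(256)) * 4096
consumed = 1000

tail_copy = data[consumed:]              # bytes slice: allocates and copies
tail_view = memoryview(data)[consumed:]  # view: shares data's buffer

shares_buffer = tail_view.obj is data
same_contents = bytes(tail_view) == tail_copy

# With a mutable input, the view silently tracks later changes, which is
# why the input would need to be known to be immutable.
buf = bytearray(b'abcdef')
view = memoryview(buf)[2:]
buf[2] = ord('X')
view_after_mutation = bytes(view)
```

The view over the bytes object is read-only and costs no extra copy, while
the view over the bytearray reflects the later write to buf.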