Author malin
Recipients Esa.Peuha, Jeffrey.Kintscher, akira, josh.r, kenorb, malin, maubp, nadeem.vawda, peremen, serhiy.storchaka, vnummela
Date 2019-06-18.09:33:12
Message-id <1560850392.81.0.108591927419.issue21872@roundup.psfhosted.org>
In-reply-to
Content
I investigated this problem.

Here are the trigger conditions:

- The format is FORMAT_ALONE, the legacy .lzma container format.
- The file's header records an "Uncompressed Size".
- The file has no "End of Payload Marker" or "End of Stream Marker".

If any of these conditions is not met, liblzma's internal state doesn't hold any bytes that can still be output.
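
To make the second condition concrete, here is a minimal sketch of reading the 13-byte .lzma header (1 properties byte, a 4-byte little-endian dictionary size, an 8-byte little-endian uncompressed size, where all bits set means "unknown"). The helper name `read_alone_header` is hypothetical and not part of the lzma module. It only inspects the header, so it cannot tell whether an end marker is present.

```python
import lzma
import struct

# In the legacy .lzma header, an uncompressed size with all bits set
# means "size unknown"; such a stream must end with an end marker.
UNKNOWN_SIZE = 0xFFFFFFFFFFFFFFFF


def read_alone_header(data):
    """Return (props, dict_size, uncompressed_size) from a .lzma header.

    uncompressed_size is None when the header marks the size as unknown.
    Hypothetical helper for illustration only.
    """
    if len(data) < 13:
        raise ValueError("a .lzma header is 13 bytes long")
    props, dict_size, size = struct.unpack("<BIQ", data[:13])
    if size == UNKNOWN_SIZE:
        size = None
    return props, dict_size, size


# A FORMAT_ALONE stream produced by the lzma module itself.
header = read_alone_header(lzma.compress(b"abc", format=lzma.FORMAT_ALONE))
print(header)
```

A file for which this returns a real size rather than None is a candidate for the problem, provided it also lacks the end marker.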

The good news is:

- The lzma module's default compression format is FORMAT_XZ, not FORMAT_ALONE.
- Even FORMAT_ALONE files generated by the lzma module (via the underlying xz library) always have an "End of Payload Marker".
- The FORMAT_ALONE format is probably falling out of use anyway.

The attached file test_bad_files.py tests the `DecompressReader.read(size=-1)` function [1] with different max_length values (from -1 to 1000, excluding 0) to ensure that the needs_input mechanism works properly.
Usage: set the `DIR` variable to the folder containing the bad files.
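
For readers without the attachment, the idea can be sketched roughly as follows. This is not test_bad_files.py itself, only an illustration: the function name `decompress_in_chunks` and the `chunk_size` parameter are made up here, and the loop only approximates what `DecompressReader.read` does by feeding new input when `needs_input` is true and an empty bytes object otherwise.

```python
import lzma


def decompress_in_chunks(data, max_length, fmt=lzma.FORMAT_AUTO, chunk_size=64):
    """Decompress data, producing at most max_length bytes per call.

    Hypothetical helper: new input is supplied only when the
    decompressor reports needs_input, otherwise b"" is passed so that
    it drains what it already holds internally.
    """
    decomp = lzma.LZMADecompressor(format=fmt)
    pieces = []
    pos = 0
    while not decomp.eof:
        if decomp.needs_input:
            chunk = data[pos:pos + chunk_size]
            pos += len(chunk)
            if not chunk:
                # Input exhausted, but the decompressor still asks for more.
                raise EOFError("compressed data ended before the end marker")
        else:
            chunk = b""
        pieces.append(decomp.decompress(chunk, max_length))
    return b"".join(pieces)


# Round-trip check on data compressed by the lzma module itself.
payload = bytes(range(256)) * 20
for fmt in (lzma.FORMAT_XZ, lzma.FORMAT_ALONE):
    compressed = lzma.compress(payload, format=fmt)
    for max_length in (-1, 1, 7, 1000):
        assert decompress_in_chunks(compressed, max_length, fmt) == payload
```

Files written by the lzma module carry an end marker, so this round trip does not exercise the problematic case. To do that, the same loop would have to be run over the bad files and its output compared with the known original contents.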

[1] https://github.com/python/cpython/blob/v3.8.0b1/Lib/_compression.py#L72-L111