Author vnummela
Recipients nadeem.vawda, vnummela
Date 2014-06-25.18:28:55
Message-id <1403720935.87.0.990494824612.issue21872@psf.upfronthosting.co.za>
In-reply-to
Content
The Python lzma library sometimes fails to decompress a file, even though the file does not appear to be corrupt.

Originally discovered with OS X 10.9 / Python 2.7.7 / backports.lzma.
Now also reproduced on OS X / Python 3.4 / lzma; please see
https://github.com/peterjc/backports.lzma/issues/6 for more details.

Two example files are provided, a good one and a bad one. Both are compressed using the older lzma algorithm (not xz). An attempt to decompress the 'bad' file raises "EOFError: Compressed file ended before the end-of-stream marker was reached."

The 'bad' file appears to be ok, because
- a direct call to XZ Utils processes the files without complaints
- the decompressed files' contents appear to be ok.
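
For anyone who wants to check a file by hand, a minimal sketch (not the code from the original report) is below. It feeds the raw bytes of a legacy .lzma file to lzma.LZMADecompressor with format=lzma.FORMAT_ALONE and reports whether the end-of-stream marker was seen, instead of letting the file-level reader raise EOFError. The helper name try_decompress_alone is hypothetical, and the sample data is generated in place, so this only shows the check itself; it does not reproduce the failure without the attached 'bad' file.

```python
import lzma


def try_decompress_alone(data):
    """Decompress a legacy .lzma (FORMAT_ALONE) byte string.

    Returns (payload, reached_eof). reached_eof is False when the
    decompressor consumed all input without seeing the end of the stream.
    Hypothetical helper, for illustration only.
    """
    decomp = lzma.LZMADecompressor(format=lzma.FORMAT_ALONE)
    payload = decomp.decompress(data)
    return payload, decomp.eof


# Stand-in for a downloaded tick file: compress some bytes in the legacy format.
original = b"tick data " * 1000
sample = lzma.compress(original, format=lzma.FORMAT_ALONE)

payload, reached_eof = try_decompress_alone(sample)
print(len(payload), reached_eof)

# A deliberately truncated stream, to show what a missing end marker looks like.
partial, partial_eof = try_decompress_alone(sample[:-5])
print(len(partial), partial_eof)
```

To check a file on disk, read it in binary mode and pass the bytes to the helper; a file that XZ Utils accepts but that comes back with reached_eof set to False would match the behaviour described here.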

The example files contain tick data and have been downloaded from the Dukascopy bank's historical data feed service. The service is well known for its high data quality and is utilised by multiple analysis software platforms. Thus I think it is unlikely that a file integrity issue on their end would have gone unnoticed.

The error occurs relatively rarely: only around 1 to 5 times per 1000 downloaded files.
History
Date User Action Args
2014-06-25 18:28:55 vnummela set recipients: + vnummela, nadeem.vawda
2014-06-25 18:28:55 vnummela set messageid: <1403720935.87.0.990494824612.issue21872@psf.upfronthosting.co.za>
2014-06-25 18:28:55 vnummela link issue21872 messages
2014-06-25 18:28:55 vnummela create