Author teamnoir
Recipients lars.gustaebel, nadeem.vawda, r.david.murray, serhiy.storchaka, teamnoir
Date 2013-08-16.21:37:04
Message-id <1376689024.8.0.57393226484.issue18744@psf.upfronthosting.co.za>
In-reply-to
Content
I see your point.

The alternative would be to limit the size of archive that can be read to the size of virtual memory, which is essentially what I'm doing manually.  Either way, someone will be surprised.  I'm not sure which way will result in the least surprise, since I suspect that far more people will be extracting from compressed archives than will be extracting very large archives.  The failure mode with limited file size seems much less frequent but also much more annoying.  In comparison, the failure when reading compressed archives (and the pathological case is effectively a failure) seems much more common to me, although granted, it is not a total failure.

I think this should be mentioned in the doc because I, at least, was extremely surprised by this behavior and it cost me some time to track it down.  I might suggest something along the lines of:

Be careful when working with compressed archives.  In order to support the largest file sizes possible, some approaches may result in pathological behavior causing the original archive to be decompressed, in full, many times.  You should be able to avoid this behavior if you traverse the TarInfo items in file order.  You might also consider decompressing the archive first, in memory, and then handing the memory copy to tarfile for processing.
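
To make the two suggestions in that wording concrete, here is a minimal sketch, not a definitive recipe. The helper names (build_sample_archive, read_in_file_order, read_after_decompressing) and the sample file names are made up for illustration, and the second helper assumes the whole decompressed archive fits in memory.

```python
import gzip
import io
import tarfile


def build_sample_archive():
    """Build a small gzip-compressed tar in memory (illustrative data only)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, payload in [("a.txt", b"alpha"), ("b.txt", b"beta")]:
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def read_in_file_order(compressed):
    """Visit members in archive order so the compressed stream is read forward."""
    contents = {}
    with tarfile.open(fileobj=io.BytesIO(compressed), mode="r:gz") as tar:
        for member in tar:  # iterating a TarFile yields TarInfo objects in file order
            if member.isfile():
                contents[member.name] = tar.extractfile(member).read()
    return contents


def read_after_decompressing(compressed):
    """Decompress once into memory, then hand tarfile an uncompressed copy."""
    plain = io.BytesIO(gzip.decompress(compressed))
    contents = {}
    with tarfile.open(fileobj=plain, mode="r:") as tar:
        # Out-of-order access is only a seek here, since nothing is compressed.
        for name in sorted(tar.getnames(), reverse=True):
            contents[name] = tar.extractfile(name).read()
    return contents
```

The first helper never asks for a member that lies behind the current read position. The second trades memory for cheap random access, which is essentially the manual workaround described above.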
History
Date User Action Args
2013-08-16 21:37:04 teamnoir set recipients: + teamnoir, lars.gustaebel, nadeem.vawda, r.david.murray, serhiy.storchaka
2013-08-16 21:37:04 teamnoir set messageid: <1376689024.8.0.57393226484.issue18744@psf.upfronthosting.co.za>
2013-08-16 21:37:04 teamnoir link issue18744 messages
2013-08-16 21:37:04 teamnoir create