Message 47892 - Python tracker



Author	calvin
Date	2005-03-08.13:57:41

Content

GzipFile raises an unhandled struct.error when reading a
corrupted .gz file (attached as t.gz) whose CRC checksum
at the end is missing.
Tested with python2.3, python2.4 and CVS python on a
Debian Linux system.
$ python2.3 t.py
Traceback (most recent call last):
  File "t.py", line 4, in ?
    print gzip.GzipFile('', 'rb', 9, fileobj).read()
  File "/usr/lib/python2.3/gzip.py", line 217, in read
    self._read(readsize)
  File "/usr/lib/python2.3/gzip.py", line 289, in _read
    self._read_eof()
  File "/usr/lib/python2.3/gzip.py", line 305, in _read_eof
    crc32 = read32(self.fileobj)
  File "/usr/lib/python2.3/gzip.py", line 40, in read32
    return struct.unpack("<l", input.read(4))[0]
struct.error: unpack str size does not match format
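
The t.gz attachment is not reproduced in this message. A comparable
input can be sketched by compressing some data and cutting off the
8-byte trailer (CRC32 plus uncompressed size). This assumes the
corruption is nothing more than the missing trailer, which may not
match t.gz exactly. The helper names make_truncated_gzip and try_read
are hypothetical, and the sketch targets current Python 3, where the
gzip module reports this condition as EOFError rather than the
struct.error shown above.

```python
import gzip
import io


def make_truncated_gzip(payload: bytes) -> bytes:
    """Return a gzip stream for payload with its 8-byte trailer removed.

    Hypothetical helper: assumes the only corruption is the missing
    CRC32 + size trailer, which may differ from the original t.gz.
    """
    return gzip.compress(payload)[:-8]


def try_read(raw: bytes) -> str:
    """Read raw through GzipFile and report the outcome as a string."""
    try:
        gzip.GzipFile(fileobj=io.BytesIO(raw), mode="rb").read()
    except (EOFError, OSError) as exc:
        # Return the exception class name so callers can inspect it.
        return type(exc).__name__
    return "ok"


if __name__ == "__main__":
    print(try_read(make_truncated_gzip(b"<html>hello</html>" * 50)))
```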

The attached patch (against current CVS) tries to cope
with this situation by
a) detecting the missing data by examining the rewind
value and
b) assuming that EOF is reached and returning the
buffered uncompressed data (by raising EOFError)
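
The patch itself is not shown in this message. As a caller-side
alternative, and not the patch's method, a client can decompress with
zlib directly: zlib returns whatever data it decoded and simply never
reports end-of-stream when the trailer is absent. A minimal sketch for
current Python 3, where read_gzip_tolerant is a hypothetical name:

```python
import gzip
import zlib


def read_gzip_tolerant(raw: bytes) -> tuple[bytes, bool]:
    """Decompress a gzip stream, tolerating a missing trailer.

    Returns (data, complete). complete is False when zlib never reached
    the end of the stream, for example because the CRC32/size trailer
    is missing, so the data could not be verified.
    """
    # MAX_WBITS | 16 tells zlib to expect a gzip header and trailer.
    decomp = zlib.decompressobj(zlib.MAX_WBITS | 16)
    data = decomp.decompress(raw)
    return data, decomp.eof


if __name__ == "__main__":
    payload = b"<html>hello</html>" * 50
    truncated = gzip.compress(payload)[:-8]  # drop CRC32 + size
    data, complete = read_gzip_tolerant(truncated)
    print(len(data) == len(payload), complete)
```

Data returned with complete set to False has not been checked against
a CRC, so callers may want to treat it as unverified. A stream whose
compressed body is itself damaged still raises zlib.error.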

For background: I encountered this bug when downloading
HTML pages served with Content-Encoding: gzip. It seems
that some versions of the mod_gzip Apache module
produce corrupted gzip data.

History
Date	User	Action	Args
2007-08-23 15:42:04	admin	link	issue1159051 messages
2007-08-23 15:42:04	admin	create