
classification
Title: gzip.write changes trailer ISIZE field before type checking - corrupted gz file after trying to write string
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.4, Python 3.5, Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: serhiy.storchaka Nosy List: Claudiu.Popa, nadeem.vawda, python-dev, serhiy.storchaka, wolma
Priority: normal Keywords: easy, patch

Created on 2014-05-23 11:22 by wolma, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Files
File name: GzipFile_write.patch Uploaded: wolma, 2014-05-23 11:44
Messages (7)
msg218961 - (view) Author: Wolfgang Maier (wolma) * Date: 2014-05-23 11:22
I ran into this:

>>> gzout = gzip.open('test.gz','wb')
>>> gzout.write('abcdefgh') # write expects bytes not str
Traceback (most recent call last):
  File "<pyshell#2>", line 1, in <module>
    gzout.write('abcdefgh')
  File "/usr/lib/python3.4/gzip.py", line 343, in write
    self.crc = zlib.crc32(data, self.crc) & 0xffffffff
TypeError: 'str' does not support the buffer interface

>>> gzout.write(b'abcdefgh') # ok, use bytes instead
8
>>> gzout.close()

But now the file is not recognized as valid gzip format anymore (neither by the gzip module nor by external software):

>>> gzin = gzip.open('test.gz','rb')
>>> next(gzin)
Traceback (most recent call last):
  File "<pyshell#32>", line 1, in <module>
    next(gzin)
  File "/usr/lib/python3.4/gzip.py", line 594, in readline
    c = self.read(readsize)
  File "/usr/lib/python3.4/gzip.py", line 365, in read
    if not self._read(readsize):
  File "/usr/lib/python3.4/gzip.py", line 465, in _read
    self._read_eof()
  File "/usr/lib/python3.4/gzip.py", line 487, in _read_eof
    raise OSError("Incorrect length of data produced")
OSError: Incorrect length of data produced

It turns out that gzip.GzipFile.write() already increased the ISIZE field value by 8 during the failed call with the str object, so it is now 16 instead of 8:
>>> raw = open('test.gz','rb')
>>> [n for n in raw.read()] # ISIZE is the last four bytes (little-endian); its low byte is the fourth-last element
[31, 139, 8, 8, 51, 46, 127, 83, 2, 255, 116, 101, 115, 116, 0, 75, 76, 74, 78, 73, 77, 75, 207, 0, 0, 80, 42, 239, 174, 16, 0, 0, 0]

In other words, gzip.GzipFile.write() leaps (and modifies its state) before it checks its input argument.
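
For reference, RFC 1952 defines the gzip trailer as the last eight bytes of a member: CRC32 followed by ISIZE, both little-endian 32-bit values. A minimal sketch that decodes the byte list shown above; the helper name `read_trailer` is made up for this illustration and is not part of the gzip module:

```python
import struct

# Byte values copied from the report above (contents of the corrupted test.gz).
raw = bytes([31, 139, 8, 8, 51, 46, 127, 83, 2, 255, 116, 101, 115, 116, 0,
             75, 76, 74, 78, 73, 77, 75, 207, 0, 0,
             80, 42, 239, 174, 16, 0, 0, 0])


def read_trailer(gz_bytes):
    """Return (crc32, isize) from the last 8 bytes of a single-member gzip stream."""
    # "<II" = two little-endian unsigned 32-bit integers.
    return struct.unpack("<II", gz_bytes[-8:])


crc32, isize = read_trailer(raw)
print(hex(crc32))  # 0xaeef2a50
print(isize)       # 16, although only 8 bytes of data were actually compressed
```

The trailer claims 16 bytes of uncompressed data while the deflate stream holds only 8, which is why the reader raises "Incorrect length of data produced".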
msg218963 - (view) Author: Wolfgang Maier (wolma) * Date: 2014-05-23 11:44
OK, this seems to be really easy: patch attached.
msg218964 - (view) Author: Wolfgang Maier (wolma) * Date: 2014-05-23 11:55
Or not: my patch just causes a different error in my example :(
msg218965 - (view) Author: PCManticore (Claudiu.Popa) * (Python triager) Date: 2014-05-23 12:02
Moving `self.crc = zlib.crc32(data, self.crc) & 0xffffffff` before `self.size = self.size + len(data)` should be enough. Also, your patch needs a test.
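
The reordering suggested here can be sketched as follows. This is a simplified stand-in rather than the attached patch or the committed fix: the class `CheckedWriter` is hypothetical and only models the size/CRC bookkeeping, relying on the fact that zlib.crc32() raises TypeError for a str argument.

```python
import zlib


class CheckedWriter:
    """Hypothetical model of the size/CRC bookkeeping done in GzipFile.write()."""

    def __init__(self):
        self.size = 0
        self.crc = zlib.crc32(b"")

    def write(self, data):
        # Compute the CRC first: zlib.crc32() rejects str with TypeError,
        # so a bad argument fails before any attribute has been changed.
        self.crc = zlib.crc32(data, self.crc) & 0xffffffff
        # Only reached for valid input, so size and crc stay in sync.
        self.size = self.size + len(data)
        return len(data)


w = CheckedWriter()
try:
    w.write('abcdefgh')   # wrong type, as in the report
except TypeError:
    pass
print(w.size)             # 0: the failed call left the state untouched
w.write(b'abcdefgh')
print(w.size)             # 8
```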
msg218966 - (view) Author: Wolfgang Maier (wolma) * Date: 2014-05-23 12:20
Isn't this exactly what I did in my patch?

Actually, it is working; I just had an error in my preliminary test script.

I may be able to work on an official test at some point, but definitely not over the next week.
msg239016 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2015-03-23 13:28
New changeset 4dfe0634d11a by Serhiy Storchaka in branch '2.7':
Issue #21560: An attempt to write a data of wrong type no longer cause
https://hg.python.org/cpython/rev/4dfe0634d11a

New changeset 6eb48b22ff5c by Serhiy Storchaka in branch '3.4':
Issue #21560: An attempt to write a data of wrong type no longer cause
https://hg.python.org/cpython/rev/6eb48b22ff5c
msg239017 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-03-23 13:31
Tests are taken from issue23688. Thanks for your contribution, Wolfgang.
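
A regression check for this behavior might look roughly like the sketch below. It is not the test taken from issue23688; the function name `roundtrip_after_bad_write` is invented for illustration, and it assumes a Python version that includes the fix.

```python
import gzip
import io


def roundtrip_after_bad_write():
    """Write a str (rejected), then bytes, and return the decompressed result."""
    buf = io.BytesIO()
    with gzip.GzipFile(fileobj=buf, mode='wb') as gz:
        try:
            gz.write('abcdefgh')   # wrong type: must fail without side effects
        except TypeError:
            pass
        gz.write(b'abcdefgh')      # valid write after the failed one
    # Decompression verifies the CRC32 and ISIZE trailer fields.
    return gzip.decompress(buf.getvalue())


print(roundtrip_after_bad_write())
```

On a fixed interpreter the stream decompresses cleanly; on an affected one, decompression is expected to fail with the length error shown in the original report.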
History
Date User Action Args
2022-04-11 14:58:03  admin  set  github: 65759
2015-03-23 13:31:53  serhiy.storchaka  set  status: open -> closed
    assignee: serhiy.storchaka
    nosy: + serhiy.storchaka
    messages: + msg239017
    resolution: fixed
    stage: test needed -> resolved
2015-03-23 13:28:05  python-dev  set  nosy: + python-dev
    messages: + msg239016
2014-05-26 08:31:07  pitrou  set  nosy: + nadeem.vawda
2014-05-25 18:59:05  serhiy.storchaka  set  stage: needs patch -> test needed
2014-05-23 12:20:26  wolma  set  messages: + msg218966
2014-05-23 12:02:17  Claudiu.Popa  set  nosy: + Claudiu.Popa
    messages: + msg218965
2014-05-23 11:55:32  wolma  set  messages: + msg218964
2014-05-23 11:44:41  wolma  set  files: + GzipFile_write.patch
    keywords: + patch
    messages: + msg218963
2014-05-23 11:40:12  serhiy.storchaka  set  keywords: + easy
    stage: needs patch
    versions: + Python 2.7, Python 3.5
2014-05-23 11:22:57  wolma  create