gzip.write changes trailer ISIZE field before type checking - corrupted gz file after trying to write string #65759

Closed
wm75 mannequin opened this issue May 23, 2014 · 7 comments
Labels: easy, stdlib (Python modules in the Lib dir), type-bug (An unexpected behavior, bug, or error)

Comments


wm75 mannequin commented May 23, 2014

BPO 21560
Nosy @PCManticore, @serhiy-storchaka, @wm75
Files
  • GzipFile_write.patch
  Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = 'https://github.com/serhiy-storchaka'
    closed_at = <Date 2015-03-23.13:31:53.274>
    created_at = <Date 2014-05-23.11:22:57.294>
    labels = ['easy', 'type-bug', 'library']
    title = 'gzip.write changes trailer ISIZE field before type checking - corrupted gz file after trying to write string'
    updated_at = <Date 2015-03-23.13:31:53.272>
    user = 'https://github.com/wm75'

    bugs.python.org fields:

    activity = <Date 2015-03-23.13:31:53.272>
    actor = 'serhiy.storchaka'
    assignee = 'serhiy.storchaka'
    closed = True
    closed_date = <Date 2015-03-23.13:31:53.274>
    closer = 'serhiy.storchaka'
    components = ['Library (Lib)']
    creation = <Date 2014-05-23.11:22:57.294>
    creator = 'wolma'
    dependencies = []
    files = ['35323']
    hgrepos = []
    issue_num = 21560
    keywords = ['patch', 'easy']
    message_count = 7.0
    messages = ['218961', '218963', '218964', '218965', '218966', '239016', '239017']
    nosy_count = 5.0
    nosy_names = ['nadeem.vawda', 'Claudiu.Popa', 'python-dev', 'serhiy.storchaka', 'wolma']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue21560'
    versions = ['Python 2.7', 'Python 3.4', 'Python 3.5']


    wm75 mannequin commented May 23, 2014

    I ran into this:

    >>> import gzip
    >>> gzout = gzip.open('test.gz','wb')
    >>> gzout.write('abcdefgh') # write expects bytes not str
    Traceback (most recent call last):
      File "<pyshell#2>", line 1, in <module>
        gzout.write('abcdefgh')
      File "/usr/lib/python3.4/gzip.py", line 343, in write
        self.crc = zlib.crc32(data, self.crc) & 0xffffffff
    TypeError: 'str' does not support the buffer interface
    
    >>> gzout.write(b'abcdefgh') # ok, use bytes instead
    8
    >>> gzout.close()

    But now the file is not recognized as valid gzip format anymore (neither by the gzip module nor by external software):

    >>> gzin = gzip.open('test.gz','rb')
    >>> next(gzin)
    Traceback (most recent call last):
      File "<pyshell#32>", line 1, in <module>
        next(gzin)
      File "/usr/lib/python3.4/gzip.py", line 594, in readline
        c = self.read(readsize)
      File "/usr/lib/python3.4/gzip.py", line 365, in read
        if not self._read(readsize):
      File "/usr/lib/python3.4/gzip.py", line 465, in _read
        self._read_eof()
      File "/usr/lib/python3.4/gzip.py", line 487, in _read_eof
        raise OSError("Incorrect length of data produced")
    OSError: Incorrect length of data produced
    
    It turns out that gzip.GzipFile.write() already added 8 to the ISIZE field during the failed call with the str object, so ISIZE is now 16 instead of 8:
    >>> raw = open('test.gz','rb')
    >>> [n for n in raw.read()] # ISIZE is the last four bytes (little-endian); its low byte is the fourth-last element
    [31, 139, 8, 8, 51, 46, 127, 83, 2, 255, 116, 101, 115, 116, 0, 75, 76, 74, 78, 73, 77, 75, 207, 0, 0, 80, 42, 239, 174, 16, 0, 0, 0]

    In other words: gzip.GzipFile.write() leaps (and modifies its state) before it checks its input argument.
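
For anyone who wants to check the trailer directly: in the gzip format, the last eight bytes of a member are the CRC-32 followed by ISIZE, both stored as little-endian 32-bit integers. A minimal sketch that decodes them from the byte list in the report (the helper name `read_trailer` is made up for this illustration and is not part of the gzip module):

```python
import struct

def read_trailer(data):
    """Return (crc32, isize) from the last 8 bytes of a single-member gzip file."""
    crc, isize = struct.unpack("<II", data[-8:])
    return crc, isize

# Bytes of the corrupted test.gz, copied from the report.
raw = bytes([31, 139, 8, 8, 51, 46, 127, 83, 2, 255, 116, 101, 115, 116, 0,
             75, 76, 74, 78, 73, 77, 75, 207, 0, 0, 80, 42, 239, 174, 16, 0, 0, 0])

crc, isize = read_trailer(raw)
# ISIZE decodes to 16 although only 8 bytes were successfully written.
print(hex(crc), isize)
```

Decoding both fields at once avoids having to count list elements by hand.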

    @wm75 wm75 mannequin added stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels May 23, 2014

    wm75 mannequin commented May 23, 2014

    OK, this seems to be really easy: patch attached.


    wm75 mannequin commented May 23, 2014

    or not - my patch just causes a different error in my example :(


    PCManticore mannequin commented May 23, 2014

    Moving self.crc = zlib.crc32(data, self.crc) & 0xffffffff before self.size = self.size + len(data) should be enough. Also, your patch needs a test.
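
To illustrate why the order of those two statements matters, here is a toy sketch. `ToyWriter` and `size_after_bad_write` are hypothetical names invented for this example; the class only mimics the `size`/`crc` bookkeeping and is not the real `GzipFile` implementation.

```python
import zlib

class ToyWriter:
    """Hypothetical stand-in for the size/crc bookkeeping; not the real GzipFile."""

    def __init__(self):
        self.size = 0
        self.crc = zlib.crc32(b"")

    def write_size_first(self, data):
        # Order described in the report: size is bumped before crc32() rejects a str.
        self.size = self.size + len(data)
        self.crc = zlib.crc32(data, self.crc) & 0xffffffff

    def write_crc_first(self, data):
        # Suggested order: crc32() raises TypeError for a str before size changes.
        self.crc = zlib.crc32(data, self.crc) & 0xffffffff
        self.size = self.size + len(data)

def size_after_bad_write(method_name):
    """Call the named method with a str and report the recorded size afterwards."""
    writer = ToyWriter()
    try:
        getattr(writer, method_name)("abcdefgh")  # str, not bytes
    except TypeError:
        pass
    return writer.size

print(size_after_bad_write("write_size_first"))  # 8: state changed despite the error
print(size_after_bad_write("write_crc_first"))   # 0: state untouched
```

With the crc computed first, the TypeError is raised before any state is modified, so a later valid write starts from a consistent size.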


    wm75 mannequin commented May 23, 2014

    Isn't this exactly what I did in my patch?

    Actually, it is working; I just had an error in my preliminary test script.

    I may be able to work on an official test at some point, but definitely not over the next week.


    python-dev mannequin commented Mar 23, 2015

    New changeset 4dfe0634d11a by Serhiy Storchaka in branch '2.7':
    Issue bpo-21560: An attempt to write a data of wrong type no longer cause
    https://hg.python.org/cpython/rev/4dfe0634d11a

    New changeset 6eb48b22ff5c by Serhiy Storchaka in branch '3.4':
    Issue bpo-21560: An attempt to write a data of wrong type no longer cause
    https://hg.python.org/cpython/rev/6eb48b22ff5c

    @serhiy-storchaka
    Member

    Tests are taken from bpo-23688. Thanks for your contribution, Wolfgang.
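
The tests themselves are not shown in this thread. As a rough idea of what a regression check for this issue could look like (a sketch only, not the test that was committed; the function name is made up), one can write to an in-memory file, provoke the TypeError, and then confirm that the resulting stream still decompresses:

```python
import gzip
import io

def roundtrip_after_bad_write():
    """Write a str (expected to fail), then bytes, and return the decompressed result."""
    buf = io.BytesIO()
    with gzip.GzipFile(fileobj=buf, mode="wb") as gz:
        try:
            gz.write("abcdefgh")   # wrong type: str instead of bytes
        except TypeError:
            pass
        gz.write(b"abcdefgh")      # valid write after the failed one
    # On a Python version that contains the fix, the trailer is consistent
    # and decompression succeeds.
    return gzip.decompress(buf.getvalue())

print(roundtrip_after_bad_write())
```

On an affected version, the final decompression step would be expected to fail with the "Incorrect length of data produced" error from the original report.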

    @serhiy-storchaka serhiy-storchaka self-assigned this Mar 23, 2015
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022