Gzip cannot handle zero-padded output + patch #47095

Closed
tadek mannequin opened this issue May 13, 2008 · 6 comments
Labels: extension-modules (C modules in the Modules dir), type-bug (An unexpected behavior, bug, or error)

@tadek

tadek mannequin commented May 13, 2008

BPO 2846
Nosy @pitrou, @briancurtin
Files
  • python2.5.2-gzip.patch: Patch to fix zero-padded archive handling in gzip.
  • issue2846.diff: change, tests, docs against r77470
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = 'https://github.com/briancurtin'
    closed_at = <Date 2010-01-13.14:42:16.858>
    created_at = <Date 2008-05-13.22:16:22.794>
    labels = ['extension-modules', 'type-bug']
    title = 'Gzip cannot handle zero-padded output + patch'
    updated_at = <Date 2010-01-13.14:42:16.856>
    user = 'https://bugs.python.org/tadek'

    bugs.python.org fields:

    activity = <Date 2010-01-13.14:42:16.856>
    actor = 'pitrou'
    assignee = 'brian.curtin'
    closed = True
    closed_date = <Date 2010-01-13.14:42:16.858>
    closer = 'pitrou'
    components = ['Extension Modules']
    creation = <Date 2008-05-13.22:16:22.794>
    creator = 'tadek'
    dependencies = []
    files = ['10320', '15856']
    hgrepos = []
    issue_num = 2846
    keywords = ['patch', 'needs review']
    message_count = 6.0
    messages = ['66806', '97684', '97686', '97694', '97720', '97721']
    nosy_count = 3.0
    nosy_names = ['pitrou', 'tadek', 'brian.curtin']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue2846'
    versions = ['Python 2.6', 'Python 3.1', 'Python 2.7', 'Python 3.2']

    There are cases when gzip produces/receives a zero-padded output, for
    example when creating a compressed tar archive with a pipe:

    tar cz /dev/null > foo.tgz

    ls -la foo.tgz
    -rw-r----- 1 tadek tadek 10240 May 13 23:40 foo.tgz

    tar tvfz foo.tgz
    crw-rw-rw- root/root 1,3 2007-10-18 18:27:25 dev/null

    This is a known behavior (http://www.gzip.org/#faq8) and recent versions
    of gzip handle it gracefully by skipping all zero bytes after the end of
    the file (see gzip.c:1394-1406 in the version 1.3.12).

    The Python gzip module fails on those files with an IOError:

    #:~/python2.5/py2.5$ tar cz /dev/null > foo.tgz
    tar: Removing leading `/' from member names
    #:~/python2.5/py2.5$ bin/python
    Python 2.5.2 (r252:60911, May 14 2008, 00:02:24)
    [GCC 4.0.3 (Ubuntu 4.0.3-1ubuntu5)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import gzip
    >>> f=gzip.open("foo.tgz")
    >>> f.read()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/home/tadek/python2.5/py2.5/lib/python2.5/gzip.py", line 220, in read
        self._read(readsize)
      File "/home/tadek/python2.5/py2.5/lib/python2.5/gzip.py", line 263, in _read
        self._read_gzip_header()
      File "/home/tadek/python2.5/py2.5/lib/python2.5/gzip.py", line 164, in _read_gzip_header
        raise IOError, 'Not a gzipped file'
    IOError: Not a gzipped file
    >>>

    The proposed patch fixes this behavior by reading past all zero bytes at
    the end of the file. I tested that it works with regular archives,
    zero-padded archives, concatenated archives, and concatenated zero-padded
    archives.

    Regards,
    Tadek
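
For readers who want to reproduce the four cases listed in the report without tar, here is a minimal sketch. It assumes a Python version that already includes the fix and provides `gzip.compress` (3.2 or later). The helper name `read_padded` and the 10240-byte block size (chosen to mimic the file size shown in the report) are illustrative assumptions, not part of the patch.

```python
import gzip
import io


def read_padded(data: bytes) -> bytes:
    """Decompress gzip data that may carry trailing zero padding.

    Hypothetical helper for illustration; it relies only on gzip.GzipFile.
    """
    with gzip.GzipFile(fileobj=io.BytesIO(data)) as f:
        return f.read()


payload = b"hello, gzip"
member = gzip.compress(payload)

# Pad with zero bytes up to a multiple of 10240, mimicking the size of
# foo.tgz in the report (the block size is an assumption for this sketch).
BLOCK = 10240
padded = member + b"\x00" * (-len(member) % BLOCK)

# The four cases mentioned in the report.
plain = read_padded(member)                         # regular archive
zero_padded = read_padded(padded)                   # zero-padded archive
concatenated = read_padded(member + member)         # concatenated archives
concatenated_padded = read_padded(padded + padded)  # concatenated, zero-padded
```

On an interpreter without the fix, the zero-padded cases would be expected to fail with the "Not a gzipped file" error shown in the traceback above.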

    @tadek (mannequin) added the extension-modules and type-bug labels May 13, 2008
    @briancurtin

    Here is tadek's patch updated for trunk, with a test added to it.

    I feel like this should be documented somewhere, but Doc/Library/gzip.rst doesn't feel right. Maybe it just needs a mention in the "What's new" or something?

    @briancurtin briancurtin self-assigned this Jan 13, 2010
    @briancurtin

    Updated patch with some documentation

    @pitrou

    pitrou commented Jan 13, 2010

    There is no need to write:

       try:
           [...]
       except IOError as err:
           self.fail(err)
    

    Just let the exception be raised and produce an error.
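
A sketch of what a test written this way might look like. The class name, test name, and test data are hypothetical and are not taken from the committed patch; the point is only that an unexpected exception needs no wrapper, since unittest already reports it as an error.

```python
import gzip
import io
import unittest


class ZeroPaddedGzipTest(unittest.TestCase):
    """Hypothetical test case, not the one from issue2846.diff."""

    def test_zero_padded_file(self):
        data = b"some test data"
        padded = gzip.compress(data) + b"\x00" * 50

        # No try/except here: if reading raises, the exception propagates
        # and unittest records the test as an error with a full traceback.
        with gzip.GzipFile(fileobj=io.BytesIO(padded)) as f:
            self.assertEqual(f.read(), data)
```

Saved to a file, this can be run with `python -m unittest <module>`.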

    @briancurtin

    Thanks for taking a look! Patch updated with that try/except removed.

    @pitrou

    pitrou commented Jan 13, 2010

    Thank you Brian. I've committed the patch into trunk and py3k. I haven't backported it to 2.6 and 3.1, since it's more a new feature than a bug fix.

    @pitrou pitrou closed this as completed Jan 13, 2010
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022