Error decompressing valid zlib data #52918

Closed
MatthewBrett mannequin opened this issue May 9, 2010 · 9 comments
Labels
tests Tests in the Lib/test dir type-bug An unexpected behavior, bug, or error

Comments

@MatthewBrett
Mannequin

MatthewBrett mannequin commented May 9, 2010

BPO 8672
Nosy @gpshead, @pitrou
Files
  • mat.bin: binary zlib-compressed data causing decompression error
  • zlib-8672.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = <Date 2010-05-11.23:39:18.028>
    created_at = <Date 2010-05-09.22:44:03.714>
    labels = ['type-bug', 'tests']
    title = 'Error decompressing valid zlib data'
    updated_at = <Date 2010-05-11.23:39:18.026>
    user = 'https://bugs.python.org/matthewbrett'

    bugs.python.org fields:

    activity = <Date 2010-05-11.23:39:18.026>
    actor = 'pitrou'
    assignee = 'none'
    closed = True
    closed_date = <Date 2010-05-11.23:39:18.028>
    closer = 'pitrou'
    components = ['Tests']
    creation = <Date 2010-05-09.22:44:03.714>
    creator = 'matthew.brett'
    dependencies = []
    files = ['17279', '17288']
    hgrepos = []
    issue_num = 8672
    keywords = ['patch']
    message_count = 9.0
    messages = ['105420', '105470', '105474', '105475', '105477', '105478', '105480', '105544', '105558']
    nosy_count = 3.0
    nosy_names = ['gregory.p.smith', 'pitrou', 'matthew.brett']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue8672'
    versions = ['Python 2.6', 'Python 3.1', 'Python 2.7', 'Python 3.2']

    @MatthewBrett
    Mannequin Author

    MatthewBrett mannequin commented May 9, 2010

    I have a valid zlib-compressed string, attached here as 'mat.bin' (1.7M), that causes an error when decompressed with zlib.decompress:

    >>> import zlib
    >>> data = open('mat.bin', 'rb').read()
    >>> out = zlib.decompress(data)
    Traceback (most recent call last):
      File "<ipython console>", line 1, in <module>
    error: Error -5 while decompressing data

    I know these data are valid, because I get the string I was expecting with:

    >>> dc_obj = zlib.decompressobj()
    >>> out = dc_obj.decompress(data)

    As expected, there is no remaining data after this read:

    >>> assert dc_obj.flush() == ''
    >>>

    I believe that the behavior of zlib.decompress(data) and zlib.decompressobj().decompress(data) should be equivalent, and that the error for zlib.decompress(data) is therefore the symptom of a bug.

    @MatthewBrett MatthewBrett mannequin added topic-IO type-bug An unexpected behavior, bug, or error labels May 9, 2010
    @pitrou pitrou added stdlib Python modules in the Lib dir and removed topic-IO labels May 9, 2010
    @pitrou
    Member

    pitrou commented May 10, 2010

    After a bit of debugging, it seems your data is not actually a complete zlib stream (*). What did you generate it with?

    (*) In technical terms, zlib never returns Z_STREAM_END when decompressing your data. The decompressobj ignores this, but the top-level decompress() function considers it an error.
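
The difference described above can be reproduced without mat.bin. The following is a minimal sketch, assuming a synthetic payload and assuming that dropping the 4-byte Adler-32 trailer is enough to keep zlib from reporting Z_STREAM_END. It also assumes Python 3.3 or later, where decompressor objects expose an `eof` attribute; the helper name `top_level_fails` is made up for illustration.

```python
import zlib

# Synthetic payload, standing in for the contents of mat.bin.
payload = b"scipy reads MATLAB .mat files " * 32
complete = zlib.compress(payload)

# Assumption for illustration: cut the 4-byte Adler-32 trailer so that
# zlib never reports Z_STREAM_END for this input.
truncated = complete[:-4]


def top_level_fails(data):
    """Return True if zlib.decompress() rejects the data."""
    try:
        zlib.decompress(data)
    except zlib.error:
        return True
    return False


# A decompressor object does not insist on seeing the end of the stream.
dc_obj = zlib.decompressobj()
recovered = dc_obj.decompress(truncated)

print(top_level_fails(complete))   # intact stream
print(top_level_fails(truncated))  # stream without its trailer
print(dc_obj.eof)                  # whether the end of the stream was seen
print(recovered == payload)        # whether the payload was still recovered
```

The `eof` attribute gives callers a way to notice an incomplete stream even though no exception is raised.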

    @MatthewBrett
    Mannequin Author

    MatthewBrett mannequin commented May 10, 2010

    Hi,

    Antoine Pitrou <pitrou@free.fr> added the comment:

    > After a bit of debugging, it seems your data is not actually a complete zlib stream (*). What did you generate it with?
    >
    > (*) In technical terms, zlib never returns Z_STREAM_END when decompressing your data. The decompressobj ignores this, but the top-level decompress() function considers it an error.

    Thanks for the debugging. The stream comes from within a matlab 'mat'
    file. I maintain the scipy matlab file readers; the variables within
    these files are zlib compressed streams.

    Is there (should there be) a safe and maintained way to allow me to
    read a stream that does not return Z_STREAM_END?

    @pitrou
    Member

    pitrou commented May 10, 2010

    > Thanks for the debugging. The stream comes from within a matlab 'mat'
    > file. I maintain the scipy matlab file readers; the variables within
    > these files are zlib compressed streams.

    So this would be a Matlab issue, right?

    > Is there (should there be) a safe and maintained way to allow me to
    > read a stream that does not return Z_STREAM_END?

    Decompressor objects allow you to do that, but I cannot tell you how
    "maintained" it is. If it has to be maintained, we could add a unit
    test for it so that regressions get detected. It would be nice if you
    could provide a very short zlib stream reproducing the issue.
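
One way a reader of such files might wrap a decompressor object is sketched below. This is not code from scipy or from the zlib module: the function name `decompress_lenient` and its return convention are assumptions made for illustration, and the `eof` attribute requires Python 3.3 or later.

```python
import zlib


def decompress_lenient(data):
    """Decompress zlib data, tolerating a missing end-of-stream marker.

    Hypothetical helper. Returns (output, complete), where complete is
    False if zlib never signalled the end of the stream. Data that is
    corrupt, rather than merely truncated, still raises zlib.error.
    """
    dc_obj = zlib.decompressobj()
    out = dc_obj.decompress(data)
    out += dc_obj.flush()  # collect anything still buffered
    return out, dc_obj.eof


original = b"abc" * 100
stream = zlib.compress(original)

full_out, full_complete = decompress_lenient(stream)
cut_out, cut_complete = decompress_lenient(stream[:-4])  # trailer removed
```

Returning the flag alongside the data leaves it to the caller to decide whether an incomplete stream should be accepted, logged, or rejected.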

    @pitrou
    Member

    pitrou commented May 10, 2010

    I also think we should improve the zlib module's error messages. I've added a patch in bpo-8681 for that. With that patch, the message you would have encountered would have been "Error -5 while decompressing data: incomplete or truncated stream", which is considerably more informative.

    @MatthewBrett
    Mannequin Author

    MatthewBrett mannequin commented May 10, 2010

    >> Thanks for the debugging. The stream comes from within a matlab 'mat'
    >> file. I maintain the scipy matlab file readers; the variables within
    >> these files are zlib compressed streams.

    > So this would be a Matlab issue, right?

    Yes, except scipy and numpy aim in part to be an open-source
    replacement for matlab, so we very much want to be able to read their
    files.

    >> Is there (should there be) a safe and maintained way to allow me to
    >> read a stream that does not return Z_STREAM_END?

    > Decompressor objects allow you to do that, but I cannot tell you how
    > "maintained" it is. If it has to be maintained, we could add a unit
    > test for it so that regressions get detected. It would be nice if you
    > could provide a very short zlib stream reproducing the issue.

    This is the only .mat file stream I have yet come across that causes
    the error. Is it possible to knock a portion off the end of a valid
    stream to reproduce the problem?
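
The idea of knocking bytes off the end of a valid stream can be tried directly. The sketch below uses a short synthetic payload (an assumption, not data from the .mat file) and checks every proper prefix of the compressed stream, comparing the top-level function with a decompressor object.

```python
import zlib

payload = b"foo" * 20
complete = zlib.compress(payload)

rejected = 0
for n in range(len(complete)):  # every proper prefix, including the empty one
    prefix = complete[:n]

    # The top-level function requires a complete stream.
    try:
        zlib.decompress(prefix)
    except zlib.error:
        rejected += 1

    # A decompressor object accepts the same prefix and yields whatever
    # leading part of the payload could be recovered from it.
    partial = zlib.decompressobj().decompress(prefix)
    assert payload.startswith(partial)

print(rejected == len(complete))  # were all proper prefixes rejected?
```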

    @pitrou
    Member

    pitrou commented May 10, 2010

    Ok, it turned out to be quite easy indeed. Here is a patch adding a test.
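
The attached patch is not reproduced in this thread. As a rough illustration only, a regression test along these lines might look like the sketch below; the class name, method names, and payload are assumptions and are not taken from zlib-8672.patch.

```python
import unittest
import zlib


class IncompleteStreamTest(unittest.TestCase):
    """Hypothetical regression test; not the code from the actual patch."""

    def setUp(self):
        self.payload = b"incomplete stream test " * 16
        # Drop the 4-byte Adler-32 trailer so the stream never ends.
        self.truncated = zlib.compress(self.payload)[:-4]

    def test_top_level_decompress_raises(self):
        # The one-shot function treats a missing stream end as an error.
        with self.assertRaises(zlib.error):
            zlib.decompress(self.truncated)

    def test_decompressobj_tolerates_missing_end(self):
        # Decompressor objects return the data without complaint.
        dco = zlib.decompressobj()
        out = dco.decompress(self.truncated) + dco.flush()
        self.assertEqual(out, self.payload)
```

Saved to a file, the tests can be run with `python -m unittest`.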

    @pitrou pitrou added tests Tests in the Lib/test dir and removed stdlib Python modules in the Lib dir labels May 10, 2010
    @gpshead
    Member

    gpshead commented May 11, 2010

    patch looks good.

    @pitrou
    Member

    pitrou commented May 11, 2010

    The patch was committed in r81094 (2.7), r81095 (2.6), r81096 (3.2) and r81097 (3.1). Thank you!

    @pitrou pitrou closed this as completed May 11, 2010
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022