Title: zlib.error with tarfile
Type: behavior
Components: Library (Lib) Versions: Python 3.11, Python 3.10, Python 3.9
Status: closed Resolution: fixed
Nosy List: christian.heimes, ethan.furman, jack__d, jvoisin, lukasz.langa
Keywords: patch

Created on 2019-12-13 16:00 by jvoisin, last changed 2022-04-11 14:59 by admin. This issue is now closed.

crash-c10c9839d987fa0df6912cb4084f43f3ce08ca82 jvoisin, 2019-12-13 16:00
Author: jvoisin Date: 2019-12-13 16:00
The attached file produces the following stacktrace when opened via `tarfile`, on Python 3.7.5rc1:

$ cat 
import sys
import tarfile

tarfile.open(sys.argv[1])
$ python3 ./crash-c10c9839d987fa0df6912cb4084f43f3ce08ca82
Traceback (most recent call last):
  File "", line 4, in <module>
    tarfile.open(sys.argv[1])
  File "/usr/lib/python3.7/", line 1573, in open
    return func(name, "r", fileobj, **kwargs)
  File "/usr/lib/python3.7/", line 1645, in gzopen
    t = cls.taropen(name, mode, fileobj, **kwargs)
  File "/usr/lib/python3.7/", line 1621, in taropen
    return cls(name, mode, fileobj, **kwargs)
  File "/usr/lib/python3.7/", line 1484, in __init__
    self.firstmember =
  File "/usr/lib/python3.7/", line 2289, in next
    tarinfo = self.tarinfo.fromtarfile(self)
  File "/usr/lib/python3.7/", line 1094, in fromtarfile
    buf =
  File "/usr/lib/python3.7/", line 276, in read
  File "/usr/lib/python3.7/", line 68, in readinto
    data =
  File "/usr/lib/python3.7/", line 471, in read
    uncompress = self._decompressor.decompress(buf, size)
zlib.error: Error -3 while decompressing data: invalid distances set
Author: Christian Heimes Date: 2019-12-13 16:34
This file is also an invalid tar file:

$ tar xf crash-c10c9839d987fa0df6912cb4084f43f3ce08ca82 

gzip: stdin: invalid compressed data--format violated
tar: Child returned status 1
tar: Error is not recoverable: exiting now
Author: jvoisin Date: 2019-12-13 16:38
Sure, but as a user, I would expect a better exception, like ValueError or ReadError, along with an error message, instead of an unexpected zlib exception.
Author: Jack DeVries Date: 2021-08-18 00:42
@jvoisin I am able to reproduce the problem when I download your script, but I am having a hard time reproducing it by passing my own corrupt archives to `tarfile`. How exactly was this file corrupted? I am trying to figure out if there are any similar implementation leaks / poor error messages in similar scenarios so I can do my best to patch them all.

You can see the reproduction scripts I am using here to get a better idea of what I have been trying. Be forewarned, they are pretty gnarly!
Author: jvoisin Date: 2021-08-20 18:44
The file was created with a fuzzer, like the one described in
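
For readers who want to explore how mutated archives behave, a rough sketch of byte-level mutation follows. It is not the fuzzer referenced above: `build_archive`, `mutate`, and `survey` are hypothetical helper names introduced for illustration, and the exception types reported will vary with the Python version in use.

```python
import io
import random
import tarfile
from collections import Counter


def build_archive() -> bytes:
    """Create a small, valid gzip-compressed tar archive in memory."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as archive:
        payload = b"example payload" * 64
        info = tarfile.TarInfo(name="example.txt")
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def mutate(data: bytes, rng: random.Random, flips: int = 4) -> bytes:
    """Overwrite a few randomly chosen bytes with random values."""
    corrupted = bytearray(data)
    for _ in range(flips):
        corrupted[rng.randrange(len(corrupted))] = rng.randrange(256)
    return bytes(corrupted)


def survey(trials: int = 200, seed: int = 0) -> Counter:
    """Count which exception type (if any) each mutated archive raises."""
    rng = random.Random(seed)
    original = build_archive()
    outcomes = Counter()
    for _ in range(trials):
        source = io.BytesIO(mutate(original, rng))
        try:
            with tarfile.open(fileobj=source, mode="r:gz") as archive:
                archive.getnames()
            outcomes["no error"] += 1
        except Exception as exc:
            # Deliberately broad: the goal is to see which types reach the caller.
            kind = type(exc)
            outcomes[f"{kind.__module__}.{kind.__qualname__}"] += 1
    return outcomes
```

Calling `survey()` returns a `Counter` keyed by exception class. Any key other than `tarfile.ReadError` or `no error` is an exception type that reached the caller without being translated.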
Author: Łukasz Langa Date: 2021-09-29 09:25
New changeset b6fe8572509b77d2002eaddf99d718e9b4835684 by Jack DeVries in branch 'main':
bpo-39039: tarfile raises descriptive exception from zlib.error (GH-27766)
Author: Łukasz Langa Date: 2021-09-29 10:19
New changeset d6b69f21d8ec4af47a9c79f3f50d20be3d0875fc by Łukasz Langa in branch '3.10':
[3.10] bpo-39039: tarfile raises descriptive exception from zlib.error (GH-27766) (GH-28613)
Author: Łukasz Langa Date: 2021-09-29 10:56
New changeset 7bff4d396f20451f20977be3ce23a879c6bc3e46 by Łukasz Langa in branch '3.9':
[3.9] bpo-39039: tarfile raises descriptive exception from zlib.error (GH-27766) (GH-28614)
Author: Łukasz Langa Date: 2021-09-29 10:58
Thanks for the fix, Jack! ✨ 🍰 ✨  

Since the change translates `zlib.error` into `tarfile.ReadError`, which user code already has to handle, it strictly decreases the surface of necessary exception handling. So, treating this as a bug fix, I backported it to 3.9 and 3.10 as well.
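
As an illustration of what this means for calling code, here is a minimal sketch, not taken from the patch itself. `list_members`, `make_archive`, and `CORRUPT` are hypothetical names. The extra `zlib.error` clause is only there on the assumption that the code may also run on an interpreter that predates the fix.

```python
import io
import tarfile
import zlib


def list_members(data: bytes):
    """Return member names of a gzip-compressed tar archive, or None if unreadable."""
    try:
        with tarfile.open(fileobj=io.BytesIO(data), mode="r:gz") as archive:
            return archive.getnames()
    except tarfile.ReadError:
        # Interpreters with the fix report corrupt compressed data this way.
        return None
    except zlib.error:
        # Interpreters without the fix may leak the underlying zlib exception.
        return None


def make_archive() -> bytes:
    """Build a valid one-member archive in memory for comparison."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as archive:
        payload = b"hello"
        info = tarfile.TarInfo(name="hello.txt")
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


# Hypothetical corrupt input: a well-formed 10-byte gzip header followed by
# bytes that do not form a valid deflate stream.
CORRUPT = b"\x1f\x8b\x08\x00" + b"\x00" * 4 + b"\x00\xff" + b"\xff" * 32

print(list_members(make_archive()))
print(list_members(CORRUPT))
```

On an interpreter that includes the fix, the second `except` clause should never be reached for this input and can be dropped.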
