classification
Title: EOFError in tarfile.open
Type: behavior
Stage:
Components: Documentation, Library (Lib)
Versions: Python 3.7
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To: docs@python
Nosy List: docs@python, jvoisin, ronaldoussoren
Priority: normal
Keywords:

Created on 2019-12-16 14:21 by jvoisin, last changed 2019-12-17 09:45 by jvoisin.

Files
File name Uploaded Description
crash-f4032ed3c7c2ae59a8f4424e0e73ce8b11ad3ef90155b008968f5b1b08499bc4 jvoisin, 2019-12-16 14:21
Messages (5)
msg358490 - Author: jvoisin (jvoisin) Date: 2019-12-16 14:21
The attached file produces the following stacktrace when opened via `tarfile.open`, on Python 3.7.5rc1:


```
$ cat tarrepro.py 
import tarfile
import sys

with tarfile.open(sys.argv[1], errorlevel=2) as t:
  for member in t.getmembers():
    pass
$
```

```
$ python3 tarrepro.py crash-f4032ed3c7c2ae59a8f4424e0e73ce8b11ad3ef90155b008968f5b1b08499bc4
Traceback (most recent call last):
  File "tarrepro.py", line 4, in <module>
    with tarfile.open(sys.argv[1], errorlevel=2) as t:
  File "/usr/lib/python3.7/tarfile.py", line 1574, in open
    return func(name, "r", fileobj, **kwargs)
  File "/usr/lib/python3.7/tarfile.py", line 1646, in gzopen
    t = cls.taropen(name, mode, fileobj, **kwargs)
  File "/usr/lib/python3.7/tarfile.py", line 1622, in taropen
    return cls(name, mode, fileobj, **kwargs)
  File "/usr/lib/python3.7/tarfile.py", line 1485, in __init__
    self.firstmember = self.next()
  File "/usr/lib/python3.7/tarfile.py", line 2290, in next
    tarinfo = self.tarinfo.fromtarfile(self)
  File "/usr/lib/python3.7/tarfile.py", line 1094, in fromtarfile
    buf = tarfile.fileobj.read(BLOCKSIZE)
  File "/usr/lib/python3.7/gzip.py", line 276, in read
    return self._buffer.read(size)
  File "/usr/lib/python3.7/_compression.py", line 68, in readinto
    data = self.read(len(byte_view))
  File "/usr/lib/python3.7/gzip.py", line 463, in read
    if not self._read_gzip_header():
  File "/usr/lib/python3.7/gzip.py", line 421, in _read_gzip_header
    self._read_exact(extra_len)
  File "/usr/lib/python3.7/gzip.py", line 400, in _read_exact
    raise EOFError("Compressed file ended before the "
EOFError: Compressed file ended before the end-of-stream marker was reached

```
msg358492 - Author: Ronald Oussoren (ronaldoussoren) (Python committer) Date: 2019-12-16 15:12
This looks like expected behaviour: the attached file is an incomplete compressed file that does not seem to contain any data, according to gzcat:

gzcat: crash-f4032ed3c7c2ae59a8f4424e0e73ce8b11ad3ef90155b008968f5b1b08499bc4: unexpected end of file
gzcat: crash-f4032ed3c7c2ae59a8f4424e0e73ce8b11ad3ef90155b008968f5b1b08499bc4: uncompress failed
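
The traceback above ends in `_read_exact(extra_len)`, which suggests the gzip header announces more "extra" data than the file contains. A minimal sketch of that situation, using a hand-built header rather than the attached file (the bytes in `truncated` and the helper `raises_eoferror` are assumptions made for illustration):

```python
import gzip
import io
import struct

# Hypothetical malformed input: a gzip header whose FEXTRA flag (0x04)
# announces 0xFFFF bytes of extra data, followed by only three bytes.
header = b"\x1f\x8b" + struct.pack("<BBIBB", 8, 0x04, 0, 0, 255)
truncated = header + struct.pack("<H", 0xFFFF) + b"abc"


def raises_eoferror(data):
    """Return True if decompressing data with gzip raises EOFError."""
    try:
        gzip.GzipFile(fileobj=io.BytesIO(data)).read()
    except EOFError:
        return True
    return False


print(raises_eoferror(truncated))            # header ends too early
print(raises_eoferror(gzip.compress(b"ok")))  # complete stream for comparison
```

Because the error comes from the gzip layer, `tarfile` itself never gets to inspect a tar header in this case.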
msg358493 - Author: jvoisin (jvoisin) Date: 2019-12-16 15:17
Unfortunately, the documentation (https://docs.python.org/3/library/tarfile.html) doesn't mention that EOFError can be raised by tarfile.open :/
msg358494 - Author: Ronald Oussoren (ronaldoussoren) (Python committer) Date: 2019-12-16 15:25
In general, the stdlib documentation does not exhaustively list the exceptions that might be raised.
msg358540 - Author: jvoisin (jvoisin) Date: 2019-12-17 09:45
Does it mean that the right™ way to process untrusted tar files is
to wrap every call to functions from tarfile.py in a `try: … except Exception:` block?
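
One narrower alternative to `except Exception:` might look like the sketch below. It is not an answer from the tarfile maintainers: the helper names `list_members` and `make_tar_gz` are hypothetical, and the tuple of caught exceptions is an assumption based on the traceback in this issue, not an exhaustive list.

```python
import io
import tarfile


def list_members(fileobj):
    """Return the member names of an archive, or None if it cannot be read."""
    try:
        with tarfile.open(fileobj=fileobj, mode="r:*", errorlevel=2) as archive:
            return [member.name for member in archive.getmembers()]
    except (tarfile.TarError, EOFError, OSError):
        # Assumed, non-exhaustive set: tarfile's own errors, the EOFError
        # reported in this issue, and OSError from the I/O or gzip layer.
        return None


def make_tar_gz(name, payload):
    """Build a small gzip-compressed tar archive in memory (demo helper)."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        info = tarfile.TarInfo(name)
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


good = make_tar_gz("hello.txt", b"hello")
print(list_members(io.BytesIO(good)))       # intact archive
print(list_members(io.BytesIO(good[:20])))  # stream cut off after 20 bytes
```

Keeping the `tarfile.open` call and the iteration inside one `try` block means a failure at either step is handled in the same place, at the cost of not distinguishing between them.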
History
Date User Action Args
2019-12-17 09:45:27 jvoisin set messages: + msg358540
2019-12-16 15:25:19 ronaldoussoren set nosy: + docs@python; messages: + msg358494; assignee: docs@python; components: + Documentation
2019-12-16 15:17:14 jvoisin set messages: + msg358493
2019-12-16 15:12:02 ronaldoussoren set nosy: + ronaldoussoren; messages: + msg358492
2019-12-16 14:21:45 jvoisin create