classification
Title: ValueError in TarFile.getmembers
Type: behavior
Stage:
Components: Library (Lib)
Versions: Python 3.7
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: jvoisin, lars.gustaebel, serhiy.storchaka, terry.reedy
Priority: normal
Keywords:

Created on 2019-12-16 10:43 by jvoisin, last changed 2019-12-23 16:45 by jvoisin.

Files
File name Uploaded Description
crash-7221297307ab37ac87be6ea6dd9b28d4d453c557aa3da8a2138ab98e015cd42a jvoisin, 2019-12-16 10:43
Messages (5)
msg358472 - Author: jvoisin (jvoisin) Date: 2019-12-16 10:43
The attached file produces the following stack trace when opened with `tarfile.open` and iterated with `TarFile.getmembers` on Python 3.7.5rc1:

```
$ cat tarrepro.py
import tarfile
import sys

with tarfile.open(sys.argv[1]) as t:
  for member in t.getmembers():
    pass
```

```
$ python3 tarrepro.py crash-7221297307ab37ac87be6ea6dd9b28d4d453c557aa3da8a2138ab98e015cd42a
Traceback (most recent call last):
  File "tarrepro.py", line 5, in <module>
    for member in t.getmembers():
  File "/usr/lib/python3.7/tarfile.py", line 1763, in getmembers
    self._load()        # all members, we first have to
  File "/usr/lib/python3.7/tarfile.py", line 2350, in _load
    tarinfo = self.next()
  File "/usr/lib/python3.7/tarfile.py", line 2281, in next
    self.fileobj.seek(self.offset - 1)
ValueError: cannot fit 'int' into an offset-sized integer
```

This file isn't a valid tar file; it was created by a fuzzer.
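
The last frame of the traceback suggests that an offset derived from the corrupt header was too large for the platform's file offset type. A minimal sketch of that failure mode, independent of `tarfile`, assuming a buffered binary file object similar to the one `tarfile.open` uses for a path on disk. The helper name `try_seek` and the value `2**70` are hypothetical stand-ins, not taken from the crash file:

```python
import tempfile


def try_seek(offset):
    """Seek a buffered binary temp file to `offset`.

    Returns the name of the exception type raised, or None on success.
    Both ValueError and OverflowError are caught because the exact type
    may depend on the file object and the Python version (assumption).
    """
    with tempfile.TemporaryFile() as f:
        try:
            f.seek(offset)
        except (ValueError, OverflowError) as exc:
            return type(exc).__name__
    return None


if __name__ == "__main__":
    print(try_seek(0))      # an ordinary offset
    print(try_seek(2**70))  # an offset that cannot fit in a 64-bit file offset
```

If this reading is right, the exception comes from the underlying file object rather than from any validation inside `tarfile` itself.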
msg358737 - Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2019-12-20 20:41
See #39065, #39067 for similar tarfile issues.
msg358738 - Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2019-12-20 20:52
jvoisin, what do you consider to be the bug? Raising an exception is exactly the right thing to do on bad input. I leave it to others to decide whether this should be closed as 'not a bug' or whether the internal exception should be caught and replaced. We don't pretend to document all possible exceptions from all functions.

The more important aim of fuzzing is to find inputs that cause no-exception crashes.
msg358739 - Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2019-12-20 20:56
jvoisin, please consider rerunning such reproducers with the latest 3.8 and 3.9 before submitting. It is much easier for you to do so when you have the fuzz file, script, and command line already present.
msg358827 - Author: jvoisin (jvoisin) Date: 2019-12-23 16:45
Raising an exception is OK if it's documented, so that I know which ones to catch to keep my program from quitting when it processes untrusted files, without having to catch `Exception`.

Reliability is important in my use-case as well, not only exploitable memory-corruption issues.

I'll try to reproduce future issues on more recent Python versions before reporting them :)
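
For illustration, the concern in msg358827 about which exceptions to catch could be handled along these lines. This is only a sketch: the tuple `SUSPECT_ERRORS` and the helper `safe_member_names` are hypothetical names, and the set of exception types is a guess based on this report, not a list that the `tarfile` documentation guarantees to be complete.

```python
import tarfile

# Assumption: a best-effort guess at what malformed archives can raise.
# tarfile.TarError covers the module's own errors; ValueError is the type
# seen in this report; the rest are included defensively.
SUSPECT_ERRORS = (tarfile.TarError, ValueError, OverflowError, OSError, EOFError)


def safe_member_names(path):
    """Return the member names of the archive at `path`, or None if it cannot be read."""
    try:
        with tarfile.open(path) as archive:
            return [member.name for member in archive.getmembers()]
    except SUSPECT_ERRORS as exc:
        # Report and skip the file instead of letting the program quit.
        print(f"skipping {path!r}: {type(exc).__name__}: {exc}")
        return None
```

Listing the types explicitly avoids a blanket `except Exception`, at the cost of having to extend the tuple whenever a fuzzer finds a new exception type.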
History
Date User Action Args
2019-12-23 16:45:45 jvoisin set messages: + msg358827
2019-12-20 20:56:53 terry.reedy set messages: + msg358739
2019-12-20 20:52:47 terry.reedy set messages: + msg358738
2019-12-20 20:41:37 terry.reedy set nosy: + terry.reedy; messages: + msg358737
2019-12-20 20:39:31 terry.reedy set nosy: + lars.gustaebel, serhiy.storchaka
2019-12-16 10:43:36 jvoisin create