
classification
Title: [Security] tarfile: Add absolute_path option to tarfile, disabled by default
Type: security Stage:
Components: Library (Lib) Versions: Python 3.11
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: berker.peksag, jwilk, lars.gustaebel, martin.panter, vstinner
Priority: normal Keywords:

Created on 2017-03-10 16:13 by vstinner, last changed 2022-04-11 14:58 by admin.

Messages (2)
msg289388 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-03-10 16:13
I noticed that "python3 -m tarfile -e archive.tar" extracts members with absolute paths by default, whereas the UNIX tar command doesn't. The UNIX tar command requires the --absolute-paths (-P) option to be passed explicitly.

I suggest adding a boolean absolute_path option to tarfile, disabled by default.

Example of creating such an archive. Note that tar also removes the leading "/" by default and requires -P to be passed explicitly:

$ cd $HOME
# /home/haypo
$ echo TEST > test
$ tar -cf test.tar /home/haypo/test
tar: Removing leading `/' from member names

$ rm -f test.tar
$ tar -P -cf test.tar /home/haypo/test
$ rm -f test


Extracting such an archive using tar is safe *by default*:

$ mkdir z
$ cd z
$ tar -xf ~/test.tar
tar: Removing leading `/' from member names
$ find
.
./home
./home/haypo
./home/haypo/test


Extracting such an archive using Python is unsafe:

$ python3 -m tarfile -e ~/test.tar
$ cat ~/test
TEST
$ pwd
/home/haypo/z

Python creates files outside the current directory, which is unsafe, whereas tar doesn't.
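
The difference can be reproduced without touching $HOME. The following is a minimal sketch, not the proposed patch: it builds an in-memory archive whose member name is absolute, then extracts it the way tar does by default, by stripping the leading "/" from member names. The helper names build_archive_with_absolute_member and extract_relative are hypothetical, and the sketch only handles regular files and leading slashes; it does not check ".." components or link targets.

```python
import io
import os
import tarfile


def build_archive_with_absolute_member(name, payload):
    """Return the bytes of a tar archive holding one regular file called `name`.

    Building the TarInfo by hand stores the name as given, so a leading "/"
    is kept, similar to what "tar -P -cf" produces.
    """
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        info = tarfile.TarInfo(name=name)
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def extract_relative(data, dest):
    """Extract regular files below `dest`, stripping leading "/" from names.

    Hypothetical helper: it mimics only the "Removing leading `/'" behaviour
    of tar and skips every member that is not a regular file.
    """
    written = []
    with tarfile.open(fileobj=io.BytesIO(data), mode="r") as tar:
        for member in tar:
            if not member.isfile():
                continue
            relative = member.name.lstrip("/")
            target = os.path.join(dest, relative)
            os.makedirs(os.path.dirname(target), exist_ok=True)
            with tar.extractfile(member) as source, open(target, "wb") as out:
                out.write(source.read())
            written.append(relative)
    return written
```

Calling extract_relative(build_archive_with_absolute_member("/home/haypo/test", b"TEST\n"), "z") would create z/home/haypo/test instead of writing to /home/haypo/test.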
msg289437 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2017-03-11 05:27
The CLI was added in Issue 13477. I didn’t see any discussion of traversal attacks there, so maybe it was overlooked. Perhaps there should also be a warning, like the one for the TarFile.extract and TarFile.extractall methods.

However I did see one of the goals was to keep the CLI simple, which I agree with. I would suggest that the CLI get this proposed behaviour by default (matching the default behaviour of modern “tar” commands), with no option to restore the current less-robust behaviour.

To implement it, I suggest fixing the module internals first: Issue 21109 and/or Issue 17102.

FWIW BSD calls the option “--absolute-paths” (plural paths) <https://www.freebsd.org/cgi/man.cgi?tar%281%29#OPTIONS>, while GNU calls it “--absolute-names” <https://www.gnu.org/software/tar/manual/html_chapter/tar_6.html#SEC121>. Both these options disable other checks, such as for parent directories (..) and external symbolic link targets, so I think the term “absolute” is too specific. But please at least replace the underscore with a dash or hyphen: “--absolute-path”, not “--absolute_path”.
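
To illustrate why “absolute” covers only part of the problem, here is a rough sketch of the three kinds of checks mentioned above (absolute names, parent directories, external symbolic link targets). The function name unsafe_reason is hypothetical and not part of tarfile. The sketch assumes POSIX-style member names and ignores hard links and Windows drive letters.

```python
import posixpath
import tarfile


def unsafe_reason(member):
    """Return a short reason why `member` looks unsafe, or None.

    `member` is a tarfile.TarInfo. Only three checks are made; this is an
    illustration, not a complete traversal filter.
    """
    name = member.name
    if name.startswith("/"):
        return "absolute path"
    if ".." in name.split("/"):
        return "parent directory reference"
    if member.issym():
        # Resolve the link target relative to the directory holding the link.
        target = posixpath.normpath(
            posixpath.join(posixpath.dirname(name), member.linkname)
        )
        if (
            member.linkname.startswith("/")
            or target == ".."
            or target.startswith("../")
        ):
            return "symbolic link target outside the extraction directory"
    return None
```

A CLI that refuses or rewrites every member for which unsafe_reason returns a value would cover more than absolute paths alone, which is the reason a broader option name may be preferable.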

History
Date                 User           Action  Args
2022-04-11 14:58:44  admin          set     github: 73974
2021-05-21 23:25:06  ned.deily      set     versions: + Python 3.11, - Python 3.7
2018-06-01 17:08:56  jwilk          set     nosy: + jwilk
2018-05-29 22:49:57  vstinner       set     title: tarfile: Add absolute_path option to tarfile, disabled by default -> [Security] tarfile: Add absolute_path option to tarfile, disabled by default
2017-03-22 09:34:34  berker.peksag  set     nosy: + berker.peksag
2017-03-11 05:29:40  martin.panter  link    issue21109 dependencies
2017-03-11 05:27:55  martin.panter  set     nosy: + martin.panter
                                            messages: + msg289437
2017-03-10 21:30:58  ned.deily      set     nosy: + lars.gustaebel
2017-03-10 16:13:58  vstinner       set     components: + Library (Lib)
                                            title: Add absolute_path option to tarfile, disabled by default -> tarfile: Add absolute_path option to tarfile, disabled by default
2017-03-10 16:13:44  vstinner       create