Classification
Title: tarfile: Do not write full path in FNAME field
Type: behavior
Stage: resolved
Components: Library (Lib)
Versions: Python 3.10, Python 3.9, Python 3.8
Process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: ArtemSBulgakov, eamanu, ethan.furman, lars.gustaebel, miss-islington
Priority: normal
Keywords: patch

Created on 2020-07-16 18:37 by ArtemSBulgakov, last changed 2020-10-21 08:10 by methane. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 21511 merged ArtemSBulgakov, 2020-07-16 18:42
PR 22140 merged miss-islington, 2020-09-07 16:46
PR 22141 merged miss-islington, 2020-09-07 16:46
Messages (6)
msg373759 - Author: Artem Bulgakov (ArtemSBulgakov) * Date: 2020-07-16 18:37
tarfile sets the gzip FNAME field to the path given by the user: Lib/tarfile.py:424

It writes the full path instead of just the basename if the user specified an absolute path. Some archive viewers, such as 7-Zip, may process the file incorrectly. It is also a security issue, because anyone who receives the archive can learn the directory structure of the system, the username, or other personal information.

You can reproduce this by running the lines below in the Python interpreter. Tested on Windows and Linux.

Python 3.8.2 (default, Apr 27 2020, 15:53:34)
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> import tarfile
>>> open("somefile.txt", "w").write("sometext")
8
>>> tar = tarfile.open("/home/bulgakovas/file.tar.gz", "w|gz")
>>> tar.add("somefile.txt")
>>> tar.close()
>>> open("file.tar.gz", "rb").read()[:50]
b'\x1f\x8b\x08\x08cE\x10_\x02\xff/home/bulgakovas/file.tar\x00\xed\xd3M\n\xc20\x10\x86\xe1\xac=EO\x90'

You can see the full path to file.tar (/home/bulgakovas/file.tar) in the FNAME field. If you write just tarfile.open("file.tar.gz", "w|gz"), FNAME will be file.tar.

RFC 1952 says about FNAME:
This is the original name of the file being compressed, with any directory components removed.

So tarfile must strip the directory names from FNAME and write only the basename of the file.
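To check the FNAME field of an archive without a hex dump, the gzip header can be parsed directly. The following is a minimal sketch based on the header layout in RFC 1952; the helper name read_gzip_fname is hypothetical and not part of tarfile or gzip. It is applied here to the first bytes of the output shown in the reproduction above.

```python
import posixpath

FEXTRA = 0x04  # FLG bit: optional extra field present (RFC 1952)
FNAME = 0x08   # FLG bit: original file name present (RFC 1952)


def read_gzip_fname(data: bytes):
    """Return the raw FNAME bytes of the first gzip member, or None if absent.

    Hypothetical helper for illustration only.
    """
    if data[:2] != b"\x1f\x8b":
        raise ValueError("not a gzip stream")
    flags = data[3]
    pos = 10  # fixed part: ID1 ID2 CM FLG MTIME(4 bytes) XFL OS
    if flags & FEXTRA:
        # The extra field is prefixed with a 2-byte little-endian length.
        pos += 2 + int.from_bytes(data[pos:pos + 2], "little")
    if not flags & FNAME:
        return None
    end = data.index(b"\x00", pos)  # FNAME is zero-terminated
    return data[pos:end]


# First bytes of the archive from the reproduction above.
sample = b"\x1f\x8b\x08\x08cE\x10_\x02\xff/home/bulgakovas/file.tar\x00\xed\xd3M"
fname = read_gzip_fname(sample)
print(fname)  # b'/home/bulgakovas/file.tar'

# What RFC 1952 asks for: the name with directory components removed.
print(posixpath.basename(fname.decode("latin-1")))  # file.tar
```

The sketch only reads the field; it makes no claim about how the fix itself is implemented in Lib/tarfile.py.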
msg373790 - Author: Emmanuel Arias (eamanu) * Date: 2020-07-17 01:55
Hi,

If I understand correctly, the name that you are using inside the tar
is the basename of the file. I haven't tested it yet, but will this PR
remove the ability to add a file to the tar under its
source tree folder?

Maybe we could think about implementing a parameter similar to arcname
in ZipFile?

What about that?

Cheers!
msg373798 - Author: Artem Bulgakov (ArtemSBulgakov) * Date: 2020-07-17 07:23
Hi. My PR doesn't remove the ability to add a directory tree to a tar file. It only fixes the header for gzip compression. The data after this header is not affected.

You can test this by creating two archives with the same data, one with my patch and one without. All bytes after the header are equal.
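A variation of this check that runs on a single interpreter is sketched below. Instead of comparing a patched and an unpatched build, it writes the same data to two archives in different directories, skips the gzip header of each, and compares the remaining bytes. The helper names gzip_body and build_archive and the directory names are assumptions made for illustration.

```python
import os
import tarfile
import tempfile


def gzip_body(data: bytes) -> bytes:
    """Return the bytes after the gzip member header (layout per RFC 1952)."""
    if data[:2] != b"\x1f\x8b":
        raise ValueError("not a gzip stream")
    flags = data[3]
    pos = 10  # fixed part: ID1 ID2 CM FLG MTIME(4 bytes) XFL OS
    if flags & 0x04:  # FEXTRA: 2-byte little-endian length, then payload
        pos += 2 + int.from_bytes(data[pos:pos + 2], "little")
    if flags & 0x08:  # FNAME: zero-terminated string
        pos = data.index(b"\x00", pos) + 1
    if flags & 0x10:  # FCOMMENT: zero-terminated string
        pos = data.index(b"\x00", pos) + 1
    if flags & 0x02:  # FHCRC: 2-byte header checksum
        pos += 2
    return data[pos:]


def build_archive(directory: str, member: str) -> bytes:
    """Create file.tar.gz in `directory` containing `member`; return its bytes."""
    path = os.path.join(directory, "file.tar.gz")
    with tarfile.open(path, "w|gz") as tar:
        tar.add(member, arcname="somefile.txt")
    with open(path, "rb") as f:
        return f.read()


with tempfile.TemporaryDirectory() as root:
    member = os.path.join(root, "somefile.txt")
    with open(member, "w") as f:
        f.write("sometext")

    # Two target directories with different path lengths (names are arbitrary).
    dir_a = os.path.join(root, "a")
    dir_b = os.path.join(root, "longer_directory_name")
    os.mkdir(dir_a)
    os.mkdir(dir_b)

    body_a = gzip_body(build_archive(dir_a, member))
    body_b = gzip_body(build_archive(dir_b, member))
    same_body = body_a == body_b

print(same_body)
```

Only the header depends on where the archive is written, so the comparison should hold whether or not the interpreter includes the fix.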
msg376515 - Author: miss-islington (miss-islington) Date: 2020-09-07 16:46
New changeset 22748a83d927d3da1beaed771be30887c42b2500 by Artem Bulgakov in branch 'master':
bpo-41316: Make tarfile follow specs for FNAME (GH-21511)
https://github.com/python/cpython/commit/22748a83d927d3da1beaed771be30887c42b2500
msg379194 - Author: miss-islington (miss-islington) Date: 2020-10-21 05:29
New changeset 7917170c5b4793ca9443f753aaecb8fbb3ad54ef by Miss Skeleton (bot) in branch '3.9':
bpo-41316: Make tarfile follow specs for FNAME (GH-21511)
https://github.com/python/cpython/commit/7917170c5b4793ca9443f753aaecb8fbb3ad54ef
msg379195 - Author: miss-islington (miss-islington) Date: 2020-10-21 05:29
New changeset e866f33a48ee24e447fafd181f0da5f9584e0340 by Miss Skeleton (bot) in branch '3.8':
bpo-41316: Make tarfile follow specs for FNAME (GH-21511)
https://github.com/python/cpython/commit/e866f33a48ee24e447fafd181f0da5f9584e0340
History
Date User Action Args
2020-10-21 08:10:09 methane set status: open -> closed
    stage: patch review -> resolved
    resolution: fixed
    versions: - Python 3.5, Python 3.6, Python 3.7
2020-10-21 05:29:53 miss-islington set messages: + msg379195
2020-10-21 05:29:09 miss-islington set messages: + msg379194
2020-09-07 16:46:57 miss-islington set pull_requests: + pull_request21225
2020-09-07 16:46:48 miss-islington set pull_requests: + pull_request21224
2020-09-07 16:46:40 miss-islington set nosy: + miss-islington
    messages: + msg376515
2020-07-17 07:23:42 ArtemSBulgakov set messages: + msg373798
2020-07-17 06:47:58 xtreak set nosy: + ethan.furman
2020-07-17 01:55:16 eamanu set nosy: + eamanu
    messages: + msg373790
2020-07-16 18:42:37 ArtemSBulgakov set keywords: + patch
    stage: patch review
    pull_requests: + pull_request20646
2020-07-16 18:37:23 ArtemSBulgakov create