Author markgrandi
Recipients markgrandi
Date 2014-08-15.23:29:17
Message-id <1408145357.61.0.947327827691.issue22208@psf.upfronthosting.co.za>
In-reply-to
Content
So I ran into this problem today: it is nearly impossible to create a tarfile.TarFile object and then add files to the archive when the files are in-memory file-like objects (io.BytesIO, io.StringIO, etc.).

code example:

###################
import tarfile, io

tarFileIo = io.BytesIO()

tarFileObj = tarfile.open(fileobj=tarFileIo, mode="w:xz")

fileToAdd = io.BytesIO("hello world!".encode("utf-8"))

# fixes "AttributeError: '_io.BytesIO' object has no attribute 'name'"
fileToAdd.name="helloworld.txt"

# fails with 'io.UnsupportedOperation: fileno'
tarInfo = tarFileObj.gettarinfo(arcname="helloworld.txt", fileobj=fileToAdd)

# never runs
tarFileObj.addfile(tarInfo, fileobj=fileToAdd)
###################

This was previously reported as http://bugs.python.org/issue10369, but I am unhappy with the resolution of "not a bug" and with the 'hack' that Lars posted as a solution. My reasons:

1: The zipfile module supports writing in-memory files / bytes, using the following code (which is weird, but it works):

import io, zipfile

tmp = zipfile.ZipFile("tmp.zip", mode="w")
x = io.BytesIO("hello world!".encode("utf-8"))
# writestr() takes the archive member name and the data to store
tmp.writestr("helloworld.txt", x.getbuffer())
tmp.close()

2: The 'hack' that Lars posted works, but it is unintuitive and confusing, and it isn't the intended behavior. What happens if your script is cross-platform? Which file do you open to give to os.stat()? The code posted uses open('/etc/passwd/') for the fileobj parameter to gettarinfo(), but that file doesn't exist on Windows. So not only are you doing this silly hack, you also need code that checks platform.system() to find a file that is known to exist on every system. Alternatively you could use sys.executable, except that its documentation says it can be None or an empty string.

3: It is easy to fix (at least it seems so to me). In TarFile.gettarinfo(), if fileobj is passed in and it doesn't have a fileno, create the TarInfo object by setting 'name' to the arcname parameter and size to the length of the data in fileobj, then use default values for uid/gid/uname/gname (possibly overridable through keyword args to gettarinfo()).
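
A minimal sketch of the behavior described in point 3, written as a standalone function rather than a change to gettarinfo(). The helper name tarinfo_from_memory and its **overrides parameter are hypothetical and not part of tarfile. It assumes a seekable binary file object and uses only tarfile.TarInfo and TarFile.addfile(), which already exist:

```python
import io
import tarfile
import time


def tarinfo_from_memory(fileobj, arcname, **overrides):
    """Hypothetical helper: build a TarInfo for a seekable in-memory binary file."""
    info = tarfile.TarInfo(name=arcname)
    # Measure the remaining bytes by seeking to the end, then restore the position.
    start = fileobj.tell()
    info.size = fileobj.seek(0, io.SEEK_END) - start
    fileobj.seek(start)
    info.mtime = int(time.time())
    # Let the caller override defaults such as uid, gid, uname, gname or mode.
    for key, value in overrides.items():
        setattr(info, key, value)
    return info


archive_buffer = io.BytesIO()
payload = io.BytesIO("hello world!".encode("utf-8"))

# Write the archive entirely in memory; no fileno() or os.stat() is involved.
with tarfile.open(fileobj=archive_buffer, mode="w:xz") as tar:
    info = tarinfo_from_memory(payload, "helloworld.txt", uname="nobody")
    tar.addfile(info, fileobj=payload)

# Read it back to confirm the member round-trips.
archive_buffer.seek(0)
with tarfile.open(fileobj=archive_buffer, mode="r:xz") as tar:
    member = tar.getmember("helloworld.txt")
    extracted = tar.extractfile(member).read()
```

The size is taken from the stream position rather than len(), since io.BytesIO objects do not support len(). An io.StringIO would have to be encoded to bytes first.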

On a random tar.gz file that I downloaded from sourceforge, the uid/gid are '500' (when my gid is 20 and uid is 501), and the gname/uname are just empty strings. So it's obvious that those don't matter most of the time, and when they do matter, you can modify the TarInfo object after creation or pass in values for them through a theoretical keyword argument to gettarinfo().
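
To illustrate that the ownership fields can simply be set on the TarInfo object after creation, here is a small sketch that writes one member to an in-memory archive and reads the fields back. The value 500 only mirrors the archive mentioned above, and the variable names are made up for this example:

```python
import io
import tarfile

data = b"hello world!"
buffer = io.BytesIO()

with tarfile.open(fileobj=buffer, mode="w") as tar:
    info = tarfile.TarInfo(name="helloworld.txt")
    info.size = len(data)
    # Ownership is overridden after creation; uname/gname keep their defaults.
    info.uid, info.gid = 500, 500
    tar.addfile(info, fileobj=io.BytesIO(data))

# Re-open the archive and list the ownership fields of every member.
buffer.seek(0)
with tarfile.open(fileobj=buffer, mode="r") as tar:
    ownership = [(m.name, m.uid, m.gid, m.uname, m.gname) for m in tar.getmembers()]
```

The same loop over getmembers() can be pointed at any downloaded archive to check what its uid/gid/uname/gname fields actually contain.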

If no one wants to code this, I can provide a patch. I just want the previous bug report's status of "not a bug" to be reconsidered.
History
Date User Action Args
2014-08-15 23:29:17 markgrandi set recipients: + markgrandi
2014-08-15 23:29:17 markgrandi set messageid: <1408145357.61.0.947327827691.issue22208@psf.upfronthosting.co.za>
2014-08-15 23:29:17 markgrandi link issue22208 messages
2014-08-15 23:29:17 markgrandi create