Title: zipfile.writestr doesn't set external attributes, so files are extracted mode 000 on Unix
Components: Extension Modules
Versions: Python 2.6, Python 2.5
Nosy List: cbrannon, dkbg, mark, pitrou, swarren
Priority: high

writestr_usable_permissions.diff cbrannon, 2008-07-18 06:15
msg69898 - (view) Author: Stephen Warren (swarren) Date: 2008-07-17 19:01
Run the following Python script, on Unix/Linux:

import zipfile

z = zipfile.ZipFile('', 'w')
z.writestr('filebad.txt', 'Some content')

z = zipfile.ZipFile('', 'w')
zi = zipfile.ZipInfo('filegood.txt')
zi.external_attr = 0660 << 16L
z.writestr(zi, 'Some content')

Like this:

python  && unzip && unzip && ls -l

You'll see:

----------  1 swarren swarren   12 2008-07-17 12:54 filebad.txt
-rw-rw----  1 swarren swarren   12 1980-01-01 00:00 filegood.txt

Note that filebad.txt is extracted with mode 000.

The workaround (used for filegood.txt) is to pass writestr a ZipInfo
instance with external_attr pre-initialized. However, writestr should
perform this assignment itself, to be consistent with write. I haven't
checked, but there's probably a bunch of other stuff in write that
writestr should do too.
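
For readers on a current interpreter, the workaround described above can be sketched roughly as follows. This is only an illustration in Python 3 syntax (0o660 rather than 0660), and it writes to an in-memory io.BytesIO buffer, which is an assumption made for the example and not part of the original script.

```python
import io
import zipfile

# In-memory archive, used here only so the sketch needs no file on disk.
buf = io.BytesIO()

with zipfile.ZipFile(buf, "w") as z:
    zi = zipfile.ZipInfo("filegood.txt")
    # The Unix permission bits are stored in the high 16 bits of external_attr.
    zi.external_attr = 0o660 << 16
    z.writestr(zi, "Some content")

# Read the archive back and recover the permission bits of the member.
with zipfile.ZipFile(buf) as z:
    mode = (z.getinfo("filegood.txt").external_attr >> 16) & 0o777

print(oct(mode))
```

Reading the value back this way only confirms what was stored in the archive; how an extraction tool applies it is up to that tool.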
msg69899 - (view) Author: Stephen Warren (swarren) Date: 2008-07-17 19:02
Oops. Forgot to set "type" field.
msg69922 - (view) Author: Christopher Brannon (cbrannon) Date: 2008-07-17 22:46
What value should the new archive entry's external_attr attribute have?
ZipFile.write sets it to the permissions of the file being archived, but
writestr is archiving a string, not a file.  Setting it to 0600 << 16
seems reasonable.
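
The shift by 16 in the value proposed above may be easier to follow as plain arithmetic. A minimal sketch in Python 3 syntax; the variable names are made up for illustration:

```python
# Proposed default permissions: read/write for the owner only.
mode = 0o600

# external_attr keeps the Unix mode in its high 16 bits.
external_attr = mode << 16
print(hex(external_attr))

# Shifting back down and masking recovers the permission bits.
recovered = (external_attr >> 16) & 0o777
print(oct(recovered))
```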

Stephen's script exposed a second bug in writestr.  When passed a name
rather than a ZipInfo instance, the new archive member receives a timestamp
of 01/01/1980.  However, the docs say that the timestamp should correspond to
the current date and time.
ZipFile.writestr begins with the following code:

    def writestr(self, zinfo_or_arcname, bytes):
        """Write a file into the archive.  The contents is the string
        'bytes'.  'zinfo_or_arcname' is either a ZipInfo instance or
        the name of the file in the archive."""
        if not isinstance(zinfo_or_arcname, ZipInfo):
            zinfo = ZipInfo(filename=zinfo_or_arcname,
            zinfo.compress_type = self.compression

The "date_time=" line should read:

msg69932 - (view) Author: Stephen Warren (swarren) Date: 2008-07-18 03:57
I'd probably argue for at least 0660<<16, if not 0666<<16, since group
permissions are pretty typically set, but even 0666<<16 would be OK,
since the umask on extraction would take away any permissions the
extracting user didn't want.

But, as long as the chosen mask includes at least 0600, I'd consider the
issue fixed.
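
The umask argument above comes down to a bit mask. A small sketch of that arithmetic, assuming the extracting tool applies the process umask to the stored mode (effective_mode is a hypothetical helper, not a zipfile function):

```python
def effective_mode(stored_mode, umask):
    """Return the permission bits left after the umask is applied.

    Hypothetical helper for illustration; it assumes the extractor
    clears every bit that is set in the umask.
    """
    return stored_mode & ~umask & 0o777


for stored in (0o660, 0o666):
    for umask in (0o022, 0o077):
        print(oct(stored), oct(umask), oct(effective_mode(stored, umask)))
```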
msg69937 - (view) Author: Christopher Brannon (cbrannon) Date: 2008-07-18 06:15
Here is a patch containing code and a unit test.  I set external_attr
to 0600 << 16, for the following reason.
When I extract with Infozip, my umask is ignored when setting permissions of
extracted entries.  They have the permissions assigned to them when archived.
tar does respect umask, but it's not pertinent.
The following shell script demonstrates Infozip's behavior:

mkdir ziptest_dir
echo hello > ziptest_dir/foo.txt
chmod 666 ziptest_dir/foo.txt
zip -r ziptest_dir/
rm -rf ziptest_dir
umask 077

Setting permissions to 0600 seems like the safest course.

I'm not sure if this patch should be accompanied by some documentation,
since the zipfile docs don't say much about external_attr or permissions.

PS.  My earlier comments about timestamps were incorrect and spurious!
msg70245 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-07-25 09:57
Agree with using 0600 as default permissions.
msg70271 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-07-25 19:44
Committed in r65235. Thanks!
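
A quick way to check the resulting default on a current interpreter might look like the sketch below. It uses Python 3 syntax and an in-memory io.BytesIO buffer, both assumptions made for the example; the member name is taken from the original report.

```python
import io
import zipfile

# In-memory archive, so the check does not touch the filesystem.
buf = io.BytesIO()

with zipfile.ZipFile(buf, "w") as z:
    # Pass a plain name instead of a ZipInfo, as in the original report.
    z.writestr("filebad.txt", "Some content")

with zipfile.ZipFile(buf) as z:
    info = z.getinfo("filebad.txt")
    default_mode = (info.external_attr >> 16) & 0o777

print(oct(default_mode))
```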
