classification
Title: zipfile: symlinks etc.
Type: enhancement
Stage: needs patch
Components: Library (Lib)
Versions: Python 3.5
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: jcea, ronaldoussoren, serhiy.storchaka, takluyver, xcombelle
Priority: normal
Keywords:

Created on 2013-07-30 07:45 by ronaldoussoren, last changed 2016-01-15 15:20 by takluyver.

Messages (5)
msg193915 - Author: Ronald Oussoren (ronaldoussoren) * (Python committer) Date: 2013-07-30 07:45
The ZIP file format (as described by the .ZIP file format specification) allows storing extra Unix data, such as symlinks and device nodes, in an archive.

Storing at least symlinks would be useful, and is also supported by the Info-ZIP tools (the command-line zip and unzip commands on Linux systems).

An implementation would use the "UNIX Extra Field (0x000d)" to store this information.
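
A minimal sketch of how an implementation might locate that extra field, assuming only the generic extra-field layout from the specification (a 2-byte header ID and a 2-byte data size, both little-endian, followed by the data). The helper name `iter_extra_fields` and the placeholder payload are illustrative assumptions; the payload below is not a real 0x000d record.

```python
import io
import struct
import zipfile

UNIX_EXTRA_ID = 0x000d  # header ID of the "UNIX Extra Field" named above


def iter_extra_fields(extra):
    """Yield (header_id, payload) pairs from a ZipInfo.extra byte string."""
    pos = 0
    while pos + 4 <= len(extra):
        header_id, size = struct.unpack_from("<HH", extra, pos)
        yield header_id, extra[pos + 4:pos + 4 + size]
        pos += 4 + size


# Build a tiny archive in memory with a hand-made extra field, then read it back.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    info = zipfile.ZipInfo("example.txt")
    payload = b"demo"  # placeholder bytes, NOT a real 0x000d layout
    info.extra = struct.pack("<HH", UNIX_EXTRA_ID, len(payload)) + payload
    zf.writestr(info, b"hello")

with zipfile.ZipFile(buf) as zf:
    fields = dict(iter_extra_fields(zf.getinfo("example.txt").extra))
```

Here `fields` maps each header ID found on the entry to its raw payload, so a real implementation would still need to decode the 0x000d payload according to the specification.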
msg194234 - Author: Jesús Cea Avión (jcea) * (Python committer) Date: 2013-08-03 03:41
Ronald, could you try to write a patch?
msg194240 - Author: Ronald Oussoren (ronaldoussoren) * (Python committer) Date: 2013-08-03 08:18
My initial plan was to add the patch soon after filing the issue, but that was before I noticed that this needs some API design to integrate nicely :-)

My current idea for the API:

* add "symlink(path, target)" to write a symlink

* add "readlink(path)" to read a symlink

* "read" will raise an exception when trying to read a symlink
  (alternative: do symlink resolving, but that's too magical for my taste)
 
* "extract" and "extractall" extract the symlink as a symlink
  (but I'm not sure yet what to do on systems that don't support symlinks)

* with the various file types it might be better to also provide
  "islink(name)", "isdir(name)" and "isfile(name)" methods (similar to
  their os.path equivalents)

This will also require some changes to the ZipInfo class.
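
A rough sketch of what "symlink" and "readlink" could look like on top of the existing zipfile API. It assumes the Info-ZIP convention of marking the entry with S_IFLNK in the Unix mode bits of `external_attr` and storing the link target as the entry's data, rather than the 0x000d extra field. The function names `write_symlink`, `is_symlink` and `read_symlink` are hypothetical and not part of zipfile.

```python
import io
import stat
import zipfile


def write_symlink(zf, path, target):
    """Store `path` as a symlink entry pointing at `target`."""
    info = zipfile.ZipInfo(path)
    info.create_system = 3  # 3 = Unix, so readers interpret the mode bits
    info.external_attr = (stat.S_IFLNK | 0o777) << 16  # file type + permissions
    zf.writestr(info, target)  # the link target becomes the entry's data


def is_symlink(info):
    """True if the entry's Unix mode bits mark it as a symlink."""
    return stat.S_ISLNK(info.external_attr >> 16)


def read_symlink(zf, path):
    """Return the stored link target; refuse entries that are not symlinks."""
    info = zf.getinfo(path)
    if not is_symlink(info):
        raise ValueError("%r is not a symlink entry" % path)
    return zf.read(info).decode("utf-8")


buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("data.txt", b"payload")
    write_symlink(zf, "latest", "data.txt")

with zipfile.ZipFile(buf) as zf:
    target = read_symlink(zf, "latest")
    link_flags = {i.filename: is_symlink(i) for i in zf.infolist()}
```

Note that the plain `zf.read("latest")` still returns the target path as bytes here; making "read" raise for symlinks, as proposed above, would need a change inside zipfile itself.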

I'm not sure yet whether to add support for device files and other Unix attributes (UID/GID).
msg194264 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-08-03 14:30
> * "read" will raise an exception when trying to read a symlink
>  (alternative: do symlink resolving, but that's too magical for my taste)

And perhaps when trying to read a directory entry too.

> * "extract" and "extractall" extract the symlink as a symlink
>  (but I'm not sure yet what to do on systems that don't support symlinks)

What does the tarfile module do?

> * with the various file types it might be better to also provide
>  "islink(name)", "isdir(name)" and "isfile(name)" methods (similar to
>  their os.path equivalents)

Or rather as methods of the ZipInfo object. See TarInfo.
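
To illustrate the TarInfo-style alternative, here is a small sketch of type predicates attached to an entry rather than to the archive. Since ZipFile creates plain ZipInfo objects, the sketch uses a hypothetical wrapper class `EntryKind`; the method names mirror TarInfo's isfile(), isdir() and issym(). It assumes the writer stored Unix mode bits in `external_attr` and that directory names end in "/".

```python
import io
import stat
import zipfile


class EntryKind:
    """Hypothetical wrapper giving a ZipInfo TarInfo-style type predicates."""

    def __init__(self, info):
        self.info = info
        self.mode = info.external_attr >> 16  # Unix st_mode, if the writer set one

    def issym(self):
        return stat.S_ISLNK(self.mode)

    def isdir(self):
        # Directory entries conventionally have names ending in "/".
        return self.info.filename.endswith("/")

    def isfile(self):
        return not self.isdir() and not self.issym()


buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("pkg/", b"")                # directory entry
    zf.writestr("pkg/mod.py", b"x = 1\n")   # regular file
    link = zipfile.ZipInfo("pkg/alias")     # symlink entry (mode-bit convention)
    link.create_system = 3
    link.external_attr = (stat.S_IFLNK | 0o777) << 16
    zf.writestr(link, "mod.py")

with zipfile.ZipFile(buf) as zf:
    kinds = {}
    for info in zf.infolist():
        kind = EntryKind(info)
        kinds[info.filename] = (kind.isfile(), kind.isdir(), kind.issym())
```

Each value in `kinds` is an (isfile, isdir, issym) tuple for the entry, which is the information the proposed "islink"/"isdir"/"isfile" methods would expose.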
msg252482 - Author: Xavier Combelle (xcombelle) * Date: 2015-10-07 18:35
Regarding the readlink functionality, I would like to point out that it might lead to security issues; see for example https://security.stackexchange.com/questions/73718/how-zip-symlink-works

At the very least, the standard read should not resolve symlinks by default.
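
One possible mitigation, sketched under assumptions: before extracting a symlink entry, check that its target resolves to a location inside the extraction directory. The function name `link_stays_inside` and its parameters are hypothetical, and the check covers only the link target, not other attack paths such as ".." components in member names.

```python
import os


def link_stays_inside(dest_root, member_path, target):
    """Return True if a symlink at member_path -> target stays under dest_root."""
    root = os.path.realpath(dest_root)
    # Relative targets are resolved against the directory holding the link;
    # an absolute target replaces that directory entirely in os.path.join.
    link_dir = os.path.dirname(os.path.join(root, member_path))
    resolved = os.path.realpath(os.path.join(link_dir, target))
    try:
        return os.path.commonpath([root, resolved]) == root
    except ValueError:  # e.g. paths on different drives on Windows
        return False


demo_root = "/tmp/unzip-demo"  # illustrative path; it does not need to exist
inside = link_stays_inside(demo_root, "docs/latest", "../README")
escapes_relative = link_stays_inside(demo_root, "docs/latest", "../../etc/passwd")
escapes_absolute = link_stays_inside(demo_root, "latest", "/etc/passwd")
```

An extractor could skip or reject any symlink entry for which this check returns False.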
History
Date User Action Args
2016-01-15 15:20:56 takluyver set nosy: + takluyver
2015-10-07 18:35:56 xcombelle set nosy: + xcombelle; messages: + msg252482
2014-08-19 17:37:14 serhiy.storchaka set versions: + Python 3.5, - Python 3.4
2013-08-03 14:30:58 serhiy.storchaka set nosy: + serhiy.storchaka; messages: + msg194264
2013-08-03 08:18:14 ronaldoussoren set messages: + msg194240
2013-08-03 03:41:06 jcea set nosy: + jcea; messages: + msg194234; stage: test needed -> needs patch
2013-07-30 07:45:18 ronaldoussoren create