This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: [PATCH] zipfile.ZipFile does not extract directories properly
Type: behavior Stage:
Components: Library (Lib) Versions: Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: benjamin.peterson, faw, ggenellina, koen, loewis
Priority: release blocker Keywords: needs review, patch

Created on 2008-12-21 13:10 by faw, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
zipfile_extract_dirs.patch faw, 2008-12-21 13:10 patch
zipfile_dirs_support.patch faw, 2008-12-23 17:10 patch r2
dir.diff loewis, 2009-01-05 22:39 r3
zipdir.zip loewis, 2009-01-05 22:40 Extraction test data (goes into Lib/test)
Messages (7)
msg78143 - (view) Author: Kuba Wieczorek (faw) Date: 2008-12-21 13:10
This behaviour has been known of course for quite long time. I suppose
this is not intentional so I've played a bit with this and I hope you'll
consider some little change.

Currently, if a ZIP archive contains some subdirectories then
zipfile.ZipFile.extract()/extractall() will create files instead of
directories in the target location. This of course will lead some
scripts to crash (unless any work-around has been done) because files
from the subdirectories couldn't be created.

Attached is a patch against current 2.7 tree. Applied, will make
extractall() extract properly all the contents of the archive with
proper tree structure. If a directory name is passed to the extract(),
it will only create the directory itself without the contents (I guess
it is obvious).
msg78158 - (view) Author: Koen van de Sande (koen) Date: 2008-12-21 22:33
I'm no expert, but is it possible for ZIP files to have Windows-style 
path seperators ('\') as well?
And is this new behavior desirable for existing code as well? It might 
break existing applications, so perhaps a new extractrecursive() 
function is more intuitive.
Finally, I've noticed that the patch does not add any tests to test for 
the new/old behavior.
msg78160 - (view) Author: Kuba Wieczorek (faw) Date: 2008-12-21 23:03
I'm not sure if it would make sense to save current
extract()/extractall() behaviour and implement new with another function
because current one is simply faulty. And if it's about BC breaks then
well... it may be introduced in 3.0 line, I think that leaving faulty
behaviour and introducing another with another function misses the point.

AFAIK there'd be no problem with Windows-style paths but to be sure I
revised my patch.

Yes, I'm sorry for the lack of tests, it is my first patch and I forgot
about this. I'll write them in a couple of hours.
msg78233 - (view) Author: Kuba Wieczorek (faw) Date: 2008-12-23 13:08
Here is a revised version of the patch with directories support for
write(). It works like UNIX zip program so if a directory path is passed
to the function, it stores only the directory itself (without the contents).

This time without tests as well because I don't want to mess up with
them until I finish my patch completely.
msg78341 - (view) Author: Gabriel Genellina (ggenellina) Date: 2008-12-27 05:33
Your usage of os.sep is incorrect, both when reading and writing 
directories.

Zip files are (more-or-less) platform independent. The specification 
*requires* forward slashes in paths [1], and the zipfile module 
already writes them that way. Checking for os.sep is wrong - at least 
on Windows. 

I've never encountered malformed entries of that kind (like "directory
\" instead of "directory/") but if you want to suport such beasts, 
check for "/" *and* os.sep explicitely.

[1] See APPNOTE.TXT (there is a link near the top of zipfile.py)
msg79204 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2009-01-05 22:39
Here is a version of the patch that rationalizes usage of os.sep, fixes
a few bugs with r2, and adds a test case.
msg80450 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2009-01-24 14:18
Committed as r68885, r68886, r68887, and r68888
History
Date User Action Args
2022-04-11 14:56:43adminsetnosy: + benjamin.peterson
github: 48960
2009-01-24 14:18:42loewissetstatus: open -> closed
resolution: fixed
messages: + msg80450
2009-01-05 22:40:36loewissetfiles: + zipdir.zip
2009-01-05 22:39:39loewissetkeywords: + needs review
files: + dir.diff
messages: + msg79204
nosy: + loewis
2008-12-30 12:15:10loewissetpriority: release blocker
2008-12-27 05:33:16ggenellinasetnosy: + ggenellina
messages: + msg78341
2008-12-25 16:05:02fawsetfiles: - zipfile_extract_dirs_r2.patch
2008-12-23 17:10:32fawsetfiles: + zipfile_dirs_support.patch
2008-12-23 17:09:08fawsetfiles: - zipfile_dirs_support.patch
2008-12-23 13:08:43fawsetfiles: + zipfile_dirs_support.patch
messages: + msg78233
2008-12-21 23:03:28fawsetfiles: + zipfile_extract_dirs_r2.patch
messages: + msg78160
2008-12-21 22:33:42koensetnosy: + koen
messages: + msg78158
2008-12-21 13:10:10fawcreate