|
msg92154 - (view) |
Author: Ross (rossmclendon) |
Date: 2009-09-02 04:42 |
It would be most helpful if a method could be included in the TarFile
class of the tarfile module and the ZipFile class of the zipfile module
that would remove a particular file (either given by a name or a
TarInfo/ZipInfo object) from the archive.
Usage to remove a single file from an archive would be as follows:
import zipfile
zipFileObject = zipfile.ZipFile(archiveName,'a')
zipFileObject.remove(fileToRemove)
zipFileObject.close()
Such a method should probably only apply to archives that are in append
mode as write mode would erase the original archive and read mode should
render the archive immutable.
One possible extra to be included is to allow a list of file names or
ZipInfo/TarInfo objects to be passed into the remove method, such that
all items in the list would be removed from the archive.
|
|
msg92155 - (view) |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2009-09-02 05:20 |
+1
|
|
msg92156 - (view) |
Author: Ross (rossmclendon) |
Date: 2009-09-02 05:42 |
Slight change to:
"Such a method should probably only apply to archives that are in append
mode as write mode would erase the original archive and read mode should
render the archive immutable."
The method should probably still apply to an archive in write mode. It
is conceivable that one may need to delete a file from the archive after
it has been written but before the archive object has been closed.
|
|
msg92158 - (view) |
Author: Martin v. Löwis (loewis) *  |
Date: 2009-09-02 07:00 |
-1. I don't think this can be implemented in a reasonable way, assuming
that you want the file to become smaller as a consequence of removal.
|
|
msg92164 - (view) |
Author: Lars Gustäbel (lars.gustaebel) *  |
Date: 2009-09-02 12:11 |
-1, although I can only speak for tarfile. Removing members from a tar
archive sounds obvious and easy but it is *not*. A file in an archive is
stored as a header block (that contains the metadata) followed by a
number of data blocks (that contain the file's data). New files are
simply appended to the archive file. There is no central table of
contents whatsoever. To make things worse, a compressed archive is
compressed in one go from the beginning right up to the end, so it is not
possible to access a member in the middle of an archive without having
to decompress all data before it.
Deleting files from an uncompressed archive is rather straightforward
implementation-wise but IO intensive and risky. In contrast, there is no
other way to delete files from a *compressed* tarfile than to make a
copy of it omitting the unwanted files.
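The copy-and-omit approach described above can be sketched as follows. This is only an illustration, not part of any submitted patch: the function name `copy_tar_without` and its parameters are hypothetical, and the sketch assumes the copy should be gzip-compressed regardless of how the original was compressed.

```python
import tarfile

def copy_tar_without(src_path, dst_path, names_to_drop):
    """Write a copy of src_path to dst_path, skipping the named members.

    Hypothetical helper for illustration; the output is always gzip-compressed.
    """
    names_to_drop = set(names_to_drop)
    # "r:*" lets tarfile detect the source compression transparently.
    with tarfile.open(src_path, "r:*") as src, tarfile.open(dst_path, "w:gz") as dst:
        for member in src:
            if member.name in names_to_drop:
                continue  # omit the unwanted member from the copy
            # Only regular files carry data blocks; directories, links and
            # other special members are copied as header-only entries.
            fileobj = src.extractfile(member) if member.isreg() else None
            dst.addfile(member, fileobj)
```

Every remaining member is decompressed and recompressed, which is the IO cost mentioned above; the original archive is left untouched until the caller decides to replace it.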
|
|
msg92172 - (view) |
Author: Ross (rossmclendon) |
Date: 2009-09-02 16:40 |
In light of Lars's comment on the matter, perhaps this functionality
could be added to zip files only. Surely it can be done, considering
that numerous utilities and even Windows Explorer provide such
functionality. I must confess that I am unfamiliar with the inner
workings of file archives and compression, but seeing as it is
implemented in a number of places already, it seems logical that it
could be implemented in ZipFile as well. I'll spend some time the next
few days educating myself about zip files and how this might be
accomplished.
|
|
msg92177 - (view) |
Author: Martin v. Löwis (loewis) *  |
Date: 2009-09-02 18:54 |
> In light of Lars's comment on the matter, perhaps this functionality
> could be added to zip files only. Surely it can be done, considering
> that numerous utilities and even Windows Explorer provide such
> functionality.
Are you sure they are not creating a new file in order to delete
content? I recall that early zip tools (e.g. pkzip) had a mode
where they would merely delete the entry from the directory, but
leave the actual data in the file. Would you consider that a correct
implementation? If so, *that* can be done, for zipfiles, AFAIU.
> I'll spend some time the next
> few days educating myself about zip files and how this might be
> accomplished.
Please do - you'll find that deletion from zipfiles comes in a can
full of worms.
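For reference, the "create a new file" approach mentioned above can be sketched with the existing zipfile API. This is a minimal illustration under stated assumptions, not a proposed implementation: `copy_zip_without` and its parameters are hypothetical names, and each kept member is decompressed and recompressed rather than copied as raw bytes.

```python
import zipfile

def copy_zip_without(src_path, dst_path, names_to_drop):
    """Write a copy of the archive at src_path to dst_path without the named members.

    Hypothetical helper for illustration only.
    """
    names_to_drop = set(names_to_drop)
    with zipfile.ZipFile(src_path, "r") as src, zipfile.ZipFile(dst_path, "w") as dst:
        for info in src.infolist():
            if info.filename in names_to_drop:
                continue  # skip the member being "removed"
            # Passing the ZipInfo keeps the member's name, timestamp and
            # compression type; the data itself is re-encoded on write.
            dst.writestr(info, src.read(info))
```

The resulting file is smaller, but only because a whole new archive was written; nothing is deleted in place.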
|
|
msg94475 - (view) |
Author: victorlee129 (victorlee129) |
Date: 2009-10-26 07:15 |
I did it in a very *violent* way.
Is it OK for you, though?
If so, would anybody please merge it into the lib?
|
|
msg94478 - (view) |
Author: Martin v. Löwis (loewis) *  |
Date: 2009-10-26 07:58 |
> I did it in a very *violent* way.
> Is it OK for you, though?
In the form in which you have done it, it is clearly
unacceptable for inclusion in the library: we don't
want to add two modules "delete" and "classtools".
In addition, notice that code is for tarfile, whereas
the OP was asking for a similar feature for zipfile.
> If so, would anybody please merge it into the lib?
This is not how this works. If you want us to take
action, please submit a complete and correct patch.
|
|
msg109158 - (view) |
Author: Troy Potts (chroipahtz) |
Date: 2010-07-03 05:26 |
I have attempted to implement a ZipFile.remove function. It seems to work fine. I have submitted a patch.
The method of implementation is: find the file's index in the file list, then sum the lengths of the file entries before it to find its location in the archive. Then simply read in all the bytes after it, write them out at that location, and truncate the file x bytes shorter, where x is the length of the record. This works because the directory listing is created when the file is closed, so there's no harm in truncating.
I've also made it truncate the zip file after reading in the existing files upon creation, because the directory information is not used after this point.
This could use some testing on large files.
This is my first patch, so let me know if I've done anything wrong.
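The shift-and-truncate step described above can be illustrated on a plain file. This is a rough sketch and not the code from the patch: `remove_byte_range`, `start`, `length` and `chunk_size` are hypothetical names, and the tail is moved in chunks instead of being read into memory all at once.

```python
def remove_byte_range(path, start, length, chunk_size=64 * 1024):
    """Delete `length` bytes at offset `start` by shifting the tail down.

    Hypothetical helper: works on raw bytes and knows nothing about ZIP
    structures.
    """
    with open(path, "r+b") as f:
        read_pos = start + length   # first byte that must survive
        write_pos = start           # where that byte should end up
        while True:
            f.seek(read_pos)
            chunk = f.read(chunk_size)
            if not chunk:
                break
            f.seek(write_pos)
            f.write(chunk)
            read_pos += len(chunk)
            write_pos += len(chunk)
        # Everything past write_pos is now stale; cut it off.
        f.truncate(write_pos)
```

In an actual archive, the recorded offsets of the entries that follow the removed record would presumably also have to be lowered by `length` before the directory is written out on close.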
|
|
msg109160 - (view) |
Author: Troy Potts (chroipahtz) |
Date: 2010-07-03 05:47 |
My patch had some bugs; I'll need to do some more testing. Sorry about that.
|
|
msg130299 - (view) |
Author: Yuval Greenfield (ubershmekel) * |
Date: 2011-03-08 00:10 |
What's the status with this patch? If nobody's looking at it I can try to see if it works and write the test and documentation for it.
|
|
msg130388 - (view) |
Author: Terry J. Reedy (terry.reedy) *  |
Date: 2011-03-09 00:21 |
Please feel free to test, revise, and write. Though 'removed', the file is still accessible via the history list. (Click 'zipfile_remove.patch' and then 'download'.)
|
|
msg130463 - (view) |
Author: Yuval Greenfield (ubershmekel) * |
Date: 2011-03-09 20:54 |
I fixed the bugs I found, added tests and documentation. What do you guys think?
|
|
msg130938 - (view) |
Author: Yuval Greenfield (ubershmekel) * |
Date: 2011-03-15 01:08 |
Fixed the bugs Martin pointed out and added the relevant tests. Sadly I had to move some stuff around, but I think the changes are all for the better. I wasn't sure about the right convention for the 2 constants I added btw.
|
|
msg140680 - (view) |
Author: Éric Araujo (eric.araujo) *  |
Date: 2011-07-19 16:07 |
Martin did a review of the newer patch; maybe you didn’t get the mail (there’s a Rietveld bug when a user name without email is given to the Cc field).
|
|
msg159418 - (view) |
Author: Yuval Greenfield (ubershmekel) * |
Date: 2012-04-26 19:44 |
I'm not sure I understand how http://bugs.python.org/review/6818/show works. I've looked all over and only found remarks for "zipfile.remove.patch" and not for "zipfile.remove.2.patch" which addressed all the aforementioned issues.
Also, I don't understand how to add myself to the CC of this issue's review page.
|
|
msg160533 - (view) |
Author: Martin v. Löwis (loewis) *  |
Date: 2012-05-13 17:34 |
Yuval, can you please submit a contributor agreement? See
http://www.python.org/psf/contrib/
|
|
msg160534 - (view) |
Author: Martin v. Löwis (loewis) *  |
Date: 2012-05-13 17:36 |
As for adding yourself to the CC list: notice the string "ubershmekel" appearing in the "CC" field of http://bugs.python.org/review/6818/show. It means that you are already on the CC list.
|
|
| Date | User | Action | Args |
| 2013-04-11 14:39:21 | Arthur.Darcet | set | nosy: + Arthur.Darcet |
| 2012-05-13 17:36:56 | loewis | set | messages: + msg160534 |
| 2012-05-13 17:34:44 | loewis | set | messages: + msg160533 |
| 2012-04-26 19:44:56 | ubershmekel | set | messages: + msg159418 |
| 2011-07-19 16:07:53 | eric.araujo | set | messages: + msg140680 |
| 2011-03-19 05:07:47 | terry.reedy | link | issue11415 superseder |
| 2011-03-15 01:08:09 | ubershmekel | set | files: + zipfile.remove.2.patch; nosy: loewis, rhettinger, terry.reedy, lars.gustaebel, rossmclendon, eric.araujo, ubershmekel, victorlee129, sandro.tosi, chroipahtz; messages: + msg130938 |
| 2011-03-09 21:10:18 | eric.araujo | set | nosy: + eric.araujo |
| 2011-03-09 20:54:29 | ubershmekel | set | files: + zipfile.remove.patch; messages: + msg130463; keywords: + patch; nosy: loewis, rhettinger, terry.reedy, lars.gustaebel, rossmclendon, ubershmekel, victorlee129, sandro.tosi, chroipahtz |
| 2011-03-09 00:21:46 | terry.reedy | set | nosy: + terry.reedy; messages: + msg130388 |
| 2011-03-08 00:10:01 | ubershmekel | set | nosy: + ubershmekel; messages: + msg130299; versions: + Python 3.3, - Python 3.1 |
| 2011-02-02 21:02:50 | sandro.tosi | set | keywords: - patch; nosy: loewis, rhettinger, lars.gustaebel, rossmclendon, victorlee129, sandro.tosi, chroipahtz |
| 2011-02-02 21:02:37 | sandro.tosi | set | nosy: + sandro.tosi |
| 2010-07-03 05:47:17 | chroipahtz | set | messages: + msg109160 |
| 2010-07-03 05:46:48 | chroipahtz | set | files: - zipfile_remove.patch |
| 2010-07-03 05:26:41 | chroipahtz | set | files: + zipfile_remove.patch; nosy: + chroipahtz; messages: + msg109158; keywords: + patch |
| 2009-10-26 07:58:57 | loewis | set | messages: + msg94478 |
| 2009-10-26 07:15:10 | victorlee129 | set | files: + delete.tar.gz; versions: + Python 3.1, - Python 3.2; nosy: + victorlee129; messages: + msg94475; components: - IO |
| 2009-09-02 18:54:26 | loewis | set | messages: + msg92177 |
| 2009-09-02 16:40:04 | rossmclendon | set | messages: + msg92172 |
| 2009-09-02 12:11:12 | lars.gustaebel | set | nosy: + lars.gustaebel; messages: + msg92164 |
| 2009-09-02 07:00:17 | loewis | set | nosy: + loewis; messages: + msg92158 |
| 2009-09-02 05:42:58 | rossmclendon | set | messages: + msg92156 |
| 2009-09-02 05:20:39 | rhettinger | set | nosy: + rhettinger; messages: + msg92155 |
| 2009-09-02 04:43:08 | rossmclendon | set | components: + IO |
| 2009-09-02 04:42:10 | rossmclendon | create | |