Author crhode
Recipients
Date 2005-09-22.15:56:58
Content
Logged In: YES 
user_id=988879

I've been trying to read map files put out by the Census
Bureau.  These ZIP archives are downloaded from government
contractors' sites by county.  Within each county archive
are several ZIP files for each map layer (roads, streams,
waterbodies, etc.).  Each contains the elements of an ESRI
shapefile database (.shp, .shx, and .dbf files).  This
doesn't make a lot of sense to me, either, because there's
no compression advantage to making an archive of an archive.
The technique is used purely for organizational purposes
because ZIP does not compress subdirectories.

Note: I've never seen a TAR of TAR files because TAR *does*
compress subdirectories.

What I've been struggling with is a way to leave these
archives in their compressed form and still do *python* I/O
on them.  There is a tree organization to them, after all,
just as with traditional os.path directories.  I've designed
some objects that let me retrieve the most recent file, ZIP
member, or TAR member by name from a given path to a
repository of such archives.  What I get is a StreamIO
object that I can subsequently put back where it came from.
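
For illustration only, here is a minimal sketch of pulling one member
out of a ZIP stored inside another ZIP without unpacking anything to
disk.  It is not the objects described above, just one assumed approach
using the standard zipfile and io modules; the function name
read_nested_member and the archive and member names are hypothetical.

```python
import io
import zipfile


def read_nested_member(outer, inner_name, member_name):
    """Return an in-memory stream holding one member of a nested ZIP.

    outer       -- path or seekable binary file object of the outer ZIP
    inner_name  -- name of the inner ZIP as stored in the outer ZIP
    member_name -- name of the wanted file inside the inner ZIP
    """
    # Read the whole inner archive into memory; zipfile needs a
    # seekable source, and io.BytesIO provides one.
    with zipfile.ZipFile(outer) as county:
        inner_bytes = county.read(inner_name)

    # Open the inner archive from memory and copy the member out.
    with zipfile.ZipFile(io.BytesIO(inner_bytes)) as layer:
        return io.BytesIO(layer.read(member_name))
```

Usage would look something like
read_nested_member("county.zip", "roads.zip", "roads.dbf").read(),
with the caveat that everything is held in memory, which may not suit
very large layers.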

What would be nice is if there already were objects
available to manipulate normal os.path directories commingled
with ZIP and TAR archives.  What would be nicer is if I/O
could be opened at the character/line level transparently
without regard to whether the source/destination was a file
or an archive member within such a structure.  In the days
of hardware compression and on-the-fly encryption/decryption
of I/O, is this too much to ask?  -ccr-
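
As a rough sketch of what such transparent line-level reading might
look like (read-only, and entirely in memory), the following walks
from a real file down through any number of ZIP or TAR layers and
hands back an ordinary text stream.  The names open_text and _extract
are hypothetical, and the choice to try TAR first and fall back to ZIP
is an assumption of this sketch, not an established convention.

```python
import io
import tarfile
import zipfile


def _extract(data, name):
    """Return the bytes of member `name` from ZIP or TAR archive bytes."""
    try:
        # "r:*" lets tarfile detect plain, gzip, bzip2, or xz TARs.
        tf = tarfile.open(fileobj=io.BytesIO(data), mode="r:*")
    except tarfile.ReadError:
        # Not a TAR of any kind; assume it is a ZIP instead.
        with zipfile.ZipFile(io.BytesIO(data)) as zf:
            return zf.read(name)
    with tf:
        member = tf.extractfile(name)  # raises KeyError if absent
        if member is None:             # directories, devices, etc.
            raise KeyError(name)
        return member.read()


def open_text(first, *members, encoding="utf-8"):
    """Open a file, or a member nested inside archives, as text.

    first   -- path of a real file on disk (plain file or archive)
    members -- successive member names, one per archive layer
    """
    with open(first, "rb") as f:
        data = f.read()
    for name in members:
        data = _extract(data, name)
    return io.TextIOWrapper(io.BytesIO(data), encoding=encoding)
```

With this, open_text("notes.txt") and
open_text("county.tar", "roads.zip", "roads.txt") both return objects
that can be iterated line by line in the same way.  Writing back into
the archive is a separate problem that this sketch does not attempt.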
History
Date                 User   Action  Args
2007-08-23 16:01:37  admin  link    issue467924 messages
2007-08-23 16:01:37  admin  create