Message 53293 - Python tracker

Author	crhode
Date	2005-09-22.15:56:58

Content
Logged In: YES 
user_id=988879

I've been trying to read map files put out by the Census
Bureau.  These ZIP archives are downloaded from government
contractors' sites by county.  Within each county archive
are several ZIP files for each map layer (roads, streams,
waterbodies, etc).  Each contains the elements of an ESRI
shapefile database (.shp, .shx, and .dbf files).  This
doesn't make a lot of sense to me, either, because there's
no compression advantage to making an archive of an archive.
 The technique is used purely for organizational purposes
because ZIP does not compress subdirectories.

Note: I've never seen a TAR of TAR files because TAR *does*
compress subdirectories.

What I've been struggling with is a way to leave these
archives in their compressed form and still do *Python* I/O
on them.  There is a tree organization to them, after all,
just as with traditional os.path directories.  I've designed
some objects that let me retrieve the most recent file, ZIP
member, or TAR member by name from a given path to a
repository of such archives.  What I get is a StreamIO
object that I can subsequently put back where it came from.
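
For illustration only, here is a minimal sketch of reading a member out
of a ZIP stored inside another ZIP without unpacking anything to disk.
It is not the objects described above; the function name
read_nested_member and the archive names in the usage note are
hypothetical, and it assumes the inner archive is small enough to hold
in memory.

```python
import io
import zipfile


def read_nested_member(outer_path, inner_zip_name, member_name):
    """Return an in-memory stream for a member of a ZIP nested in a ZIP.

    Hypothetical helper: outer_path is a ZIP file on disk, inner_zip_name
    is a ZIP stored inside it, and member_name is a file in the inner ZIP.
    """
    with zipfile.ZipFile(outer_path) as outer:
        # ZipFile needs a seekable file object, so the inner archive is
        # read fully into memory rather than streamed.
        inner_bytes = outer.read(inner_zip_name)
    with zipfile.ZipFile(io.BytesIO(inner_bytes)) as inner:
        return io.BytesIO(inner.read(member_name))
```

Something like read_nested_member("county.zip", "roads.zip", "roads.dbf")
would then hand back a stream that can be read like an open binary file.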

What would be nice is if there already were objects
available to manipulate normal os.path directories commingled
with ZIP and TAR archives.  What would be nicer is if I/O
could be opened at the character/line level transparently
without regard to whether the source/destination was a file
or an archive member within such a structure.  In the days
of hardware compression and on-the-fly encryption/decryption
of I/O, is this too much to ask?  -ccr-
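
A rough sketch of what such transparent, read-only, line-level access
might look like with the standard zipfile, tarfile, and io modules.
The names open_text and _descend are hypothetical, and the sketch
assumes that path components after an archive name are member names
inside that archive, that every archive fits in memory, and that the
text is UTF-8 unless stated otherwise.

```python
import io
import os
import tarfile
import zipfile


def _descend(stream, rest):
    """Open stream as a ZIP or TAR; return (member_stream, remaining_parts)."""
    if zipfile.is_zipfile(stream):
        stream.seek(0)
        with zipfile.ZipFile(stream) as archive:
            names = set(archive.namelist())
            # Try successively longer member names: "a", "a/b", "a/b/c", ...
            for i in range(1, len(rest) + 1):
                name = "/".join(rest[:i])
                if name in names:
                    return io.BytesIO(archive.read(name)), rest[i:]
    else:
        stream.seek(0)
        # Raises tarfile.ReadError if the stream is not a TAR archive.
        with tarfile.open(fileobj=stream) as archive:
            for i in range(1, len(rest) + 1):
                name = "/".join(rest[:i])
                try:
                    member = archive.getmember(name)
                except KeyError:
                    continue
                if member.isfile():
                    data = archive.extractfile(member).read()
                    return io.BytesIO(data), rest[i:]
    raise FileNotFoundError("/".join(rest))


def open_text(path, encoding="utf-8"):
    """Open path for reading text, whether it names a plain file or a
    member nested inside ZIP/TAR archives along the way."""
    parts = os.path.normpath(path).split(os.sep)
    # Find the shortest prefix of the path that is a real file on disk.
    for i in range(1, len(parts) + 1):
        prefix = os.sep.join(parts[:i]) or os.sep
        if os.path.isfile(prefix):
            break
    else:
        raise FileNotFoundError(path)
    rest = parts[i:]
    with open(prefix, "rb") as f:
        stream = io.BytesIO(f.read())
    # Each remaining component lies inside an archive; keep descending.
    while rest:
        stream, rest = _descend(stream, rest)
    return io.TextIOWrapper(stream, encoding=encoding)
```

With this, iterating over open_text("repo/county.zip/roads.zip/notes.txt")
reads lines the same way as iterating over open_text("repo/notes.txt").
Writing a changed member back into its archive is a separate problem
that this sketch does not attempt.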

History
Date	User	Action	Args
2007-08-23 16:01:37	admin	link	issue467924 messages
2007-08-23 16:01:37	admin	create