Author nadeem.vawda
Recipients eric.araujo, hynek, nadeem.vawda, proyvind, tarek
Date 2012-02-22.09:04:11
SpamBayes Score 1.11022e-15
Marked as misclassified No
Message-id <1329901453.55.0.477493086331.issue5411@psf.upfronthosting.co.za>
In-reply-to
Content
> So, do you agree that “not automated but not ugly” is better than “automated with ugly klutches”?

Definitely. If we have to add special cases that are almost as long as
the original code, the "automation" seems pointless.

> Note that there is a way to get fully automated support for tar formats: tarfile could expose, in addition to the list compression_formats, another structure with the descriptions (e.g. “gzip’ed tar file”) and file extensions (e.g. ['.gz', '.tgz'] —no, it’s not '.tar.gz', which is unfortunate, and could cause Lars to reject that idea).  I’m just putting this here for reference, but my preference is still for the second idea I talk about in my precedent message.

As much as it would be nice to have all of the information centralized in
one place, I don't think this is practical - some of the data that would
be stored in this structure really is specific to shutil, so I agree that
your second option is better.

I think we can restructure the code a bit to reduce the work involved in
adding a new format, though. Maybe something like this:

    _ARCHIVE_FORMATS = {}
    _UNPACK_FORMATS = {}

    for name, tarname, progname, extensions, desc in [
            ("xz", "xztar", "xz", [".xz"], "xz'ed tar file"),
            ("bz2", "bztar", "bzip2", [".bz2"], "bzip2'ed tar file"),
            ("gz", "gztar", "gzip", [".gz", ".tgz"], "gzip'ed tar file")]:
        if name in tarfile.compression_formats:
            _ARCHIVE_FORMATS[tarname] = _make_tarball, [("compress", progname)], desc
            _UNPACK_FORMATS[tarname] = extensions, _unpack_tarfile, [], desc

By the way, is there any reason why you've used ".gz" instead of
".tar.gz" as the extension for "gztar" in your proposed change? The
current version of shutil.py uses ".tar.gz", but has ".bz2" for "bztar".


Also, I notice that _make_tarball() will need to be updated to add the
info for xz to the tar_compression and compress_ext dictionaries. It
seems a bit ugly to duplicate this information when it's already present
in the format dicts, so maybe it would be better to pass the file
extension in directly instead? I assume that _make_tarball() is not part
of shutil's public API.

If we did this, there would be no need for a separate "progname" field
in the initialization code I suggested above:

    for name, tarname, extensions, desc in [
            ("xz", "xztar", [".xz"], "xz'ed tar file"),
            ("bz2", "bztar", [".bz2"], "bzip2'ed tar file"),
            ("gz", "gztar", [".gz", ".tgz"], "gzip'ed tar file")]:
        if name in tarfile.compression_formats:
            _ARCHIVE_FORMATS[tarname] = _make_tarball, [("compress", name)], desc
            _UNPACK_FORMATS[tarname] = extensions, _unpack_tarfile, [], desc
History
Date User Action Args
2012-02-22 09:04:13nadeem.vawdasetrecipients: + nadeem.vawda, tarek, eric.araujo, proyvind, hynek
2012-02-22 09:04:13nadeem.vawdasetmessageid: <1329901453.55.0.477493086331.issue5411@psf.upfronthosting.co.za>
2012-02-22 09:04:12nadeem.vawdalinkissue5411 messages
2012-02-22 09:04:11nadeem.vawdacreate