
Author a.badger
Recipients Arfrever, a.badger, asvetlov, ezio.melotti, r.david.murray, serhiy.storchaka, stefanholek, vstinner
Date 2013-03-21.00:54:20
Message-id <1363827261.26.0.705248323617.issue16310@psf.upfronthosting.co.za>
In-reply-to
Content
Okay, here's the first version of a patch to add surrogate support to zipfile.  I think it's the minimum required to fix this bug.

When archiving, if a filename contains surrogateescape'd bytes, the patched zipfile switches to cp437 when it saves the filename into the archive.  This seems to be the strategy of other zip tools.  Nothing changes when unarchiving (probably to deal with what comes out of other tools).
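To make the archiving behaviour concrete, here is a minimal sketch of the kind of fallback described above.  It is not the code from the patch: the helper names `has_surrogate_escapes` and `encode_member_name` are hypothetical, and treating "switch to cp437" as "store the original bytes without the UTF-8 flag" is my assumption about what that means in practice.

```python
def has_surrogate_escapes(name):
    """Return True if name holds bytes smuggled in via surrogateescape.

    The surrogateescape error handler maps each undecodable byte
    0x80..0xFF to a lone surrogate in the range U+DC80..U+DCFF.
    """
    return any(0xDC80 <= ord(ch) <= 0xDCFF for ch in name)


def encode_member_name(name):
    """Hypothetical helper: return (raw_bytes, utf8_flag) for a member name.

    Assumed policy, for illustration only:
      * pure ASCII names are stored as-is, no UTF-8 flag
      * other encodable names are stored as UTF-8 with the flag set
      * names with surrogateescape'd bytes fall back to the original
        bytes with the flag unset, so readers treat them as cp437
    """
    try:
        return name.encode("ascii"), False
    except UnicodeEncodeError:
        pass
    try:
        return name.encode("utf-8"), True
    except UnicodeEncodeError:
        # Strict UTF-8 refuses lone surrogates; surrogateescape turns
        # them back into the bytes they originally stood for.
        return name.encode("utf-8", "surrogateescape"), False


# A Latin-1 filename as it might arrive from os.listdir() on a UTF-8 system.
raw = b"caf\xe9.txt"
name = raw.decode("utf-8", "surrogateescape")
stored, utf8_flag = encode_member_name(name)
```

The fallback is lossless at the byte level: cp437 as implemented in Python maps every byte value to a character, so a reader that decodes the stored name as cp437 never fails, even though the characters it shows may not be the ones the original author intended.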

The documentation is also updated to mention that unknown encodings are a problem that the zipfile module doesn't handle automatically for you.

I think we could do better, but this is a major improvement over the status quo (no more tracebacks).  Would someone care to review this for merge?  After that we could work on adding some notion of a user-specified encoding to override the cp437 encoding when dearchiving (which I think would satisfy issue10614 and issue10972).
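As a rough illustration of what a user-specified encoding could look like on the dearchiving side, the sketch below re-decodes a member name that was read as cp437.  The function `redecode_member_name` is hypothetical and not part of zipfile; the only thing it relies on from the zip format is that bit 0x800 of the general purpose flags marks a UTF-8 name.

```python
UTF8_FLAG = 0x800  # general purpose bit 11: filename is UTF-8


def redecode_member_name(filename, flag_bits, encoding):
    """Hypothetical helper: reinterpret a cp437-decoded name.

    filename  -- the name as decoded by the reader (cp437 when the
                 UTF-8 flag is unset)
    flag_bits -- the member's general purpose flag bits
    encoding  -- the encoding the user believes the archiver used
    """
    if flag_bits & UTF8_FLAG:
        # The archive declares UTF-8, so there is nothing to override.
        return filename
    # cp437 maps all 256 byte values, so encoding back recovers the
    # exact bytes stored in the archive.
    return filename.encode("cp437").decode(encoding)


# Simulate a name written by a tool that used Shift JIS without the flag.
stored_bytes = "\u65e5\u672c\u8a9e.txt".encode("shift_jis")
as_read = stored_bytes.decode("cp437")          # mojibake
fixed = redecode_member_name(as_read, 0, "shift_jis")
```

With a `ZipFile` this would be applied to each `ZipInfo`, passing its `filename` and `flag_bits`.  A wrong guess for `encoding` either raises `UnicodeDecodeError` or silently produces different mojibake, which is why the choice has to come from the user.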

The use case in issue10757 might be fixed by this patch (or by this patch plus the user-specified encoding).  I have to look a little harder at it.
History
Date User Action Args
2013-03-21 00:54:21  a.badger  set  recipients: + a.badger, vstinner, ezio.melotti, Arfrever, r.david.murray, asvetlov, stefanholek, serhiy.storchaka
2013-03-21 00:54:21  a.badger  set  messageid: <1363827261.26.0.705248323617.issue16310@psf.upfronthosting.co.za>
2013-03-21 00:54:21  a.badger  link  issue16310 messages
2013-03-21 00:54:21  a.badger  create