Author stefanholek
Recipients ezio.melotti, r.david.murray, serhiy.storchaka, stefanholek
Date 2012-10-24.14:19:18
Message-id <1351088358.34.0.886438747794.issue16310@psf.upfronthosting.co.za>
In-reply-to
Content
A little more context perhaps:

The use case is building Python distributions containing non-ASCII filenames. These seemingly "invalid" filenames can occur in real life when the files have been created by, say, a 'git clone' operation.

So yes, I have Latin-1 bytes on the filesystem, even though my locale is UTF-8. And yes, Python 3 decodes that filename using surrogates. Creating .tar.gz distributions in this situation appears to work (even re-creating the foreign bytes when the archive is later extracted), whereas .zip archives fail in the way described above.
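To illustrate the behaviour described above, here is a minimal sketch. It assumes a hypothetical filename, "café.txt" stored as Latin-1 bytes, and passes the UTF-8 encoding and the surrogateescape error handler to tarfile explicitly so the result does not depend on the local locale. It is not taken from the original report, and it only shows the decoding and the tar round trip, not the zipfile failure.

```python
import io
import tarfile

# Hypothetical filename: "café.txt" encoded as Latin-1, which is not valid UTF-8.
RAW_NAME = b"caf\xe9.txt"


def decode_name(raw: bytes) -> str:
    """Decode bytes as UTF-8, mapping undecodable bytes to lone surrogates."""
    return raw.decode("utf-8", "surrogateescape")


def tar_round_trip(name: str) -> str:
    """Write an empty member called `name` to an in-memory .tar.gz and read it back."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz",
                      encoding="utf-8", errors="surrogateescape") as tar:
        # A TarInfo with the default size of 0 needs no file object.
        tar.addfile(tarfile.TarInfo(name))
    buf.seek(0)
    with tarfile.open(fileobj=buf, mode="r:gz",
                      encoding="utf-8", errors="surrogateescape") as tar:
        return tar.getnames()[0]


name = decode_name(RAW_NAME)
print(ascii(name))                                        # shows the escaped surrogate
print(name.encode("utf-8", "surrogateescape") == RAW_NAME)  # original bytes recovered
print(tar_round_trip(name) == name)                       # name survives the archive
```

Encoding the same string with a strict UTF-8 codec raises UnicodeEncodeError, which is why code that does not use the surrogateescape handler cannot store such a name.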

I was hoping zipfile could be made to work the same as tarfile in this regard. Concerns for standards certainly didn't keep tarfile from supporting surrogates. ;-)
History
Date User Action Args
2012-10-24 14:19:18 stefanholek set recipients: + stefanholek, ezio.melotti, r.david.murray, serhiy.storchaka
2012-10-24 14:19:18 stefanholek set messageid: <1351088358.34.0.886438747794.issue16310@psf.upfronthosting.co.za>
2012-10-24 14:19:18 stefanholek link issue16310 messages
2012-10-24 14:19:18 stefanholek create