
Author jaraco
Recipients RonnyPfannschmidt, alexis, eric.araujo, jaraco, jens, mikehoy, mu_mind, tarek
Date 2011-12-20.01:41:36
Message-id <1324345298.6.0.015765671183.issue11638@psf.upfronthosting.co.za>
Content
I've created a repo to continue this work. I've integrated David's patch (thanks).

It's not obvious to me what the encoding should be. Python and the tarfile module can accept unicode filenames; it seems that only the gzip part of tarfile fails when a unicode name is passed. Encoding to 'utf-8' or to the default file system encoding doesn't seem right, because the encoded characters end up stored in the gzip archive itself. Additionally, encoding as 'utf-8' would cause the file to be created with a utf-8 filename, which would be undesirable.

So in the current repo, I've added a check that tries to convert the filename to ASCII. If it can be converted to ASCII, it is converted and passed through to tarfile. This should address the majority of users who have encountered this issue so far. Those who wish to use non-ASCII characters in project names or versions will have to use Python 3 or wait until #13639 is fixed.
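
For illustration only, the check described above might look roughly like the following. This is a minimal sketch, not the code from the patch: the helper name `to_ascii_filename` is hypothetical, and it assumes that a name which cannot be represented in ASCII is simply passed through unchanged.

```python
def to_ascii_filename(name):
    """Return `name` encoded as ASCII when possible, else unchanged.

    Hypothetical helper illustrating the check described in the message;
    it is not taken from the actual patch.
    """
    if isinstance(name, bytes):
        # Already a byte string (the native str type on Python 2).
        return name
    try:
        # Succeeds only if every character is in the ASCII range.
        return name.encode('ascii')
    except UnicodeError:
        # Non-ASCII name: leave it as-is and let the caller deal with it.
        return name
```

On Python 2 the successful branch yields a native `str`, which is what the gzip part of tarfile is reported to need; on Python 3 the same call yields `bytes`, so the sketch is only meaningful for the Python 2 case discussed here.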

Please review the enclosed patch.

Since one test fails (and is known to fail), should it be omitted? Can it remain but be marked as "expected to fail"?
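
One possible way to keep such a test while marking it, assuming the test suite can rely on the `unittest` shipped with Python 2.7 or later, is the `unittest.expectedFailure` decorator. The test class and body below are hypothetical stand-ins, not the failing test from the patch:

```python
import unittest


class NonAsciiNameTest(unittest.TestCase):
    """Hypothetical stand-in for the known-failing test."""

    @unittest.expectedFailure
    def test_non_ascii_name(self):
        # Stand-in for the real failure: a non-ASCII name cannot be
        # encoded as ASCII, so this raises UnicodeEncodeError.
        u'projet-\xe9'.encode('ascii')
```

A test marked this way is reported as an expected failure rather than as an error, and the run as a whole still counts as successful.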
History
Date User Action Args
2011-12-20 01:41:38 jaraco set recipients: + jaraco, tarek, eric.araujo, RonnyPfannschmidt, alexis, mu_mind, mikehoy, jens
2011-12-20 01:41:38 jaraco set messageid: <1324345298.6.0.015765671183.issue11638@psf.upfronthosting.co.za>
2011-12-20 01:41:38 jaraco link issue11638 messages
2011-12-20 01:41:36 jaraco create