Author vstinner
Recipients loewis, vstinner
Date 2010-04-13.23:53:12
Message-id <1271202795.91.0.928981298967.issue8390@psf.upfronthosting.co.za>
In-reply-to
Content
When reading a tar archive, tarfile decodes fields using the "replace" error handler by default. As a result, we lose information whenever a field contains an undecodable byte.

Since PEP 383, undecodable filenames are stored using surrogates in Python 3. I think it's a good idea to use surrogates for tar as well, because undecodable data in a tar archive is a common problem (see the Unicode section of the tarfile documentation).
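
To illustrate the difference, here is a minimal sketch that decodes one raw name with both error handlers. It does not touch tarfile itself; the file name and the UTF-8 archive encoding are assumptions made for the example.

```python
# Hypothetical raw name as it might sit in a tar header: Latin-1 bytes,
# which are not valid UTF-8 (0xE9 is "é" in Latin-1).
raw_name = b"caf\xe9.txt"

# "replace" substitutes U+FFFD for the bad byte, so the original byte is gone.
replaced = raw_name.decode("utf-8", "replace")

# "surrogateescape" (PEP 383) maps the bad byte 0xE9 to the lone surrogate U+DCE9.
escaped = raw_name.decode("utf-8", "surrogateescape")

# Encoding with the same handler restores the original bytes exactly.
round_trip = escaped.encode("utf-8", "surrogateescape")

print(ascii(replaced))
print(ascii(escaped))
print(round_trip == raw_name)
```

With "replace", re-encoding the name can no longer produce the original bytes, so the member cannot be written back under its real name. With surrogates, the round trip is lossless.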
History
Date User Action Args
2010-04-13 23:53:16 vstinner set recipients: + vstinner, loewis
2010-04-13 23:53:15 vstinner set messageid: <1271202795.91.0.928981298967.issue8390@psf.upfronthosting.co.za>
2010-04-13 23:53:14 vstinner link issue8390 messages
2010-04-13 23:53:13 vstinner create