
Author Roddy Shuler
Recipients Roddy Shuler
Date 2015-08-10.18:04:21
Message-id <1439229863.39.0.118048991617.issue24838@psf.upfronthosting.co.za>
In-reply-to
Content
The GNU and USTAR formats use a special case when the file path is longer than 100 bytes. The check for this case, however, compares the path length in characters rather than in bytes. So, if a path is at most 100 characters long but contains non-ASCII characters such that its encoded length is greater than 100 bytes, the special case is not triggered, the encoded string is truncated to 100 bytes, and the file name stored in the tar file is truncated.

For example...

/gt-education/Colección Educativa Guatemala/thumbs/Libro de Texto Comunicacion y Lenguaje 1 Grado.jpg

is truncated to:

/gt-education/Colección Educativa Guatemala/thumbs/Libro de Texto Comunicacion y Lenguaje 1 Grado.jp
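The mismatch in this example can be checked by hand. The sketch below is only an illustration, not the attached patch: the helper names `exceeds_by_chars` and `exceeds_by_bytes` are hypothetical, UTF-8 is assumed as the archive encoding, and the leading slash is dropped on the assumption that the name is stored as a relative path.

```python
LENGTH_NAME = 100  # size in bytes of the name field in a tar header

def exceeds_by_chars(name):
    # Hypothetical helper: the kind of check described as faulty (counts characters).
    return len(name) > LENGTH_NAME

def exceeds_by_bytes(name, encoding="utf-8", errors="surrogateescape"):
    # Hypothetical helper: counts the bytes that would actually be written.
    return len(name.encode(encoding, errors)) > LENGTH_NAME

# Example path from this report, without the leading slash (assumption).
name = ("gt-education/Colecci\u00f3n Educativa Guatemala/thumbs/"
        "Libro de Texto Comunicacion y Lenguaje 1 Grado.jpg")
encoded = name.encode("utf-8")

print(len(name), len(encoded))   # characters vs. bytes
print(exceeds_by_chars(name))    # character count does not trigger the special case
print(exceeds_by_bytes(name))    # byte count does

# Cutting the encoded name to the field size loses the final byte.
truncated = encoded[:LENGTH_NAME].decode("utf-8")
print(truncated)
```

The single non-ASCII character "ó" takes two bytes in UTF-8, so the encoded name is one byte longer than the character count suggests, and that extra byte is what gets cut off.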

The attached patch fixes this. I initially found the problem on Python 3.3. The patch was tested on Linux with Python 3.4.3-6 from Debian. From reading the source code, I am fairly confident that the problem still exists upstream in Python 3.5.
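To check whether a particular interpreter is affected, a round trip through an in-memory archive can be used. This is a minimal sketch under stated assumptions: `roundtrip_name` is a hypothetical helper, UTF-8 is passed explicitly as the archive encoding, and the example path is used without its leading slash. On an interpreter without the problem, the name read back should equal the name written.

```python
import io
import tarfile

def roundtrip_name(name, fmt):
    # Hypothetical helper: write one empty member, then read its name back.
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=fmt, encoding="utf-8") as tar:
        tar.addfile(tarfile.TarInfo(name))
    buf.seek(0)
    with tarfile.open(fileobj=buf, mode="r", encoding="utf-8") as tar:
        return tar.getnames()[0]

# Example path from this report, without the leading slash (assumption).
name = ("gt-education/Colecci\u00f3n Educativa Guatemala/thumbs/"
        "Libro de Texto Comunicacion y Lenguaje 1 Grado.jpg")

for label, fmt in (("GNU", tarfile.GNU_FORMAT), ("USTAR", tarfile.USTAR_FORMAT)):
    stored = roundtrip_name(name, fmt)
    status = "ok" if stored == name else "TRUNCATED"
    print(label, status, stored)
```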
History
Date                 User          Action  Args
2015-08-10 18:04:23  Roddy Shuler  set     recipients: + Roddy Shuler
2015-08-10 18:04:23  Roddy Shuler  set     messageid: <1439229863.39.0.118048991617.issue24838@psf.upfronthosting.co.za>
2015-08-10 18:04:23  Roddy Shuler  link    issue24838 messages
2015-08-10 18:04:23  Roddy Shuler  create