Message295974
shutil.make_archive currently just uses the default tar format, which is GNU_FORMAT.
This format doesn't ensure that all paths are encoded as UTF-8, and hence may end up embedding platform-specific encoding assumptions into the generated tarball.
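A minimal sketch of the problem, using tarfile directly rather than shutil.make_archive: the same non-ASCII member name is written with one encoding and read back with another. The helper name roundtrip_name and the choice of "latin-1" as the writer's platform encoding are assumptions made purely for illustration.

```python
import io
import tarfile


def roundtrip_name(name, fmt, write_encoding, read_encoding):
    """Write one empty member called `name`, then read its name back.

    Hypothetical helper for illustration only; not part of shutil or tarfile.
    """
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=fmt,
                      encoding=write_encoding) as tar:
        tar.addfile(tarfile.TarInfo(name))
    buf.seek(0)
    with tarfile.open(fileobj=buf, mode="r", encoding=read_encoding) as tar:
        return tar.getnames()[0]


name = "caf\u00e9.txt"

# Pretend the writer's platform encoding is latin-1 and the reader's is UTF-8.
gnu_name = roundtrip_name(name, tarfile.GNU_FORMAT, "latin-1", "utf-8")
pax_name = roundtrip_name(name, tarfile.PAX_FORMAT, "latin-1", "utf-8")

print("GNU_FORMAT preserved the name:", gnu_name == name)
print("PAX_FORMAT preserved the name:", pax_name == name)
```

With GNU_FORMAT the name bytes are stored in whatever encoding the writer used, so the reader has to guess it. With PAX_FORMAT a non-ASCII name is carried in an extended header as UTF-8, independent of the writer's encoding setting.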
I see a few possible ways of resolving this:
1. Change the default tar format to PAX_FORMAT. It's been 16 years since that was defined, and Python itself has supported it since 2.6 was released in 2008, so perhaps we can rely on other tools supporting it now? (My main open question on that front would be: what happens if you specify "format=GNU_FORMAT" when attempting to read a PAX-formatted archive?)
2. Add new shutil-level "pax", "gzpax", "bzpax", and "xzpax" format definitions to explicitly request PAX_FORMAT.
3. Add a mechanism to shutil.make_archive that allows format-dependent settings to be passed down to the underlying archive creation functions (e.g. "format=tarfile.PAX_FORMAT").
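Until one of these options exists, the effect can be approximated by calling tarfile directly. The sketch below is only a rough stand-in for what shutil.make_archive does with a root_dir: make_pax_archive is a hypothetical helper, not an existing or proposed shutil API, and it assumes a gzip-compressed archive by default.

```python
import os
import tarfile
import tempfile


def make_pax_archive(base_name, root_dir, compress="gz"):
    """Hypothetical helper: archive root_dir as base_name.tar.<compress>.

    Requests PAX_FORMAT explicitly instead of relying on the default.
    """
    archive_name = f"{base_name}.tar.{compress}"
    with tarfile.open(archive_name, f"w:{compress}",
                      format=tarfile.PAX_FORMAT) as tar:
        # Store members relative to root_dir, as "." and "./<path>".
        tar.add(root_dir, arcname=".")
    return archive_name


# Usage: archive a scratch directory, then list what was stored.
with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as out:
    with open(os.path.join(src, "hello.txt"), "w") as f:
        f.write("hello\n")
    archive = make_pax_archive(os.path.join(out, "example"), src)
    with tarfile.open(archive) as tar:
        names = tar.getnames()

print(names)
```

The archive is written outside the directory being archived so that it does not end up containing itself.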
Date | User | Action | Args
2017-06-14 01:25:39 | ncoghlan | set | recipients: + ncoghlan
2017-06-14 01:25:39 | ncoghlan | set | messageid: <1497403539.2.0.760105950979.issue30661@psf.upfronthosting.co.za>
2017-06-14 01:25:39 | ncoghlan | link | issue30661 messages
2017-06-14 01:25:38 | ncoghlan | create |