Message208655
If you live in a current-posix world, this might make sense. However, one can also argue that the filename should be *transcoded* from the tarfile encoding to the local FS filename encoding, which I believe is what we are currently doing. Which, if you are using POSIX as the locale, will fail a lot. If you use a sensible modern locale that includes utf-8, you wouldn't have a problem.
Unfortunately, the reality is probably that sometimes you want one behavior and sometimes you want the other :(
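To make the transcoding case concrete, here is a minimal sketch that uses only the built-in codecs and not tarfile itself. The filename and the Latin-1 archive encoding are assumptions chosen for illustration, and the ASCII codec with the surrogateescape handler stands in for what a POSIX ("C") locale would typically give as the filesystem encoding.

```python
# Hypothetical filename bytes, as an archive written with Latin-1 names might store them.
raw_name = "caf\u00e9.txt".encode("latin-1")  # b'caf\xe9.txt'

# Transcoding step 1: decode with the (assumed) archive encoding.
text_name = raw_name.decode("latin-1")

# Transcoding step 2a: re-encode for a UTF-8 filesystem encoding. This succeeds.
utf8_name = text_name.encode("utf-8")

# Transcoding step 2b: re-encode for an ASCII filesystem encoding, used here
# as a stand-in for the POSIX locale. A real non-ASCII character cannot be
# represented, so this raises even with the surrogateescape handler.
try:
    text_name.encode("ascii", "surrogateescape")
    posix_ok = True
except UnicodeEncodeError:
    posix_ok = False

print(utf8_name, posix_ok)
```

Under these assumptions the same name transcodes cleanly for UTF-8 but cannot be written out for the ASCII case.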
Encoding using member.encoding is probably wrong, though. If you are trying to preserve the original bytes, it is probably best to do so, and not assume that the tarfile encoding field is valid.
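As a rough illustration of the byte-preserving alternative, the sketch below round-trips undecodable filename bytes through the surrogateescape error handler. It uses only the built-in codecs. The filename is hypothetical, and UTF-8 is assumed as the decoding codec; this is not a description of what tarfile does internally.

```python
# Hypothetical filename bytes that are NOT valid UTF-8 (Latin-1 encoded e-acute).
raw_name = b"caf\xe9.txt"

# Decode without trusting any declared encoding: undecodable bytes are
# smuggled through as lone surrogate code points instead of raising.
decoded = raw_name.decode("utf-8", "surrogateescape")

# Encoding with the same codec and handler restores the original bytes exactly.
restored = decoded.encode("utf-8", "surrogateescape")

print(restored == raw_name)
```

The point of the sketch is that the original bytes survive the trip through `str` unchanged, without relying on the encoding field being correct.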
I'm adding Victor Stinner to nosy: he's thought about these issues much more deeply than I have. The answer may be that we will only support transcoding filenames in our tarfile module...and certainly it looks like doing anything else, even if we want to, would be a new feature.
Date | User | Action | Args
2014-01-21 15:33:31 | r.david.murray | set | recipients: + r.david.murray, vstinner, Laurent.Mazuel
2014-01-21 15:33:31 | r.david.murray | set | messageid: <1390318411.18.0.548313596381.issue20329@psf.upfronthosting.co.za>
2014-01-21 15:33:31 | r.david.murray | link | issue20329 messages
2014-01-21 15:33:30 | r.david.murray | create |