
Author lars.gustaebel
Recipients lars.gustaebel, lemburg, loewis, vstinner
Date 2010-06-10.18:51:57
Message-id <1276195921.05.0.32250589087.issue8784@psf.upfronthosting.co.za>
Content
Maybe I'm going out on a limb here, but I think we should reconsider what tarfile users on Windows(!) actually use it for, and under which circumstances. The following list is probably not exhaustive, but IMHO it covers 90% of the cases:

1. Download tar archives from a webpage (when no zip is supplied) for viewing or extracting.
2. Create backups for personal use.
3. Create source archives from a project for unix users who hate zipfiles.

I am convinced that the tarfile module is not very popular on Windows for a simple reason: tar archives are not popular there either. Windows users will always prefer zip archives, and hence the zipfile module, because that is what they're familiar with.

The point I am trying to make is this. First, we should not choose a default encoding based on what works best with WinRAR, 7-zip and such, because they all behave very differently, which makes that impossible. Second, we must not overemphasize the encoding issue to the point where portability is in danger. What I mean is that in almost all real-life cases there are no encoding issues. In my whole tarfile maintaining career I cannot remember a single incident of a tar archive from an external source that contained special characters. In my experience, the only tar archives that contain special characters are backups. But these backups are created and later restored on one and the same system. Again, no encoding issues.

Long story short, I still vote for utf-8, because it enables Windows users to create backups without losing special characters, and because it's ASCII-"compatible" and should be able to read 99% of the files that you get from the internet.
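
To illustrate the two claims, here is a minimal sketch, not part of any proposed patch. The helper name roundtrip_names and the sample file names are made up for the illustration, and it assumes a Python 3 tarfile where the encoding argument controls how member names are converted to bytes in a GNU-format header. It writes empty members to an in-memory archive and reads their names back:

```python
import io
import tarfile


def roundtrip_names(names, write_encoding="utf-8", read_encoding="utf-8"):
    """Hypothetical helper: store empty members under `names` in an
    in-memory tar archive, then return the names as read back."""
    buf = io.BytesIO()
    # GNU_FORMAT keeps the name as raw bytes in the header, so the
    # `encoding` argument decides how each str name is encoded.
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.GNU_FORMAT,
                      encoding=write_encoding) as tar:
        for name in names:
            info = tarfile.TarInfo(name=name)
            info.size = 0  # empty member, so no file object is needed
            tar.addfile(info)
    buf.seek(0)
    with tarfile.open(fileobj=buf, mode="r", encoding=read_encoding) as tar:
        return tar.getnames()


# Backup case: written and restored with the same encoding.
backup = roundtrip_names(["Grüße/café.txt"])

# Download case: a pure-ASCII name written as utf-8 and read with a
# different ASCII-compatible encoding.
download = roundtrip_names(["src/setup.py"], "utf-8", "cp1252")
```

In the backup case the special characters should survive because both sides use utf-8. In the download case the name should come back unchanged even though the reader uses cp1252, since ASCII bytes mean the same thing in both encodings.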
History
Date User Action Args
2010-06-10 18:52:01 lars.gustaebel set recipients: + lars.gustaebel, lemburg, loewis, vstinner
2010-06-10 18:52:01 lars.gustaebel set messageid: <1276195921.05.0.32250589087.issue8784@psf.upfronthosting.co.za>
2010-06-10 18:51:58 lars.gustaebel link issue8784 messages
2010-06-10 18:51:57 lars.gustaebel create