
Author vstinner
Recipients lars.gustaebel, vstinner
Date 2010-05-22.01:22:09
Message-id <1274491335.43.0.732437213128.issue8784@psf.upfronthosting.co.za>
Content
The mbcs encoding replaces non-encodable characters (losing information) and doesn't support the surrogateescape error handler. It ignores the error handler argument (see #850997), and tarfile now uses the surrogateescape error handler by default (#8390). This encoding is just horrible for Unicode support :-)

Since the Windows native API uses Unicode characters (UTF-16), I think it would be better to use utf-8 as the default encoding on Windows. utf-8 is able to encode and decode the full Unicode charset and supports all error handlers (especially surrogateescape).
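
To illustrate the point about error handlers, here is a minimal sketch of how utf-8 combined with surrogateescape round-trips bytes that are not valid UTF-8, while a lossy handler such as replace does not. The byte string and variable names are made up for illustration. mbcs itself is not shown because it is only available on Windows.

```python
# A file name encoded in Latin-1; the byte 0xE9 is not valid UTF-8 here.
raw = b"caf\xe9.txt"

# surrogateescape maps each undecodable byte to a lone surrogate (U+DC80..U+DCFF)
# so the original byte can be restored when encoding again.
name = raw.decode("utf-8", "surrogateescape")
restored = name.encode("utf-8", "surrogateescape")

# For comparison: "replace" substitutes U+FFFD, and the original byte is lost.
lossy = raw.decode("utf-8", "replace").encode("utf-8")

print(restored == raw)  # the round trip preserves the bytes
print(lossy == raw)     # the lossy handler does not
```

An encoding that ignores the errors argument cannot offer this round trip, which is the problem described for mbcs above.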

The attached patch sets the default encoding to utf-8 on Windows and removes the "ENCODING is None" test, because sys.getfilesystemencoding() can no longer return None (in 3.2 only; it's a recent change: #8610).
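
As a rough sketch of the kind of change described (not the attached patch itself), the default could be selected along these lines. The helper name pick_default_encoding and its os_name parameter are hypothetical, introduced only so the logic can be exercised on any platform.

```python
import os
import sys

def pick_default_encoding(os_name=os.name):
    """Hypothetical helper: choose a default encoding for archive member names."""
    if os_name == "nt":
        # On Windows, prefer utf-8 over mbcs, as proposed in this issue.
        return "utf-8"
    # Elsewhere, keep following the filesystem encoding. No None check is
    # needed, assuming sys.getfilesystemencoding() never returns None (3.2+).
    return sys.getfilesystemencoding()

ENCODING = pick_default_encoding()
print(ENCODING)
```

Calling pick_default_encoding("nt") returns "utf-8" regardless of the platform the sketch runs on.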
History
Date                 User      Action  Args
2010-05-22 01:22:15  vstinner  set     recipients: + vstinner, lars.gustaebel
2010-05-22 01:22:15  vstinner  set     messageid: <1274491335.43.0.732437213128.issue8784@psf.upfronthosting.co.za>
2010-05-22 01:22:13  vstinner  link    issue8784 messages
2010-05-22 01:22:12  vstinner  create