
Author loewis
Recipients aronacher, carljm, loewis, vstinner
Date 2011-12-14.21:07:11
Message-id <4EE90FFE.8000908@v.loewis.de>
In-reply-to <1323895551.16.0.870771052965.issue11574@psf.upfronthosting.co.za>
Content
> One might say, "ok, this is a bug in distutils/distribute, it should
> explicitly specify UTF-8 encoding when writing egg-info." But if this
> is a sensible thing for distutils/distribute to do, regardless of
> user locale, why would it not be equally sensible for Python itself
> to have the default output encoding always be UTF-8 (with the ability
> for a developer who wants to support arbitrary user locale to
> explicitly do so)?

The file encoding is part of the file format. Just as Python can't know
what the file format is (else it could allow writing, say, dictionaries
to a file), it can't know what the file encoding is, either - there is
a need to guess. distutils *does* know the format, so it's clearly a
bug in distutils and not in Python.
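
To illustrate the point, here is a minimal sketch of a writer that knows its own file format and therefore names the encoding explicitly instead of relying on the locale. The function names write_metadata and read_metadata are hypothetical and are not distutils APIs; UTF-8 is assumed as the format's encoding.

```python
import os
import tempfile


def write_metadata(path, text):
    """Write text to path, always as UTF-8 (hypothetical helper)."""
    # The encoding is part of the file format, so it is stated here
    # rather than inherited from the user's locale.
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)


def read_metadata(path):
    """Read the file back using the same, explicitly named encoding."""
    with open(path, "r", encoding="utf-8") as f:
        return f.read()


if __name__ == "__main__":
    fd, scratch = tempfile.mkstemp()
    os.close(fd)
    try:
        write_metadata(scratch, "Author: L\u00f6wis")
        print(read_metadata(scratch))
    finally:
        os.remove(scratch)
```

With the encoding spelled out, the bytes on disk are the same regardless of the locale the program runs under.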

The Zen says "In the face of ambiguity, refuse the temptation to guess."
From that point of view, Python should just refuse to open files in text
mode with no encoding specified. However, it also says "Although
practicality beats purity", which brings us back to guessing.

Guessing the "best" file encoding is really tricky, and Python has
chosen to use the locale's encoding. That can't be changed anymore
(except perhaps by PEP) since it would be an incompatible change.
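
The guess described above can be observed with a short sketch. It assumes only the standard library; the helper names locale_default_encoding and encoding_used_by_open are made up for illustration, and what they report depends on the platform and on how the interpreter is configured.

```python
import codecs
import locale
import os
import tempfile


def locale_default_encoding():
    """Return the encoding the locale suggests for text data."""
    return locale.getpreferredencoding(False)


def encoding_used_by_open(**open_kwargs):
    """Open a scratch file in text mode and report the encoding chosen."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "w", **open_kwargs) as f:
            return f.encoding
    finally:
        os.remove(path)


if __name__ == "__main__":
    print("locale suggests:", locale_default_encoding())
    # No encoding given: Python has to guess.
    print("open() without encoding:", encoding_used_by_open())
    # Encoding given: no guessing involved.
    print("open() with encoding:", encoding_used_by_open(encoding="utf-8"))
```

Passing encoding explicitly is the only one of these calls whose result does not vary with the environment.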
History
Date User Action Args
2011-12-14 21:07:12 loewis set recipients: + loewis, vstinner, aronacher, carljm
2011-12-14 21:07:11 loewis link issue11574 messages
2011-12-14 21:07:11 loewis create