
Author ezio.melotti
Recipients ezio.melotti, georg.brandl, lemburg, mrabarnett, pitrou
Date 2009-05-03.07:07:17
Message-id <1241334441.64.0.193171835678.issue5902@psf.upfronthosting.co.za>
Content
Actually, I'd like to have some kind of convention mainly for when the
user writes the encoding as a string literal, e.g. s.encode('utf-8').
If the encoding comes from a web page or some other external source, it
does make sense to have some flexibility.

I think that 'utf-8' is the most widely used name for the UTF-8 codec,
yet it's not even mentioned in the table of standard encodings. So
some people will use 'utf-8', others 'utf_8', and some users might even
pick one of the aliases, like 'U8'.
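
To illustrate the point, here is a minimal sketch showing that these
spellings all reach the same codec through codecs.lookup(). It assumes a
current Python 3 interpreter, and the helper name canonical_name is made
up for this example.

```python
import codecs

def canonical_name(spelling):
    """Return the name the codec registry reports for a spelling.

    Hypothetical helper, used only for this illustration.
    """
    return codecs.lookup(spelling).name

# Spellings a user might plausibly type for the UTF-8 codec.
spellings = ['utf-8', 'utf_8', 'UTF-8', 'U8']
names = {s: canonical_name(s) for s in spellings}

for spelling, name in names.items():
    print(f'{spelling!r} -> {name!r}')

# True when every spelling resolved to one and the same codec name.
same_codec = len(set(names.values())) == 1
```

Since all of them work, nothing in the interpreter nudges users towards
one spelling, which is why a documented preferred form would help.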

It is probably enough to add 'utf-8', 'iso-8859-1' and similar names as
the "preferred form", and to explain why and how the codec names are
normalized and which aliases are valid.

Regarding the ambiguity of 'UTF': it is not the only ambiguous alias;
there's also 'LATIN' among the aliases of ISO-8859-1.
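
For reference, the aliases can be listed from the encodings.aliases
table. This is only a sketch, assuming a current Python 3 where that
table maps normalized alias names to codec module names such as 'utf_8'
and 'latin_1'; the helper aliases_for is hypothetical.

```python
import codecs
from encodings.aliases import aliases

def aliases_for(module_name):
    """List the alias spellings that map to a given codec module name.

    Hypothetical helper, used only for this illustration.
    """
    return sorted(a for a, target in aliases.items() if target == module_name)

utf8_aliases = aliases_for('utf_8')
latin1_aliases = aliases_for('latin_1')

print(utf8_aliases)
print(latin1_aliases)

# The lookup is case-insensitive, so 'LATIN' reaches the ISO-8859-1 codec.
latin_resolves = (
    codecs.lookup('LATIN').name == codecs.lookup('iso-8859-1').name
)
```

Both 'utf' and 'latin' appear in these lists even though each name
could refer to a whole family of encodings.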