Message87034
Actually, I'd like to have some kind of convention mainly for the case
where the user writes the encoding as a string literal, e.g.
s.encode('utf-8'). If the encoding instead comes from a webpage or some
other external source, it makes sense to keep some flexibility.
I think that 'utf-8' is the most widely used name for the UTF-8 codec,
yet it's not even mentioned in the table of the standard encodings. So
some users will write 'utf-8', others 'utf_8', and some could even pick
one of the aliases, like 'U8'.
It is probably enough to add 'utf-8', 'iso-8859-1' and similar names as
the "preferred form", and to explain why and how the codec names are
normalized and which aliases are valid.
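
To illustrate the normalization mentioned above, here is a minimal sketch using codecs.lookup() from the standard library. The list of spellings is just the set of names quoted in this message; the variable names are illustrative only.

```python
import codecs

# Spellings quoted in this message, plus an upper-case variant.
spellings = ["utf-8", "utf_8", "UTF-8", "U8"]

# codecs.lookup() normalizes the given name and resolves aliases;
# the returned CodecInfo exposes the codec's own name as .name.
canonical = {name: codecs.lookup(name).name for name in spellings}

for name, resolved in canonical.items():
    print(f"{name!r} -> {resolved!r}")

# All spellings produce the same bytes when used with str.encode().
encoded = {name: "caf\u00e9".encode(name) for name in spellings}
```

All of these spellings resolve to one codec, which is why documenting a single preferred form would not break code that already uses the other names.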
Regarding the ambiguity of 'UTF': it is not the only ambiguous alias,
there's also 'LATIN' among the aliases of ISO-8859-1.
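
The aliases in question can be inspected directly. The sketch below reads the alias table shipped in encodings.aliases; aliases_for is a hypothetical helper written for this example, and the exact set of aliases may differ between Python versions.

```python
from encodings.aliases import aliases


def aliases_for(codec_module):
    """Return the sorted alias names that map to the given codec module name.

    Hypothetical helper for illustration; not part of the standard library.
    """
    return sorted(alias for alias, target in aliases.items()
                  if target == codec_module)


# Keys in the alias table are already normalized (lower case, underscores).
utf8_aliases = aliases_for("utf_8")
latin1_aliases = aliases_for("latin_1")

print("utf_8:", utf8_aliases)
print("latin_1:", latin1_aliases)
```

Both 'utf' and 'latin' show up in the output even though each name could plausibly refer to several different encodings.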

Date | User | Action | Args
2009-05-03 07:07:22 | ezio.melotti | set | recipients: + ezio.melotti, lemburg, georg.brandl, pitrou, mrabarnett
2009-05-03 07:07:21 | ezio.melotti | set | messageid: <1241334441.64.0.193171835678.issue5902@psf.upfronthosting.co.za>
2009-05-03 07:07:20 | ezio.melotti | link | issue5902 messages
2009-05-03 07:07:18 | ezio.melotti | create |