
Author lemburg
Recipients belopolsky, ezio.melotti, georg.brandl, lemburg, mrabarnett, pitrou
Date 2011-02-24.09:20:37
Message-id <4D6622E4.7010003@egenix.com>
In-reply-to <1298513419.85.0.00455767376339.issue5902@psf.upfronthosting.co.za>
Content
Alexander Belopolsky wrote:
> 
> Alexander Belopolsky <belopolsky@users.sourceforge.net> added the comment:
> 
>> Accepting all common forms for
>> encoding names means that you can usually give Python an encoding name
>> from, e.g. a HTML page, or any other file or system that specifies an
>> encoding.
> 
> I don't buy this argument.  Running attached script on http://www.iana.org/assignments/character-sets shows that there are hundreds of registered charsets that are not accepted by python:
> 
> $ ./python.exe iana.py| wc -l
>      413
> 
> Any serious HTML or XML processing software should be based on the IANA character-sets file rather than on the ad-hoc list of aliases that made it into encodings/aliases.py.

Let's do a reality check:

How often do you see requests for additions to the aliases we
have in Python? Perhaps one a year, if that.

We take great care not to add aliases that are not in common
use or that do not have a proven track record of really being
compatible with the codec in question.

If you think we are missing some aliases, please open tickets
for them, indicating why these should be added.

If you really want complete IANA coverage, I suggest you create
a normalization module which maps the IANA names to our names
and upload it to PyPI.
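
A minimal sketch of what such a normalization module might look like is
given here. The table IANA_TO_PYTHON and the function normalize_charset
are hypothetical names, and the two entries are illustrative only; a real
module would presumably generate the table from the IANA character-sets
file. Only codecs.lookup() is part of the standard library.

```python
import codecs

# Hypothetical table: lower-cased IANA charset names -> Python codec names.
# The entries below are illustrative; a real module would build this table
# from the IANA character-sets registry.
IANA_TO_PYTHON = {
    "extended_unix_code_packed_format_for_japanese": "euc_jp",
    "cseucpkdfmtjapanese": "euc_jp",
}


def normalize_charset(name):
    """Return the name of the Python codec for an IANA charset name.

    Names missing from the table are passed to codecs.lookup() unchanged,
    so aliases Python already knows keep working. Unknown names raise
    LookupError, as codecs.lookup() does.
    """
    key = name.strip().lower()
    python_name = IANA_TO_PYTHON.get(key, key)
    return codecs.lookup(python_name).name
```

Calling normalize_charset() on a charset label taken from, say, an HTML
page would then give a name that can be passed to str.encode() or
bytes.decode(); labels that are neither in the table nor known to Python
still fail with LookupError.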
History
Date User Action Args
2011-02-24 09:20:45 lemburg set recipients: + lemburg, georg.brandl, belopolsky, pitrou, ezio.melotti, mrabarnett
2011-02-24 09:20:37 lemburg link issue5902 messages
2011-02-24 09:20:37 lemburg create