Message129280
I think that the normalization function in unicodeobject.c (only used for internal functions) can skip any character other than a-z, A-Z and 0-9. Something like:
>>> import re
>>> def normalize(name): return re.sub("[^a-z0-9]", "", name.lower())
...
>>> normalize("UTF-8")
'utf8'
>>> normalize("ISO-8859-1")
'iso88591'
>>> normalize("latin1")
'latin1'
So ISO-8859-1, ISO8859-1, LATIN-1, latin1, UTF-8, utf8, etc. will be normalized to iso88591, latin1 and utf8.
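To make the grouping explicit, here is a small sketch that buckets a few spellings by their normalized key. The helper name `group_by_normalized` is hypothetical, and `normalize` is the same illustrative Python function as above, not the C code in unicodeobject.c.

```python
import re
from collections import defaultdict


def normalize(name):
    # Lowercase, then drop every character that is not a-z or 0-9.
    return re.sub("[^a-z0-9]", "", name.lower())


def group_by_normalized(names):
    # Hypothetical helper: map each normalized key to the spellings producing it.
    groups = defaultdict(list)
    for name in names:
        groups[normalize(name)].append(name)
    return dict(groups)


spellings = ["ISO-8859-1", "ISO8859-1", "LATIN-1", "latin1", "UTF-8", "utf8"]
groups = group_by_normalized(spellings)
```

Note that iso88591 and latin1 remain two separate keys: the normalization only removes punctuation and case differences, so an alias table is still needed to treat them as the same encoding.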
I don't know of any encoding name where a character outside a-z, A-Z, 0-9 means anything special. But I don't know all encoding names! :-)
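One way to test that assumption, at least for the aliases Python already ships, might be to normalize every key of `encodings.aliases.aliases` and look for normalized names that point at more than one codec. This is only a sketch: `find_conflicts` is a hypothetical helper, and it covers the alias table only, not every encoding name that exists.

```python
import re
from encodings.aliases import aliases


def normalize(name):
    # Lowercase, then drop every character that is not a-z or 0-9.
    return re.sub("[^a-z0-9]", "", name.lower())


def find_conflicts(alias_map):
    # Hypothetical check: collect normalized keys whose aliases map to
    # more than one codec module name.
    targets = {}
    for alias, codec_name in alias_map.items():
        targets.setdefault(normalize(alias), set()).add(codec_name)
    return {key: sorted(names) for key, names in targets.items() if len(names) > 1}


# An empty result would mean no shipped alias relies on punctuation
# to be told apart from another one.
conflicts = find_conflicts(aliases)
```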

Date | User | Action | Args
2011-02-24 16:20:10 | vstinner | set | recipients: + vstinner, lemburg, jcea, belopolsky, ezio.melotti, eric.araujo, sdaoden
2011-02-24 16:20:10 | vstinner | set | messageid: <1298564410.15.0.973479289946.issue11303@psf.upfronthosting.co.za>
2011-02-24 16:20:06 | vstinner | link | issue11303 messages
2011-02-24 16:20:06 | vstinner | create |