Author benjamin.peterson
Recipients Arfrever, benjamin.peterson, lemburg, loewis, serhiy.storchaka
Date 2017-03-10.07:37:03
Do you believe this program should work?

import locale

for l in open("/usr/share/i18n/SUPPORTED"):
    alias, encoding = l.strip().split()
    locale.setlocale(locale.LC_ALL, alias)
    try:
        enc = locale.getlocale()[1]
    except ValueError:
        continue # not in table
    # Rewrite getlocale()'s spelling into the form glibc reports.
    normalized = enc.replace("ISO", "ISO-"). \
                     replace("_", "-"). \
                     replace("euc", "EUC-"). \
                     replace("big5", "big5-").upper()
    assert normalized == locale.nl_langinfo(locale.CODESET)

After my change it does: the encoding returned from getlocale() is the one actually being used by glibc. It fails dramatically on earlier versions of Python (for example, on the en_IN case from #29571). I don't understand why Python needs to editorialize whatever choices libc or the system administrator has made.

Is getlocale() expected to return something different from the underlying C locale?
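One way to check this for whatever locale is currently active is sketched below. It is only an illustration: the helper names same_codec and compare_current_locale are made up for this example, and using codecs.lookup to decide whether two spellings name the same encoding reflects Python's own alias table, not glibc's. It is a swapped-in alternative to the string replacements in the program above, not part of the proposed change.

```python
import codecs
import locale


def same_codec(a, b):
    """Return True if Python resolves both names to the same codec.

    Hypothetical helper: unknown names count as "not the same".
    """
    try:
        return codecs.lookup(a).name == codecs.lookup(b).name
    except LookupError:
        return False


def compare_current_locale():
    """Report getlocale()'s encoding next to the one libc reports.

    Returns (table_encoding, libc_codeset, agree). Unix-only, because
    locale.nl_langinfo is not available on every platform.
    """
    try:
        table_enc = locale.getlocale(locale.LC_CTYPE)[1]
    except ValueError:
        table_enc = None  # locale name could not be parsed
    libc_enc = locale.nl_langinfo(locale.CODESET)
    agree = table_enc is not None and same_codec(table_enc, libc_enc)
    return table_enc, libc_enc, agree


if __name__ == "__main__":
    print(compare_current_locale())
```

If the third element is False for a locale that libc accepts, getlocale() and the C library disagree about the encoding for that locale.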

In fact, why have this table at all instead of using nl_langinfo to return the encoding for the current locale?
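A minimal sketch of that table-free approach, assuming a Unix platform where locale.nl_langinfo exists. The function name codeset_for is hypothetical, and the save and restore around setlocale is only there so the example does not leave the process in a different locale.

```python
import locale


def codeset_for(name):
    """Return the codeset the C library reports for locale `name`.

    Hypothetical helper: temporarily switches LC_CTYPE, asks libc via
    nl_langinfo(CODESET), then restores the previous setting.
    Raises locale.Error if the locale is not available on this system.
    """
    saved = locale.setlocale(locale.LC_CTYPE)  # query, does not change it
    try:
        locale.setlocale(locale.LC_CTYPE, name)
        return locale.nl_langinfo(locale.CODESET)
    finally:
        locale.setlocale(locale.LC_CTYPE, saved)


if __name__ == "__main__":
    # The "C" locale always exists; the exact string depends on the libc.
    print(codeset_for("C"))
```

Note that setlocale affects the whole process, so a helper like this is not safe to call while other threads depend on the locale.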
Date                 User               Action  Args
2017-03-10 07:37:04  benjamin.peterson  set     recipients: + benjamin.peterson, lemburg, loewis, Arfrever, serhiy.storchaka
2017-03-10 07:37:04  benjamin.peterson  set     messageid: <>
2017-03-10 07:37:04  benjamin.peterson  link    issue20087 messages
2017-03-10 07:37:03  benjamin.peterson  create