Issue38805
Created on 2019-11-14 23:45 by markgrandi, last changed 2022-04-11 14:59 by admin.
Messages (2) | |||
---|---|---|---|
msg356637 - (view) | Author: Mark Grandi (markgrandi) * | Date: 2019-11-14 23:45 | |
It seems that something in Windows 10, Python 3.8, or both has changed, and `locale.getlocale()` is now returning strange results.

According to the documentation (https://docs.python.org/3/library/locale.html?highlight=locale%20getlocale#locale.getlocale), the language code should be in RFC1766 format:

    Language-Tag = Primary-tag *( "-" Subtag )
    Primary-tag = 1*8ALPHA
    Subtag = 1*8ALPHA

    Whitespace is not allowed within the tag.

But in Python 3.8, I am getting a language code that doesn't meet the RFC1766 spec:

    PS C:\Users\auror> py -3
    Python 3.8.0 (tags/v3.8.0:fa919fd, Oct 14 2019, 19:37:50) [MSC v.1916 64 bit (AMD64)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import platform; platform.platform()
    'Windows-10-10.0.18362-SP0'
    >>> import locale; locale.getlocale(); locale.getdefaultlocale()
    ('English_United States', '1252')
    ('en_US', 'cp1252')
    >>>

On the same machine, with Python 3.7.4:

    PS C:\Python37> .\python.exe
    Python 3.7.4 (tags/v3.7.4:e09359112e, Jul 8 2019, 20:34:20) [MSC v.1916 64 bit (AMD64)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import platform; platform.platform()
    'Windows-10-10.0.18362-SP0'
    >>> import locale; locale.getlocale(); locale.getdefaultlocale()
    (None, None)
    ('en_US', 'cp1252')
    >>>

It is also interesting that in Python 3.8 the encoding differs between `locale.getlocale()` and `locale.getdefaultlocale()` ('1252' versus 'cp1252'). This might not be related, though, as it was present in Python 3.7.4.

These issues might be related; I found them when searching for 'locale' bugs:

    https://bugs.python.org/issue26024
    https://bugs.python.org/issue37945 |
|||
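The RFC1766 grammar quoted in the report can be checked mechanically. The following is a minimal sketch, not anything the `locale` module provides: the helper name `is_rfc1766_tag` and its `separator` parameter are hypothetical and exist only for illustration. Under the strict grammar, `'en_US'` is also rejected because it uses an underscore, so the sketch lets the caller swap in that separator.

```python
import re


def is_rfc1766_tag(tag, separator="-"):
    """Return True if `tag` matches Primary-tag *( separator Subtag ).

    Each part must be 1 to 8 ASCII letters, as in the grammar quoted above.
    The `separator` parameter is an assumption for illustration: RFC1766
    itself only allows "-", while Python locale codes use "_".
    """
    part = r"[A-Za-z]{1,8}"
    pattern = rf"{part}(?:{re.escape(separator)}{part})*"
    return re.fullmatch(pattern, tag) is not None


if __name__ == "__main__":
    for candidate in ("en-US", "en_US", "English_United States"):
        print(candidate, is_rfc1766_tag(candidate), is_rfc1766_tag(candidate, "_"))
```

With either separator, `'English_United States'` fails because of the embedded space, which is the mismatch the report points at.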
msg373896 - (view) | Author: Riccardo Polignieri (ricpol) | Date: 2020-07-18 12:11 | |
> `locale.getlocale()` is now returning strange results

Not really "strange results": the fact is that "getlocale()" now returns the locale name *as if* it were already set from the beginning (because it is, at least in part).

Before:

    >>> import locale  # Python 3.7, new shell
    >>> locale.getlocale()
    (None, None)
    >>> locale.setlocale(locale.LC_ALL, '')  # say Hi from Italy
    'Italian_Italy.1252'
    >>> locale.getlocale()
    ('Italian_Italy', '1252')

Now:

    >>> import locale  # Python 3.8, new shell
    >>> locale.getlocale()
    ('Italian_Italy', '1252')

As for why returned locale names are "a little different" on Windows, I found no better explanation than Eryk Sun's essays in https://bugs.python.org/issue37945. Long story short, it's not even a bug anymore... it's a hot mess and it won't be solved anytime soon.

But that is not the problem at hand here. Returned locale names have not changed between 3.7 and 3.8. What *has* changed, though, is that Python on Windows now appears to set the locale implicitly, right from the start. Except that maybe it does not, really:

    >>> import locale  # Python 3.8, new shell
    >>> locale.getlocale()
    ('Italian_Italy', '1252')
    >>> locale.localeconv()
    {'int_curr_symbol': '', 'currency_symbol': '', 'mon_decimal_point': '', 'mon_thousands_sep': '', 'mon_grouping': [], 'positive_sign': '', 'negative_sign': '', 'int_frac_digits': 127, 'frac_digits': 127, 'p_cs_precedes': 127, 'p_sep_by_space': 127, 'n_cs_precedes': 127, 'n_sep_by_space': 127, 'p_sign_posn': 127, 'n_sign_posn': 127, 'decimal_point': '.', 'thousands_sep': '', 'grouping': []}

As you can see, we have an Italian locale in name only: the conventions are still those of the default C locale. If we explicitly set the locale, on the other hand...

    >>> locale.setlocale(locale.LC_ALL, '')
    'Italian_Italy.1252'
    >>> locale.localeconv()
    {'int_curr_symbol': 'EUR', 'currency_symbol': '€', ... ... }

... now we enjoy a real Italian locale - pizza, pasta, gelato and all.

What happened?
Unfortunately, this change of behaviour is NOT documented, except for a passing note here: https://docs.python.org/3/whatsnew/changelog.html#id144. It's buried *very* deep:

    """
    bpo-34485: On Windows, the LC_CTYPE is now set to the user preferred locale at startup. Previously, the LC_CTYPE locale was “C” at startup, but changed when calling setlocale(LC_CTYPE, “”) or setlocale(LC_ALL, “”).
    """

This explains... something. Python now pre-sets *only* the LC_CTYPE category, and that's why the other conventions remain unchanged. Unfortunately, setting *that* one category changes the result yielded by locale.getlocale(). But this is not a bug either, because it's the same behaviour you would have in Python 3.7:

    >>> locale.setlocale(locale.LC_CTYPE, '')  # Python 3.7
    'Italian_Italy.1252'
    >>> locale.getlocale()
    ('Italian_Italy', '1252')

...and that's because locale.getlocale() with no arguments defaults, wait for it, to getlocale(category=LC_CTYPE), as documented!

So, why does Python 3.8 now pre-set LC_CTYPE on Windows? Apparently, bpo-34485 is part of the ongoing Shakespearean feud between Victor Stinner and the Python locale code. If you squint hard enough, you will see the answer here: https://vstinner.github.io/locale-bugfixes-python3.html but at this point, I don't know if anyone is still keeping score.

To sum up:

- there's nothing new about locale names - still the same mess;
- if locale names as returned by locale.getlocale() bother you, you should follow Victor's advice here: https://bugs.python.org/issue37945#msg361806. Use locale.setlocale(category, None) instead;
- if you relied on getlocale() with no arguments to test your locale, assuming that a locale is either unset or "fully set", then you should stop now. A locale can also be "partially set", and in fact that is exactly what now happens on Windows by default. You should test for a specific category instead;
- changing the way the locale is set by default on Windows can be... rather surprising and can lead to misunderstandings. I would certainly add a note in the locale documentation to explain this new behaviour. |
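The advice to test a specific category with `locale.setlocale(category, None)` can be sketched as follows. This is only an illustration under stated assumptions: the helper names `current_category_locales` and `categories_differ` are hypothetical, and the category list leaves out `LC_MESSAGES` on the assumption that it may not be defined on Windows.

```python
import locale

# Categories assumed to exist on every platform; LC_MESSAGES is left out
# because it may not be defined on Windows.
CATEGORY_NAMES = ("LC_CTYPE", "LC_COLLATE", "LC_TIME", "LC_MONETARY", "LC_NUMERIC")


def current_category_locales():
    """Read each category's raw setting.

    Passing None as the second argument only queries the current setting;
    it does not change the locale.
    """
    return {
        name: locale.setlocale(getattr(locale, name), None)
        for name in CATEGORY_NAMES
    }


def categories_differ(settings):
    """Return True when the categories do not all share one setting,
    i.e. the locale is "partially set" in the sense used above."""
    return len(set(settings.values())) > 1


if __name__ == "__main__":
    settings = current_category_locales()
    for name, value in settings.items():
        print(f"{name}: {value}")
    print("partially set:", categories_differ(settings))
```

On a fresh Python 3.8 shell on Windows one would expect, going by the changelog note quoted above, that only LC_CTYPE differs from "C"; the actual strings depend on the machine, so no output is shown here.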
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:59:23 | admin | set | github: 82986 |
2020-07-18 12:11:43 | ricpol | set | nosy: + ricpol messages: + msg373896 |
2019-11-15 20:48:00 | terry.reedy | set | nosy: + lemburg |
2019-11-14 23:45:29 | markgrandi | create |