_decimal: Implement the previously rejected changes from #7442. #79878
The decimal module does not support formatting a number with the "n" format type if the LC_NUMERIC locale uses a different encoding than the LC_CTYPE locale. Example with the attached decimal_locale.py on Fedora 29 with Python 3.7.2:
$ python3 decimal_locale.py
LC_NUMERIC locale: uk_UA.koi8u
decimal_point: ',' = ',' = U+002c
thousands_sep: '\xa0' = '\xa0' = U+00a0
Traceback (most recent call last):
File "/home/vstinner/decimal_locale.py", line 16, in <module>
text = format(num, "n")
ValueError: invalid decimal point or unsupported combination of LC_CTYPE and LC_NUMERIC
The attached PR modifies the _decimal module to support this corner case.
Note: I already wrote PR 5191 last year, but I abandoned it in the meantime.
--
Supporting a non-ASCII decimal point and thousands separator has a long history and a list of now-fixed issues:
I even wrote an article about these bugs :-)
Python 3.7.2 now supports different encodings for the LC_NUMERIC, LC_MONETARY and LC_CTYPE locales. format(int, "n") temporarily sets LC_CTYPE to LC_NUMERIC to decode decimal_point and thousands_sep from the correct encoding. The LC_CTYPE locale is only changed if it is different from the LC_NUMERIC locale and if the decimal point and/or thousands separator is non-ASCII. It's implemented in this function:
int
_Py_GetLocaleconvNumeric(struct lconv *lc,
                         PyObject **decimal_point, PyObject **thousands_sep)
This function is used by locale.localeconv() and format() (for the "n" type).
I decided to fix the bug when I was fixing other locale bugs because we had received enough bug reports by then. Copy of my msg309980:
"""
For the past 10 years, I have repeated to every single user I met that "Python 3 is right, your system setup is wrong". But that's a waste of time. People continue to associate Python 3 and Unicode with annoying bugs, because they don't understand how locales work. Instead of having to repeat to each user that "hum, maybe your config is wrong", I prefer to support this unconventional setup and work as expected ("it just works"). With my latest implementation, setlocale() is only called when LC_CTYPE and LC_NUMERIC are different, which is the corner case which "shouldn't occur in practice". |
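For readers who want the rule from the comment above in concrete form, here is a minimal Python sketch of the decision: LC_CTYPE is switched only when the two locales differ and at least one separator is non-ASCII. This is not the C code of _Py_GetLocaleconvNumeric(); the names needs_ctype_switch and temporary_ctype are hypothetical helpers made up for illustration, and the separators are assumed to be the raw bytes that the C localeconv() would return.

```python
import locale
from contextlib import contextmanager


def needs_ctype_switch(ctype_locale, numeric_locale, decimal_point, thousands_sep):
    """Return True only in the corner case discussed in this issue.

    Hypothetical helper: decimal_point and thousands_sep are assumed to be
    the raw, still-encoded bytes of the LC_NUMERIC locale.
    """
    if ctype_locale == numeric_locale:
        return False  # same locale, same encoding: nothing to do
    # Pure-ASCII separators decode the same way in any ASCII-compatible
    # encoding, so the switch is only needed for non-ASCII bytes.
    return not (decimal_point.isascii() and thousands_sep.isascii())


@contextmanager
def temporary_ctype(new_locale):
    """Set LC_CTYPE to new_locale for the duration of the with-block."""
    old_locale = locale.setlocale(locale.LC_CTYPE)  # no second argument: query only
    locale.setlocale(locale.LC_CTYPE, new_locale)
    try:
        yield
    finally:
        # Always restore the previous LC_CTYPE, even if decoding failed.
        locale.setlocale(locale.LC_CTYPE, old_locale)
```

Keeping the decision separate from the setlocale() call matters because setlocale() changes process-wide state and is not thread-safe on most systems, so the switch should happen as rarely and as briefly as possible.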
Oh, I was wrong: bpo-25812 has not been fixed yet. |
Since bpo-7442 (again, *I* discovered this and it is *mentioned* in the |
Ok, I wrote PR 11474. Correct result with this PR:
$ ./python decimal_locale.py
LC_NUMERIC locale: uk_UA.koi8u
decimal_point: ',' = ',' = U+002c
thousands_sep: '\xa0' = '\xa0' = U+00a0
format: '1\xa0200,5' = 1 200,5 = U+0031 U+00a0 U+0032 U+0030 U+0030 U+002c U+0035 |
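The attached decimal_locale.py is not reproduced in this thread. The following is only a rough sketch of a script that would print output of a similar shape, under several assumptions: LC_CTYPE is set to en_US.UTF-8, the formatted number is Decimal("1200.5") (inferred from the output above), and the helper describe() is hypothetical. Both locales must be installed on the system; otherwise the script just reports that and exits.

```python
import decimal
import locale


def describe(text):
    """Format a string as "escaped repr = raw text = code points"."""
    code_points = " ".join("U+%04x" % ord(ch) for ch in text)
    return "%s = %s = %s" % (ascii(text), text, code_points)


def main(ctype_locale="en_US.UTF-8", numeric_locale="uk_UA.koi8u"):
    # Assumption: LC_CTYPE uses UTF-8 while LC_NUMERIC uses KOI8-U.
    try:
        locale.setlocale(locale.LC_ALL, ctype_locale)
        locale.setlocale(locale.LC_NUMERIC, numeric_locale)
    except locale.Error as exc:
        print("locale not available:", exc)
        return
    print("LC_NUMERIC locale:", locale.setlocale(locale.LC_NUMERIC))
    lc = locale.localeconv()
    print("decimal_point:", describe(lc["decimal_point"]))
    print("thousands_sep:", describe(lc["thousands_sep"]))
    num = decimal.Decimal("1200.5")  # assumed value, inferred from the output
    try:
        print("format:", describe(format(num, "n")))
    except ValueError as exc:
        # Behaviour reported in this issue for the unpatched _decimal module.
        print("format failed:", exc)


if __name__ == "__main__":
    main()
```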
Stefan Krah:
I closed bpo-7442 in the meantime, so I opened this new issue specific to the decimal module.
Well, here you are :-) A bug report about decimal. I'll let you decide what to do with this bug. I wrote a fix. It's up to you to merge it, reject it or do nothing :-) I wanted to make sure that the bug is fixed in all parts of the Python stdlib. |
Don't you find it strange to close bpo-7442 in mutual agreement and now |
Also Marc-Andre does not consider this a bug in bpo-31900. The |
Stefan Krah:
We agreed to close the bug in 2014. In the meantime, more and more people reported the same bug (multiple similar bug reports and increasingly frequent messages). So I decided to fix the bug instead of explaining to users that they must not do that :-)
Aha, maybe I misunderstood him when he wrote (msg309981): "Sounds like a good compromise :-)" -- I'm not sure what you are asking here. You are the maintainer of the decimal module. If you consider that it's not worth it, just close the issue. It's up to you ;-) |
Oh, I forgot to mention the context: I reported this issue as a follow-up to discussions on bpo-35638: "Introduce fixed point locale aware format type for floating point numbers". |
Extract of Stefan Krah's msg333296 of bpo-35638:
Oh, I never said that. I wrote "By the way, the decimal module doesn't properly support the following corner case: LC_NUMERIC using an encoding different from the LC_CTYPE encoding. I wrote #5191 but I abandoned my change." as a reminder for myself. I found the bug again when I wrote my article, and I realized that I abandoned my PR when I had my burnout, and not because the fix was "officially" rejected. So I opened this issue to get an official statement :-)
As I wrote in my PR, I'm not very happy with my proposed implementation. But let's discuss options to fix it :-) I don't understand "I don't want to be involved in additional issue reports in that area". If the bug is fixed, why do you expect more bug reports? You wrote that nobody has reported any issue related to formatting a decimal number using the locale since 2009. |
I mean issue reports like bpo-33954 or bpo-35195. These are just But if functions like _PyUnicode_InsertThousandsGrouping() Now I don't have to. I'd investigate bpo-35195 for example, perhaps Sometimes not being dependent on API functions is a virtue if |
That is a reasonable position to take. |
I'm no longer interested in rewriting my patch to avoid _Py_GetLocaleconvNumeric(), which comes from the internal API, so I'm closing my PR. |