classification
Title: Introduce fixed point locale aware format type for floating point numbers
Type: enhancement Stage: patch review
Components: Interpreter Core Versions: Python 3.8
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: eric.smith, mark.dickinson, serhiy.storchaka, skrah, steelman, vstinner
Priority: normal Keywords: patch

Created on 2019-01-02 11:25 by steelman, last changed 2019-01-09 13:14 by vstinner.

Pull Requests
URL Status Linked Edit
PR 11405 open python-dev, 2019-01-02 15:31
Messages (17)
msg332863 - (view) Author: Łukasz Stelmach (steelman) * Date: 2019-01-02 11:25
It is currently impossible to format floating point numbers with an arbitrary number of decimal digits AND the decimal point matching locale settings. For example, no current format type can display numbers ranging from 1 to 1000 with exactly two decimal digits and a locale-specific decimal point.
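
To illustrate the gap, here is a minimal sketch. It pins LC_NUMERIC to the "C" locale so the output is reproducible, which means no separators are inserted and only the difference between 'f' and 'n' semantics is visible. The sample values are arbitrary.

```python
import locale

# Pin the numeric locale so the results below are reproducible.
locale.setlocale(locale.LC_NUMERIC, "C")

values = [1.0, 12.5, 1000.0]

# 'f' gives a fixed number of decimals but never consults the locale.
fixed = [format(v, ".2f") for v in values]

# 'n' consults the locale but, like 'g', treats the precision as a count
# of significant digits, drops trailing zeros and may switch to exponent
# notation.
general = [format(v, ".3n") for v in values]

print(fixed)    # ['1.00', '12.50', '1000.00']
print(general)  # ['1', '12.5', '1e+03']
```

No single precision passed to 'n' yields two decimals for every value in that range.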
msg332876 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2019-01-02 13:32
Since this is a new feature, it can only be added to 3.8. Adjusting versions accordingly.

I suggest that if we add this at all, it only be added to __format__, not to %-formatting.

Any suggestions on a specification for this?
msg332877 - (view) Author: Łukasz Stelmach (steelman) * Date: 2019-01-02 14:14
I've got the patch. I will push it to GitHub as soon as I can (I am having some technical issues).
msg332878 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2019-01-02 14:15
Before a patch is created, we should discuss the behavior that will be implemented and agree on it. What is your suggestion?
msg332879 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2019-01-02 14:16
Of course, feel free to create a PR. But the correct place to discuss any new behavior is on the issue tracker, or maybe on python-ideas, not in a PR.
msg332880 - (view) Author: Łukasz Stelmach (steelman) * Date: 2019-01-02 15:33
I have created a new format type "m" that is to "n" what "f" is to "g". The patch for string.rst says

   +---------+----------------------------------------------------------+
   | ``'m'`` | Number. This is the same as ``'f'``, except that it uses |
   |         | the current locale setting to insert the appropriate     |
   |         | number separator characters.                             |
   +---------+----------------------------------------------------------+

My patch only applies to floats, not to integers.
msg332890 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2019-01-02 19:32
I haven't looked at this closely yet, but you'll need to at least:
- add tests that the locale-aware formatting is happening
- support decimal
- make sure it works with complex (which it probably does, but needs a test)

And, I think we'll need to run this through python-ideas first. One thing I expect to come up there: why f and not g?

Again, I haven't looked through the code yet, or really even given any thought to determining if this is a sound idea.
msg332928 - (view) Author: Łukasz Stelmach (steelman) * Date: 2019-01-03 13:59
> I haven't looked at this closely yet, but you'll need to at least:
> - add tests that the locale-aware formatting is happening

Done.

> - support decimal
> - make sure it works with complex

Good points. Done. Please note that there is an inconsistency between float/complex/int/_pydecimal(!) and decimal. The former provide only the 'n' format type, while the latter provides both 'n' and 'N'. So I implemented 'm' and 'M' for decimal and 'm' for _pydecimal.

> (which it probably does, but needs a test)

There are no tests for 'n'. Should I create tests for both 'm' and 'n'?

> And, I think we'll need to run this through python-ideas first. One thing I expect to come up there: why f and not g?

Because 'g' is already covered by 'n'.
msg332930 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2019-01-03 15:11
I think there's another open GitHub issue for this, and yes, probably
it should be discussed on python-ideas, too.

My main concern with 'm' for libmpdec is that I'd like to reserve it
for LC_MONETARY. There was one OS X issue that would have been solved
by adding LC_MONETARY support.

On the other hand perhaps '$' would also be possible for monetary.


So it appears that there might be some bikeshedding about the names
or whether the feature is needed at all.
msg332931 - (view) Author: Łukasz Stelmach (steelman) * Date: 2019-01-03 15:25
As much as I am open to any suggestions for naming and such (although I think 'm' together with 'n' are a good complement to 'f' and 'g'), I really would like to introduce a way to format numbers with a fixed number of decimal digits (it looks good in tables) and with separators taken from the locale.
msg332932 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2019-01-03 15:29
For reference, the (one of the?) other GitHub issue(s) is here:

https://github.com/python/cpython/pull/8612


It actually proposes to use LC_MONETARY.
msg333000 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2019-01-04 20:47
You can use locale.format_string() for locale aware formatting.
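
A minimal sketch of that approach, for reference. The helper name format_fixed and the locale name de_DE.UTF-8 are only illustrative assumptions; the locale may not be installed on a given system, so the loop skips it when setlocale() fails.

```python
import locale


def format_fixed(value, ndigits=2):
    """Format value with ndigits decimals using the current LC_NUMERIC."""
    return locale.format_string(f"%.{ndigits}f", value, grouping=True)


for name in ("C", "de_DE.UTF-8"):
    try:
        locale.setlocale(locale.LC_NUMERIC, name)
    except locale.Error:
        # The locale is not available on this system; skip it.
        print(name, "is not installed")
        continue
    print(name, format_fixed(1234.5))

# Restore the default numeric locale.
locale.setlocale(locale.LC_NUMERIC, "C")
```

In the "C" locale this prints 1234.50; in other locales the decimal point and grouping follow localeconv().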
msg333052 - (view) Author: Łukasz Stelmach (steelman) * Date: 2019-01-05 11:22
Indeed. Thank you. I was sure I had tried this. However, this is still only a workaround and not the solution I need. I am currently working on a project that uses pint (https://pint.readthedocs.io/en/latest/), which in turn relies on format() and its relatives.

Given that the "n" format type exists, Python is still missing a locale-aware counterpart of "f".
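
For code paths that only call format(), one possible stopgap is a float subclass with its own __format__. This is a rough sketch, not the proposed patch: the class name LocaleFloat and the 'm' suffix handling are hypothetical, and only a plain width/precision prefix is supported (the alignment, sign and fill options of format() are not translated).

```python
import locale


class LocaleFloat(float):
    """Hypothetical float subclass that understands a trailing 'm' type."""

    def __format__(self, spec):
        if spec.endswith("m"):
            # Reuse the width/precision part of the spec as a %-style field
            # and let the locale module insert the separators.
            return locale.format_string(
                "%" + spec[:-1] + "f", float(self), grouping=True
            )
        # Every other format type behaves exactly like a plain float.
        return super().__format__(spec)


print(format(LocaleFloat(1234.5), ".2m"))
print("{:10.2m}|".format(LocaleFloat(3.0)))
```

The caller still has to wrap values in the subclass, which is why this remains a workaround rather than a fix.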
msg333270 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-01-09 00:51
Łukasz Stelmach:
> It is currently impossible to format floating point numbers with an arbitrary number of decimal digits AND the decimal point matching locale settings.

I would like to warn you that handling locales properly can be very tricky. I just wrote an article about that:
https://vstinner.github.io/locale-bugfixes-python3.html


Stefan Krah:
> My main concern with 'm' for libmpdec is that I'd like to reserve it for LC_MONETARY.

Since it seems like we are still at the "idea" stage, would it make sense to add a function which accepts options to choose how to format a number?

* decimal point
* thousands separator
* grouping
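
A rough sketch of what such a function could look like, written in pure Python. Everything here is hypothetical (the name format_number, its parameters and their defaults); it only handles finite numbers and a single fixed group size.

```python
def format_number(value, ndigits=2, decimal_point=".", thousands_sep="",
                  group_size=3):
    """Hypothetical helper: fixed-point formatting with explicit separators."""
    # Let the built-in 'f' type do the rounding, then rearrange the digits.
    text = format(abs(value), f".{ndigits}f")
    integer, _, fraction = text.partition(".")

    if thousands_sep and group_size > 0:
        groups = []
        # Peel off groups of digits from the right-hand side.
        while len(integer) > group_size:
            groups.insert(0, integer[-group_size:])
            integer = integer[:-group_size]
        groups.insert(0, integer)
        integer = thousands_sep.join(groups)

    sign = "-" if value < 0 else ""
    return sign + integer + (decimal_point + fraction if fraction else "")


print(format_number(1234567.891, 2, decimal_point=",", thousands_sep="."))
# 1.234.567,89
```

The caller chooses the separators explicitly, so the result does not depend on the process-wide locale.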

Because there are more and more format variants. See for example Python/formatter_unicode.c. It has 5 "locale types":

* LT_NO_LOCALE
* LT_DEFAULT_LOCALE
* LT_UNDERSCORE_LOCALE
* LT_UNDER_FOUR_LOCALE
* LT_CURRENT_LOCALE

and it uses this structure:

/* Locale info needed for formatting integers and the part of floats
   before and including the decimal. Note that locales only support
   8-bit chars, not unicode. */
typedef struct {
    PyObject *decimal_point;
    PyObject *thousands_sep;
    const char *grouping;
    char *grouping_buffer;
} LocaleInfo;

There is the locale but also "underscore" separator for thousands: see PEP 515.
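
For comparison, these are the separator options that format() already accepts today. My reading is that they correspond to the locale types listed above (no option, ",", "_", "_" with a binary/octal/hex type, and "n"), but treat that mapping as an assumption rather than a statement about the C code.

```python
n = 1234567.891

plain = format(n, ".2f")           # no separator
comma = format(n, ",.2f")          # comma every three digits
underscore = format(n, "_.2f")     # PEP 515 underscore every three digits
hex_four = format(0x12345678, "_x")  # underscore every four hex digits

print(plain)       # 1234567.89
print(comma)       # 1,234,567.89
print(underscore)  # 1_234_567.89
print(hex_four)    # 1234_5678
```

None of these consult the current locale; only the "n" type does.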

I'm not talking about adding something to format(), but about adding a method to float, maybe, or a function somewhere else.

--

By the way, the decimal module doesn't properly support the following corner case: LC_NUMERIC using an encoding different from the LC_CTYPE encoding. I wrote https://github.com/python/cpython/pull/5191 but I abandoned my change.
msg333296 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2019-01-09 10:09
> Since it seems like we are still at the "idea" stage, would it make sense to add a function which accepts options to choose how to format a number?

Maybe, but I think for format() Eric's latest proposal on python-ideas is great ("*f" for "f + LC_NUMERIC", "$f" for "f + LC_MONETARY").

For me that's sufficient. Does locale.format_string() handle the other cases?


> By the way, the decimal module doesn't properly support the following corner case: LC_NUMERIC using an encoding different from the LC_CTYPE encoding. I wrote https://github.com/python/cpython/pull/5191 but I abandoned my change.

Well, I *discovered and opened* #7442 several years ago, and you said:

"I see that various people contributed to the issue, but it looks like the only user asking for the request is Stefan Krah. I prefer to close the issue and wait until more users ask for it before considering again the patch, or find a different way to implement the feature (support LC_NUMERIC and LC_CTYPE locales using a different encoding)."


So why would you think that I'm not aware of that issue? It has low priority for me and I hesitate to depend on the official locale functions in decimal because I don't want to be involved in additional issue reports in that area.
msg333297 - (view) Author: Łukasz Stelmach (steelman) * Date: 2019-01-09 10:49
I'd appreciate it if we continued the discussion on python-ideas, where I posted the idea[1]. There have already been several valuable comments.

[1] https://mail.python.org/pipermail/python-ideas/2019-January/054793.html
msg333315 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-01-09 13:14
> By the way, the decimal module doesn't properly support the following corner case: LC_NUMERIC using an encoding different from the LC_CTYPE encoding. I wrote https://github.com/python/cpython/pull/5191 but I abandoned my change.

FYI I opened bpo-35697 to discuss the decimal module case.
History
Date User Action Args
2019-01-09 13:14:43 vstinner set keywords: patch, patch, patch, patch; messages: + msg333315
2019-01-09 10:49:57 steelman set messages: + msg333297
2019-01-09 10:09:23 skrah set keywords: patch, patch, patch, patch; messages: + msg333296
2019-01-09 00:51:59 vstinner set keywords: patch, patch, patch, patch; nosy: + vstinner; messages: + msg333270
2019-01-05 11:22:57 steelman set messages: + msg333052
2019-01-04 20:47:36 serhiy.storchaka set keywords: patch, patch, patch, patch; nosy: + serhiy.storchaka; messages: + msg333000
2019-01-03 15:29:24 skrah set keywords: patch, patch, patch, patch; messages: + msg332932
2019-01-03 15:25:40 steelman set messages: + msg332931
2019-01-03 15:11:18 skrah set keywords: patch, patch, patch, patch; nosy: + skrah; messages: + msg332930
2019-01-03 13:59:32 steelman set messages: + msg332928
2019-01-02 19:32:34 eric.smith set keywords: patch, patch, patch, patch; messages: + msg332890; title: Introduce fixed point locale awear format type for floating point numbers -> Introduce fixed point locale aware format type for floating point numbers
2019-01-02 17:42:51 mark.dickinson set keywords: patch, patch, patch, patch; nosy: + mark.dickinson
2019-01-02 15:33:09 steelman set messages: + msg332880
2019-01-02 15:31:14 python-dev set keywords: + patch; stage: patch review; pull_requests: + pull_request10795
2019-01-02 15:31:09 python-dev set keywords: + patch; stage: (no value); pull_requests: + pull_request10796
2019-01-02 15:31:05 python-dev set keywords: + patch; stage: (no value); pull_requests: + pull_request10794
2019-01-02 15:31:00 python-dev set keywords: + patch; stage: (no value); pull_requests: + pull_request10793
2019-01-02 14:16:58 eric.smith set messages: + msg332879
2019-01-02 14:15:58 eric.smith set messages: + msg332878
2019-01-02 14:14:03 steelman set messages: + msg332877
2019-01-02 13:32:30 eric.smith set versions: - Python 2.7, Python 3.4, Python 3.5, Python 3.6, Python 3.7; nosy: + eric.smith; messages: + msg332876; components: + Interpreter Core, - Library (Lib)
2019-01-02 11:25:01 steelman create