Issue43624
Created on 2021-03-25 17:19 by Terry Davis, last changed 2022-04-11 14:59 by admin.
Messages (17) | |||
---|---|---|---|
msg389508 | Author: Terry Davis (Terry Davis) | Date: 2021-03-25 17:19 | |
Proposal: Enable this
>>> format(12_34_56.12_34_56, '_._f')
'123_456.123_456'
Where now only this is possible
>>> format(12_34_56.12_34_56, '_.f')
'123_456.123456'
Based on the discussion in the Ideas forum, three core devs support this addition.
https://discuss.python.org/t/add-underscore-as-a-thousandths-separator-for-string-formatting/7407
I'm willing to give this a try if someone points me to where to add tests and where the float formatting code is. This would be my first CPython contribution.
The feature freeze for 3.10 is 2021-05-03.
https://www.python.org/dev/peps/pep-0619/#id5 |
|||
msg389512 | Author: Raymond Hettinger (rhettinger) * | Date: 2021-03-25 17:56 | |
IIRC there is an ISO standard recommending that after the decimal point, digits be arranged in groups of five. I think that is also how printed reference tables are typically formatted. |
|||
msg389517 | Author: Raymond Hettinger (rhettinger) * | Date: 2021-03-25 18:45 | |
Some brief research
===================
"""in numbers four or more digits long, use commas to set off groups of three digits, counting leftward from the decimal point, in the standard American style. For long decimal numbers, do not use any digit-group separators to the right of the decimal point."""
-- Google Style Guide https://developers.google.com/style/numbers

The CRC math handbook uses groups of five after the decimal point. See §1.2.4 in http://dl.icdst.org/pdfs/files/2a2cbcfc89598fd83c315ce45c1ee663.pdf

NIST Guide for using SI units:
"""The digits of numerical values having more than four digits on either side of the decimal marker are separated into groups of three using a thin, fixed space counting from both the left and right of the decimal marker. For example, 15 739.012 53 is highly preferred to 15739.01253. Commas are not used to separate digits into groups of three. (See Sec. 10.5.3.)"""
-- page vi in https://physics.nist.gov/cuu/pdf/sp811.pdf#10.5.2

StackExchange question on the topic: https://math.stackexchange.com/questions/182775/convention-of-digit-grouping-after-decimal-point

The important reference, ISO 80000-1, discusses this in section 7, "Printing rules", but the standard is not publicly available. |
|||
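The two conventions cited in msg389517 (NIST: groups of three on both sides of the decimal marker; CRC: groups of five after it) can be illustrated with a small string-based sketch. The helper name `group_digits` and its parameters are hypothetical, not part of any proposal in this issue. It uses an ordinary space where NIST asks for a thin space, and it does not apply NIST's "more than four digits" threshold.

```python
def group_digits(numeral, sep=" ", size=3):
    """Group the digits of a plain decimal numeral on both sides of the point.

    Hypothetical helper: `numeral` is a string such as "15739.01253";
    exponents and non-finite values are not handled.
    """
    sign = ""
    if numeral.startswith(("+", "-")):
        sign, numeral = numeral[0], numeral[1:]
    whole, dot, frac = numeral.partition(".")
    # Integer part: groups are counted leftward from the decimal marker,
    # so any short group ends up at the far left.
    head = len(whole) % size
    left = [whole[:head]] if head else []
    left += [whole[i:i + size] for i in range(head, len(whole), size)]
    # Fractional part: groups are counted rightward from the decimal marker,
    # so any short group ends up at the far right.
    right = [frac[i:i + size] for i in range(0, len(frac), size)]
    return sign + sep.join(left) + dot + sep.join(right)


print(group_digits("15739.01253"))           # 15 739.012 53 (the NIST example)
print(group_digits("0.1428571428", size=5))  # 0.14285 71428 (groups of five)
```

Passing `sep="\u2009"` would substitute a thin space.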
msg389529 | Author: Dominic Davis-Foster (domdfcoding) * | Date: 2021-03-25 21:04 | |
ISO 80000-1:2009 recommends groups of three digits either side of the decimal sign. |
|||
msg389534 | Author: Eric V. Smith (eric.smith) * | Date: 2021-03-26 01:40 | |
If we do anything for float, we should do the same for decimal.Decimal. |
|||
msg389546 | Author: STINNER Victor (vstinner) * | Date: 2021-03-26 11:58 | |
> If we do anything for float, we should do the same for decimal.Decimal.
and complex ;-) |
|||
msg389547 | Author: STINNER Victor (vstinner) * | Date: 2021-03-26 12:09 | |
How backward incompatible and annoying would it be to modify the behavior of the existing "_f" format? Do you see use cases which only want to group digits in the integer part but not the fractional part?

According to the https://discuss.python.org/t/add-underscore-as-a-thousandths-separator-for-string-formatting/7407 discussion, grouping digits was first designed for integers, and the fractional part of floats was simply ignored/forgotten. I mean, it doesn't sound like a deliberate choice to not group digits in the fractional part.

The advantage of changing the "_f" format is to keep backward compatibility: Python 3.9 and older would not group digits in the fractional part, but at least they don't fail with an error. If you write code with the "_._f" format, you need a fallback code path for Python 3.9 and older:

    if sys.version_info >= (3, 10):
        text = f"my {...} very {...} long {...} and {...} complex {...} format string: x={x:_._f}"
    else:
        text = f"my {...} very {...} long {...} and {...} complex {...} format string: x={x:_f}"

Or:

    text = f"my {...} very {...} long {...} and {...} complex {...} format string:" + (f"x={x:_._f}" if sys.version_info >= (3, 10) else f"x={x:_f}")

Or many other variants.

The main drawback is the risk of breaking tests relying on the exact output.

About the separator character and the number of digits per group, IMO there is no standard working in all countries and all languages. But since we have a strict rule of 3 digits with "_" separator, I am fine with doing the same for the fractional part. It's an "arbitrary" choice, but at least, it's consistent. People wanting a different format per locale/language should write their own function. Once enough people agree on such an API, we can consider adding it to the stdlib. But for now, IMO 3 digits with "_" is good enough.

By the way, I agree that it's hard to read numbers with many digits in the decimal part ;-)
>>> f"{1/7:_.30f}"
'0.142857142857142849212692681249'
>>> f"{10**10+1/7:_.10f}"
'10_000_000_000.1428565979' |
|||
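The fallback in msg389547 keys on `sys.version_info`, which only works if the feature lands in the release that was guessed (the History below shows the target version moving from 3.10 to 3.11). As a hedged alternative, the sketch below probes for the feature instead: an interpreter that does not understand the spec raises `ValueError` from `format()`. The names `supports_fraction_grouping` and `FLOAT_SPEC` are hypothetical.

```python
def supports_fraction_grouping():
    """Return True if this interpreter accepts the proposed '_._f' spec."""
    try:
        format(1.0, "_._f")
    except ValueError:
        # Interpreters without the feature reject the spec as invalid.
        return False
    return True


# Chosen once at import time; every call site then shares the same spec.
FLOAT_SPEC = "_._f" if supports_fraction_grouping() else "_f"

x = 1234.5
# A nested replacement field lets the f-string take its spec from a variable.
print(f"x={x:{FLOAT_SPEC}}")
```

This keeps a single code path per call site, at the cost of output that differs between interpreters, which is the same test-stability concern raised in the message.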
msg389574 | Author: Terry Davis (Terry Davis) | Date: 2021-03-26 23:52 | |
Good point Victor, though I wonder how likely it is that a person using 3.10 would only use this particular new feature, and have an otherwise backwards-compatible codebase. This isn't something that I asked about out of necessity, and there hasn't been any other discussion of this idea that anyone can remember. On the other hand, I suppose it would be possible to have a feature flag that can be used to disable decimal underscores in 3.10 to prevent test failures. Just spitballing... |
|||
msg389687 | Author: STINNER Victor (vstinner) * | Date: 2021-03-29 12:06 | |
> On the other hand, I suppose it would be possible to have a feature flag that can be used to disable decimal underscores in 3.10 to prevent test failures. Just spitballing...
I wrote PEP 606 -- Python Compatibility Version https://www.python.org/dev/peps/pep-0606/ and it was rejected. |
|||
msg389708 | Author: Raymond Hettinger (rhettinger) * | Date: 2021-03-29 15:34 | |
I prefer Terry's original proposal which is backwards compatible and gives the user control over whether the separator is to be applied to the fractional component.
>>> format(12_34_56.12_34_56, '_._f') # Whole and fractional
'123_456.123_456'
>>> format(12_34_56.12_34_56, '_.f') # Whole component only
'123_456.123456' |
|||
msg389709 | Author: Eric V. Smith (eric.smith) * | Date: 2021-03-29 15:37 | |
I agree with Raymond. We can't make a change that would modify existing program output. Which is unfortunate, but such is life. And I'd prefer to see groupings of 5 on the right, but I realize I might be in the minority. |
|||
msg389735 | Author: STINNER Victor (vstinner) * | Date: 2021-03-29 20:16 | |
'_.f' would be the same as '_f'? Should "._f" be allowed to only add underscores in the fractional part? (for consistency?) |
|||
msg389736 | Author: STINNER Victor (vstinner) * | Date: 2021-03-29 20:18 | |
Raymond:
> I prefer Terry's original proposal which is backwards compatible (...)
Well ok, that's what I expected. Backward compatibility usually wins all other arguments in Python :-) But I had to ask the question :-) |
|||
msg389744 | Author: Terry Davis (Terry Davis) | Date: 2021-03-29 20:39 | |
Victor,
> '_.f' would be the same as '_f'?
No, the example in my original post is wrong, '_.f' isn't allowed now. The proposal should use '_f' to describe the current behavior.
> Should "._f" be allowed to only add underscores in the fractional part? (for consistency?)
Yes, but not for consistency with the above usage, instead it's so both fractional and integral underscores can be specified on their own.

Here is my attempt at updating the format spec. The only problem I have with it is that it allows a naked '.'; I don't know how to specify "dot must be followed by one or both of 'float_grouping' and 'precision'".

Current:
format_spec ::= [[fill]align][sign][#][0][width][grouping_option][.precision][type]

Proposed:
format_spec ::= [[fill]align][sign][#][0][width][grouping_option][.[float_grouping][precision]][type]
fill ::= <any character>
align ::= "<" | ">" | "=" | "^"
sign ::= "+" | "-" | " "
width ::= digit+
grouping_option ::= "_" | ","
float_grouping ::= "_"
precision ::= digit+
type ::= "b" | "c" | "d" | "e" | "E" | "f" | "F" | "g" | "G" | "n" | "o" | "s" | "x" | "X" | "%" |
|||
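On the open question in msg389744 about ruling out a naked '.': one way to write the rule is `["." (float_grouping [precision] | precision)]`, so the dot is always followed by at least one of the two parts. The sketch below expresses the proposed grammar as a regular expression, using a lookahead for that constraint. It is only a syntax check under that assumption, not CPython's parser; `PROPOSED_SPEC` and `parse_proposed_spec` are hypothetical names, and semantic rules (for example, which types may take a precision) are not enforced.

```python
import re

# Regex rendering of the proposed format_spec grammar (syntax only).
PROPOSED_SPEC = re.compile(
    r"""
    (?:(?P<fill>.)?(?P<align>[<>=^]))?
    (?P<sign>[+\- ])?
    (?P<alt>\#)?
    (?P<zero>0)?
    (?P<width>\d+)?
    (?P<grouping_option>[_,])?
    (?:\.(?=[_\d])              # a dot must be followed by '_' or a digit
        (?P<float_grouping>_)?
        (?P<precision>\d+)?
    )?
    (?P<type>[bcdeEfFgGnosxX%])?
    """,
    re.VERBOSE | re.DOTALL,
)


def parse_proposed_spec(spec):
    """Split a spec into its named parts, or raise ValueError if it is malformed."""
    match = PROPOSED_SPEC.fullmatch(spec)
    if match is None:
        raise ValueError(f"invalid format spec: {spec!r}")
    return match.groupdict()


print(parse_proposed_spec("_._4f"))
```

With this pattern, '_._f', '._f' and '._4f' parse, while '.f' and '_.f' are rejected, matching the intent stated in the message.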
msg389754 | Author: STINNER Victor (vstinner) * | Date: 2021-03-29 21:12 | |
I'm now confused. Would you mind giving examples of all proposed formats and the expected output? |
|||
msg389762 | Author: Terry Davis (Terry Davis) | Date: 2021-03-29 22:24 | |
Current behavior:
>>> format(1234.1234, '_f')
'1_234.123400'
>>> format(1234.1234, ',f')
'1,234.123400'

New behavior:
>>> format(1234.1234, ',._f')
'1,234.123_400'
>>> format(1234.1234, '_._f')
'1_234.123_400'
>>> format(1234.1234, '._f')
'1234.123_400'
>>> format(1234.1234, '._4f')
'1234.123_4'
>>> format(1234.1234, '.f') # still not allowed
>>> format(1234.1234, '_.f') # still not allowed |
|||
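The expected outputs listed in msg389762 can be reproduced on current interpreters with a short emulation. This is a sketch of the proposed behavior for the 'f' type only, not the eventual C implementation. The function name `format_grouped` and its parameters are hypothetical and simply mirror the grammar names used earlier in the thread.

```python
def format_grouped(value, grouping_option="", float_grouping="", precision=6):
    """Emulate the proposed '[grouping_option].[float_grouping][precision]f' spec."""
    # Reuse the existing float formatting for rounding and integer grouping.
    text = format(value, f"{grouping_option}.{precision}f")
    whole, dot, frac = text.partition(".")
    if float_grouping:
        # Group the fractional digits in threes, counting rightward from the dot.
        frac = float_grouping.join(frac[i:i + 3] for i in range(0, len(frac), 3))
    return whole + dot + frac


print(format_grouped(1234.1234, ",", "_"))              # like ',._f'
print(format_grouped(1234.1234, "_", "_"))              # like '_._f'
print(format_grouped(1234.1234, "", "_"))               # like '._f'
print(format_grouped(1234.1234, "", "_", precision=4))  # like '._4f'
```

Because rounding is delegated to the existing 'f' formatting, the digits are identical to today's output; only the separators are added.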
msg392911 | Author: Terry Davis (Terry Davis) | Date: 2021-05-04 15:37 | |
If no one else has any comments, I'll assume there is consensus and start working on this. I have not contributed to CPython before, nor have I worked on production C code, so it may be a while before I get anywhere. |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:59:43 | admin | set | github: 87790 |
2021-11-06 09:49:54 | serhiy.storchaka | unlink | issue45708 superseder |
2021-11-05 13:21:33 | serhiy.storchaka | link | issue45708 superseder |
2021-05-04 15:37:03 | Terry Davis | set | messages: + msg392911 versions: + Python 3.11, - Python 3.10 |
2021-03-29 22:24:52 | Terry Davis | set | messages: + msg389762 |
2021-03-29 21:12:30 | vstinner | set | messages: + msg389754 |
2021-03-29 20:39:49 | Terry Davis | set | messages: + msg389744 |
2021-03-29 20:18:27 | vstinner | set | messages: + msg389736 |
2021-03-29 20:16:46 | vstinner | set | messages: + msg389735 |
2021-03-29 15:37:48 | eric.smith | set | messages: + msg389709 |
2021-03-29 15:34:44 | rhettinger | set | messages: + msg389708 |
2021-03-29 12:06:58 | vstinner | set | messages: + msg389687 |
2021-03-26 23:52:26 | Terry Davis | set | messages: + msg389574 |
2021-03-26 12:09:26 | vstinner | set | messages: + msg389547 |
2021-03-26 11:58:04 | vstinner | set | messages: + msg389546 |
2021-03-26 01:40:45 | eric.smith | set | messages: + msg389534 |
2021-03-26 01:38:46 | eric.smith | set | nosy: + eric.smith |
2021-03-25 21:04:42 | domdfcoding | set | nosy: + domdfcoding messages: + msg389529 |
2021-03-25 19:22:50 | vstinner | set | nosy: + vstinner |
2021-03-25 18:45:13 | rhettinger | set | messages: + msg389517 |
2021-03-25 17:56:02 | rhettinger | set | nosy: + rhettinger, mark.dickinson, serhiy.storchaka messages: + msg389512 |
2021-03-25 17:19:07 | Terry Davis | create |