Issue45995
Created on 2021-12-06 12:03 by John Belmonte, last changed 2022-04-11 14:59 by admin.
Pull Requests | |||
---|---|---|---|
URL | Status | Linked | Edit |
PR 30049 | open | jbelmonte, 2021-12-11 13:24 |
Messages (23) | |||
---|---|---|---|
msg407792 - (view) | Author: John Belmonte (John Belmonte) | Date: 2021-12-06 12:03 | |
proposal: add a string formatting option to normalize negative 0 values to 0

use case: rounded display of a float that is nominally 0, where the distraction of a flashing minus sign from minute changes around 0 is unwanted

example:
    >>> '%~5.1f' % -.00001
    '  0.0'

format spec before:
    format_spec ::= [[fill]align][sign][#][0][width][grouping_option][.precision][type]
after:
    format_spec ::= [[fill]align][sign][~][#][0][width][grouping_option][.precision][type]
where '~' is only allowed for number types

implementation: if '~' is present in the spec, add 0 to the value after applying precision |
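To make the use case concrete, here is a minimal sketch of the "flashing minus sign" with today's formatting. The `readings` list is made-up sample data, not something from the report.

```python
# Made-up sample data: values that are nominally zero but jitter around it.
readings = [0.00001, -0.00001, 0.00002, -0.00003]

# Rounding to one decimal place keeps the sign of the original value,
# so the display alternates between '  0.0' and ' -0.0'.
shown = ['%5.1f' % r for r in readings]
print(shown)
```

Every value rounds to zero at this precision, yet the sign of the tiny input still leaks into the output.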
|||
msg407793 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * | Date: 2021-12-06 12:15 | |
To normalize negative 0.0 to 0.0 you can just add 0.0. It will work with any method of converting floats to string, there is no need to change all formatting specifications.
    >>> x = -0.0
    >>> x
    -0.0
    >>> x + 0.0
    0.0 |
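A small sketch of why adding 0.0 works. Since `-0.0 == 0.0` is true, the sign bit has to be inspected explicitly, here with `math.copysign`.

```python
import math

x = -0.0
# Negative zero compares equal to zero; only the sign bit differs.
assert x == 0.0
assert math.copysign(1.0, x) == -1.0

# Under the default round-to-nearest mode, (-0.0) + (+0.0) is +0.0.
y = x + 0.0
assert math.copysign(1.0, y) == 1.0
print(repr(x), repr(y))
```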
|||
msg407796 - (view) | Author: John Belmonte (John Belmonte) | Date: 2021-12-06 12:30 | |
> To normalize negative 0.0 to 0.0 you can just add 0.0.

yes, I'm aware-- see "implementation" in the original post

> there is no need to change all formatting specifications

For adding 0 to work, it must be done after the rounding. That means if you want to make use of anything in the current formatting spec regarding precision, normalizing negative zero would need to be a proper option of the formatting spec. |
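The ordering point can be shown in a few lines. This is only an illustration of the argument, using the value from the original post.

```python
value = -0.00001

# Adding 0.0 before formatting changes nothing: the value is not yet zero,
# and the formatter's own rounding then produces a negative zero.
before = format(value + 0.0, '.1f')

# Rounding first yields -0.0, which adding 0.0 then normalizes.
after = format(round(value, 1) + 0.0, '.1f')

print(repr(before), repr(after))
```

Note that the second form repeats the precision (1) outside the format spec, which is the coupling the thread goes on to discuss.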
|||
msg407810 - (view) | Author: Steven D'Aprano (steven.daprano) * | Date: 2021-12-06 14:18 | |
It was decided long ago that % formatting would not be enhanced with new features. I think that it is supposed to match the standard C formatting codes, and nothing else. So this is unlikely to be approved for % formatting.

It *might* be approved for the format method and f-strings. (I suspect that unless you get immediate and uncontroversial agreement from multiple core developers, you may need to take it for further discussion and perhaps even a PEP.)

What you call a "distraction" I consider to be a critical part of the display. The numbers you are displaying actually are negative, and rounding them for display does not change that. So they ought to show the minus sign. So I'm not really very sympathetic to this feature request. |
|||
msg407822 - (view) | Author: John Belmonte (John Belmonte) | Date: 2021-12-06 15:10 | |
Here is the same proposal made for C++ `std::format`: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p1496r2.pdf It makes fair arguments for the feature's use, and explains why the problem is hard to work around. It was withdrawn by the author for C++20, but a consensus proposal is promised for C++23. |
|||
msg407857 - (view) | Author: Mark Dickinson (mark.dickinson) * | Date: 2021-12-06 19:03 | |
I'd support having this functionality available for `format` and for f-strings. (As Steven says, changing %-formatting doesn't seem viable.) It really _is_ awkward to do this in any other way, and I'm reliably informed that normal people don't expect to see negative zeros in formatted numeric output.

It did take me a few minutes to get my head around the idea that `f"{-0.01:+.1f}"` would return `"+0.0"` rather than `"-0.0"` or `" 0.0"` or just plain `"0.0"` under this proposal, but I agree that it seems like the only thing that can be consistent and make sense.

I'm not 100% convinced by the particular spelling proposed, but I don't have anything better to suggest. If C++ might be going with a "z", would it make sense to do the same for Python?

I don't foresee any implementation difficulties for float and complex types. For Decimal, we'd need to "own" the string formatting, taking that responsibility away from mpdecimal, but there are already other reasons to do that. Once we've done that, again the implementation doesn't seem onerous. |
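One way to see why `"+0.0"` is the consistent answer: if the rounded value is treated as an ordinary positive zero, the existing sign options already decide what is printed. The sketch below emulates that by normalizing by hand; it does not use any new format option.

```python
value = -0.01
normalized = round(value, 1) + 0.0  # 0.0, with the sign cleared

rows = []
for sign in ('-', '+', ' '):
    spec = sign + '.1f'
    # Second column: today's output. Third column: output once the
    # rounded value is a plain positive zero.
    rows.append((spec, format(value, spec), format(normalized, spec)))

for spec, as_is, after_normalizing in rows:
    print(repr(spec), repr(as_is), repr(after_normalizing))
```

With the `+` option, a positive zero is shown as `+0.0`, which matches the behaviour described above.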
|||
msg407905 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * | Date: 2021-12-07 09:54 | |
Well, it makes sense for negative zero produced by rounding. But if we add a special support for this case, it would be useful to have some control on the type of rounding. Currently floats are rounded to the nearest decimal number, but in some cases it would be better to round up, down, toward zero or infinity (see for example issue44884).

You can round explicitly before formatting, but this solution is also applicable for this issue:
    >>> '%5.1f' % (round(-.00001, 1) + 0.0)
    '  0.0' |
|||
msg407928 - (view) | Author: John Belmonte (John Belmonte) | Date: 2021-12-07 12:22 | |
> changing %-formatting doesn't seem viable

I'm concerned about treating %-formatting specially. As far as float/complex, the logical and efficient place to put this change seems to be PyOS_double_to_string(), which affects all three formatting options. For example, the dtoa case is as simple as this change to format_float_short():

    /* coerce negative zero to positive */
    if (sign == 1 && ((digits_len == 0 && decpt == -1) ||
                      (digits_len == 1 && digits[0] == '0'))) {
        sign = 0;
    } |
|||
msg407929 - (view) | Author: Steven D'Aprano (steven.daprano) * | Date: 2021-12-07 12:53 | |
Sorry John, I don't understand your comment about "treating %-formatting specifically". Isn't the point here not to change %-formatting at all? |
|||
msg407930 - (view) | Author: Eric V. Smith (eric.smith) * | Date: 2021-12-07 12:55 | |
%-formatting already doesn't support some formats that float.__format__ does, for example ','. So I agree we shouldn't modify %-formatting. I don't have much of an opinion on whether changing __format__ is a good idea or not. |
|||
msg407934 - (view) | Author: John Belmonte (John Belmonte) | Date: 2021-12-07 13:14 | |
I see now. PyOS_double_to_string() could gain the extra flag to coerce negative zero but, out of the three formatting methods, only format() and f-string would use the flag. |
|||
msg407973 - (view) | Author: Eric V. Smith (eric.smith) * | Date: 2021-12-07 21:48 | |
PyOS_double_to_string is part of the stable ABI. I don't recall if we're allowed to add new bitfield flags to a stable ABI function. We'd use a new Py_DTSF_NORMALIZE_NEGATIVE_0 flag for this feature.

I suspect we can't add a flag, due to compatibility reasons (new code setting the flag being used in old versions of python without it), but we'd need to research. I saw a similar discussion within the last few years, but of course now I can't find it. Maybe old versions would correctly ignore the new bit being set.

This proposal becomes less interesting if we'd need to add a new function to support it. Although I guess we could do something that's internal-only. |
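The compatibility question can be pictured with a toy model in Python. Everything here is hypothetical: the flag names and values and both formatter functions are invented for illustration and are not taken from the CPython headers or from PyOS_double_to_string().

```python
# Hypothetical flag values, for illustration only.
FLAG_SIGN = 0x01       # always show a sign
FLAG_NO_NEG_0 = 0x08   # the newly added bit

def old_formatter(value, flags):
    """Models an older implementation that predates FLAG_NO_NEG_0."""
    text = '%.1f' % value
    if flags & FLAG_SIGN and not text.startswith('-'):
        text = '+' + text
    return text  # bits it does not know about are silently ignored

def new_formatter(value, flags):
    """Models a newer implementation that understands FLAG_NO_NEG_0."""
    text = '%.1f' % value
    # A negative zero is a '-' followed by nothing but zeros and a dot.
    if flags & FLAG_NO_NEG_0 and text.startswith('-') and not text.strip('-0.'):
        text = text[1:]
    if flags & FLAG_SIGN and not text.startswith('-'):
        text = '+' + text
    return text

print(old_formatter(-0.01, FLAG_NO_NEG_0))  # old code: flag has no effect
print(new_formatter(-0.01, FLAG_NO_NEG_0))  # new code: sign is dropped
```

In this model, new callers running against the old implementation get the old output rather than an error, which is the "correctly ignore the new bit" outcome. Whether the real function behaves this way is exactly what would need to be checked.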
|||
msg407990 - (view) | Author: John Belmonte (John Belmonte) | Date: 2021-12-08 01:51 | |
I'll share a draft PR soon (excluding Decimal), so far it's still looking straightforward.

> Maybe old versions would correctly ignore the new bit being set.

That's one of the benefits of using bit flags in an ABI: backward-compatible extensibility. (An implementation can defeat it by intentionally failing if unknown bits are encountered, but the code in question doesn't appear to be doing this.)

> You can round explicitly before formatting
>
> >>> '%5.1f' % (round(-.00001, 1) + 0.0)

Yes, I have experience with it.

1. even as a one-off, it's questionable. If someone accidentally changes the precision in only one of the spec string or round call, that's a bug.
2. since applications and libraries may pass around format specs, and because of (1), you'll try to make a programmatic solution. Now you're parsing format spec strings.
3. while a programmatic solution can be done for a function API like format() that takes a separate spec and value, there is no sane way to wrap f-strings |
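Points (1) and (2) can be sketched as follows. The helper name `format_no_negative_zero` and its regular expression are hypothetical, and the helper deliberately handles only specs ending in `.<digits>f`, which shows how quickly a workaround turns into format spec parsing.

```python
import re

# Hypothetical helper, not part of any library. It only understands
# specs that end in ".<digits>f"; anything else is passed through as-is.
_PRECISION_F = re.compile(r'\.(\d+)f$')

def format_no_negative_zero(value, spec):
    match = _PRECISION_F.search(spec)
    if match:
        # Round to the precision the spec will display, then add 0.0
        # so that a rounded result of -0.0 becomes 0.0.
        value = round(value, int(match.group(1))) + 0.0
    return format(value, spec)

print(repr(format_no_negative_zero(-0.00001, '5.1f')))
print(repr(format_no_negative_zero(-0.26, '.1f')))
```

Other presentation types ('e', 'g', '%'), default precision, and complex values are all left unhandled here, and nothing like this can be applied to an f-string, which is point (3).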
|||
msg408367 - (view) | Author: John Belmonte (John Belmonte) | Date: 2021-12-12 08:31 | |
> For Decimal, we'd need to "own" the string formatting, taking that responsibility away from mpdecimal, but there are already other reasons to do that. After some digging, I believe this is the background on forking pieces of mpdecimal (and why the existing source copy inside Python doesn't count as a fork): https://bugs.python.org/issue45708#msg405895 https://github.com/python/cpython/pull/29438 If I understand correctly, the PR for supporting underscore separators in Decimal formatting is only taking control of generating a mpd_spec_t from the spec string. Formatting itself is still done by mpd_qformat_spec(). So there's outstanding work to also pull the formatting code itself into _decimal.c. (And this is wanted anyway to reconcile existing libmpdec formatting modifications: https://github.com/python/cpython/commit/298131a44896a4fec1ea829814ad52409d59aba5) And this is all because vendors have the crazy practice of unbundling libmpdec from Python. (If a project is bundling the source of another, there may be some reason...?) |
|||
msg408413 - (view) | Author: John Belmonte (John Belmonte) | Date: 2021-12-12 23:52 | |
potential short-term solution for Decimal:

    if negative zero option is set and sign is negative:
        pre-round into a temp using mpd_qrescale()
        if mpd_iszero(temp):
            change sign to positive |
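The same idea can be sketched at the Python level with the `decimal` module. This is an analogy rather than the PR's C code: `Decimal.quantize()` stands in for `mpd_qrescale()`, the helper name is hypothetical, and only fixed-point output with an explicit precision is covered.

```python
from decimal import Decimal

def format_decimal_no_negative_zero(value, precision):
    """Hypothetical helper: fixed-point formatting without '-0.00'."""
    if value.is_finite() and value.is_signed():
        # Pre-round into a temporary at the display precision.
        temp = value.quantize(Decimal(1).scaleb(-precision))
        if temp.is_zero():
            # The rounded result is a negative zero: drop the sign.
            value = value.copy_abs()
    return format(value, '.%df' % precision)

print(format(Decimal('-0.001'), '.2f'))
print(format_decimal_no_negative_zero(Decimal('-0.001'), 2))
```

The sketch assumes the pre-rounding and the final formatting round the same way under the active context, and it ignores the case where `quantize()` raises for values too large for the context precision.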
|||
msg408529 - (view) | Author: John Belmonte (John Belmonte) | Date: 2021-12-14 13:25 | |
implemented float and Decimal-- PR is ready for review |
|||
msg408800 - (view) | Author: Mark Dickinson (mark.dickinson) * | Date: 2021-12-17 16:49 | |
Thanks, John. I should have time to review within the next week or so. |
|||
msg410838 - (view) | Author: John Belmonte (John Belmonte) | Date: 2022-01-18 00:45 | |
Mark, would you give it a review this month? (PR has been marked stale.) |
|||
msg411243 - (view) | Author: Mark Dickinson (mark.dickinson) * | Date: 2022-01-22 10:00 | |
[John]
> Mark, would you give it a review this month?

Apologies; my holiday-break free time was nobbled from unexpected quarters. I can't promise to find time this month, but I can promise to try.

I did at least skim through the PR, and while there are still likely some iterations needed I'm satisfied that this is technically feasible. But I'm afraid that's the easy part.

If this is to go in, the other problem we still have to solve is achieving some consensus among the core developers that this is worth doing. Right now, judging by comments on this issue, I think I'm the only core dev who thinks this is a good idea; others are lukewarm at best, and I'm not willing to unilaterally approve and merge these changes without something closer to a consensus.

There are a couple of ways forward here:

- Post the proposal on python-ideas to get wider visibility and feedback. If everyone agrees this is a great idea (from experience, this seems an unlikely outcome), then we can go ahead and merge. Otherwise we'd likely need a PEP to move forward.
- Bypass the python-ideas step, write the PEP, discuss in the appropriate forums, and then submit to the SC for approval / rejection.
- Convince Eric Smith. :-) With apologies to Eric for singling him out: Eric could reasonably be described as the steward/maintainer of the formatting machinery, so if he's persuaded, that's good enough for me.

The fact that you've already created a working implementation so that people can experiment is a bonus when it comes to trying to sell this to others.

I don't have the bandwidth to write a PEP, but I would be happy to act as PEP sponsor. |
|||
msg411295 - (view) | Author: Eric V. Smith (eric.smith) * | Date: 2022-01-22 22:17 | |
Wow, thanks, Mark!

I'm generally in favor. The selling points to me are that it needs to happen post-rounding, and the C++ discussion. It would be better if this were already accepted in C++.

I'll note that the paper is proposing a 'z' modifier to the sign, so I guess for us that would translate to: [sign[optional-z]] instead of just sign. I'd have to noodle through the differences between that and the proposed [sign][~]. I guess this would all be worked out in a PEP.

My only reservation is Mark's comment: """For Decimal, we'd need to "own" the string formatting, taking that responsibility away from mpdecimal, but there are already other reasons to do that.""" If Mark is okay with that (right back at you, Mark!), then I think a PEP is the next step. It doesn't need to be huge, sort of like PEP 378. |
|||
msg411302 - (view) | Author: John Belmonte (John Belmonte) | Date: 2022-01-22 23:18 | |
Thank you Mark and Eric.

> I'll note that the paper is proposing a 'z' modifier to the sign, so I guess for us that would translate to: [sign[optional-z]] instead of just sign. I'd have to noodle through the differences between that the proposed [sign][~].

The C++ paper proposes [sign][z] (i.e. you can have the `z` alone without an explicit +/-), and this is what I implemented in the Python PR. My original proposal with tilde was discarded.

> My only reservation is Mark's comment: """For Decimal, we'd need to "own" the string formatting, taking that responsibility away from mpdecimal, but there are already other reasons to do that."""

In the PR I was able to avoid taking that on by preprocessing the format string before handing it to mpdecimal. The code was already doing such things to handle the NULL fill character. |
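For readers trying the [sign][z] form, a short sketch follows. It assumes an interpreter in which the option has landed (Python 3.11 or later, to my understanding); on older versions the `z` is rejected as an invalid format specifier, so the example is guarded by a version check.

```python
import sys

value = -0.01
if sys.version_info >= (3, 11):
    # 'z' follows the optional sign and may also appear on its own.
    results = [format(value, spec) for spec in ('z.1f', '+z.1f', ' z.1f')]
    # Values that do not round to zero keep their sign.
    unaffected = format(-0.26, 'z.1f')
else:
    # Assumption: the option is unavailable before 3.11.
    results = unaffected = None

print(results, unaffected)
```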
|||
msg412360 - (view) | Author: John Belmonte (John Belmonte) | Date: 2022-02-02 13:54 | |
PEP at https://github.com/python/peps/pull/2295 |
|||
msg415034 - (view) | Author: Mark Dickinson (mark.dickinson) * | Date: 2022-03-13 12:04 | |
I forgot to update here: > PEP at https://github.com/python/peps/pull/2295 For the record, PEP 682 has been accepted. |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:59:53 | admin | set | github: 90153 |
2022-03-13 12:04:51 | mark.dickinson | set | messages: + msg415034 |
2022-02-02 13:54:50 | John Belmonte | set | messages: + msg412360 |
2022-01-22 23:18:50 | John Belmonte | set | messages: + msg411302 |
2022-01-22 22:17:16 | eric.smith | set | messages: + msg411295 |
2022-01-22 10:00:59 | mark.dickinson | set | messages: + msg411243 |
2022-01-18 00:45:26 | John Belmonte | set | messages: + msg410838 |
2021-12-17 16:49:58 | mark.dickinson | set | assignee: mark.dickinson |
2021-12-17 16:49:46 | mark.dickinson | set | messages: + msg408800 |
2021-12-14 13:25:50 | John Belmonte | set | messages: + msg408529 |
2021-12-12 23:52:33 | John Belmonte | set | messages: + msg408413 |
2021-12-12 08:31:27 | John Belmonte | set | messages: + msg408367 |
2021-12-11 13:24:12 | jbelmonte | set | keywords: + patch nosy: + jbelmonte pull_requests: + pull_request28274 stage: patch review |
2021-12-08 01:51:58 | John Belmonte | set | messages: + msg407990 |
2021-12-07 21:48:16 | eric.smith | set | messages: + msg407973 |
2021-12-07 13:14:12 | John Belmonte | set | messages: + msg407934 |
2021-12-07 12:55:35 | eric.smith | set | messages: + msg407930 |
2021-12-07 12:53:02 | steven.daprano | set | messages: + msg407929 |
2021-12-07 12:22:14 | John Belmonte | set | messages: + msg407928 |
2021-12-07 09:54:42 | serhiy.storchaka | set | messages: + msg407905 |
2021-12-06 19:03:35 | mark.dickinson | set | messages: + msg407857 |
2021-12-06 18:16:16 | mark.dickinson | set | nosy: + eric.smith |
2021-12-06 15:10:31 | John Belmonte | set | messages: + msg407822 |
2021-12-06 14:18:04 | steven.daprano | set | nosy: + steven.daprano messages: + msg407810 versions: + Python 3.11 |
2021-12-06 12:30:06 | John Belmonte | set | messages: + msg407796 |
2021-12-06 12:15:51 | serhiy.storchaka | set | nosy: + mark.dickinson, serhiy.storchaka messages: + msg407793 |
2021-12-06 12:03:46 | John Belmonte | create |