pprint numbers with underscore #87080

Closed
felipeochoa mannequin opened this issue Jan 12, 2021 · 17 comments
Labels
3.10 (only security fixes) · stdlib (Python modules in the Lib dir) · type-feature (a feature request or enhancement)

Comments

@felipeochoa
Mannequin

felipeochoa mannequin commented Jan 12, 2021

BPO 42914
Nosy @rhettinger, @gpshead, @mdickinson, @ericvsmith, @serhiy-storchaka, @felipeochoa, @sblondon, @miss-islington, @wkeithvan
PRs
  • bpo-42914: pprint.pprint function displays integer with underscores #24864
  • Add bpo-42914 to What's New #25124
  • [3.10] Add bpo-42914 to What's New (GH-25124) #26509
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = <Date 2021-03-24.08:28:50.926>
    created_at = <Date 2021-01-12.23:21:21.278>
    labels = ['type-feature', 'library', '3.10']
    title = 'pprint numbers with underscore'
    updated_at = <Date 2021-10-09.09:15:33.388>
    user = 'https://github.com/felipeochoa'

    bugs.python.org fields:

    activity = <Date 2021-10-09.09:15:33.388>
    actor = 'sblondon'
    assignee = 'none'
    closed = True
    closed_date = <Date 2021-03-24.08:28:50.926>
    closer = 'gregory.p.smith'
    components = ['Library (Lib)']
    creation = <Date 2021-01-12.23:21:21.278>
    creator = 'fov'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 42914
    keywords = ['patch']
    message_count = 17.0
    messages = ['384982', '384985', '384992', '385025', '385034', '387677', '388390', '388693', '389205', '389319', '389439', '389440', '394979', '394981', '403225', '403226', '403525']
    nosy_count = 9.0
    nosy_names = ['rhettinger', 'gregory.p.smith', 'mark.dickinson', 'eric.smith', 'serhiy.storchaka', 'fov', 'sblondon', 'miss-islington', 'wkeithvan']
    pr_nums = ['24864', '25124', '26509']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'commit review'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue42914'
    versions = ['Python 3.10']

    @felipeochoa
    Mannequin Author

    felipeochoa mannequin commented Jan 12, 2021

    It would be nice if pprint learned to insert underscores in long numbers

    Current behavior:

    >>> pprint.pprint(int(1e9))
    1000000000

    Desired behavior:

    >>> pprint.pprint(int(1e9))
    1_000_000_000

    Wikipedia tells me that "groups of 3" is the international standard to be followed here [1][2]

    [1] https://en.wikipedia.org/wiki/ISO_31-0#Numbers
    [2] https://en.wikipedia.org/wiki/Decimal_separator#Digit_grouping

    @felipeochoa felipeochoa mannequin added 3.10 only security fixes stdlib Python modules in the Lib dir type-feature A feature request or enhancement labels Jan 12, 2021
    @felipeochoa
    Mannequin Author

    felipeochoa mannequin commented Jan 12, 2021

    Here is an implementation of the safe repr for numbers if helpful:

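    def safe_repr_int(object):
        """Return repr(object) with '_' between groups of three digits."""
        # Handle the sign separately so it never counts as a digit.
        sign = ''
        if object < 0:
            sign = '-'
            object = -object
        r = repr(object)
        # Leave numbers of up to four digits ungrouped.
        if len(r) <= 4:
            return sign + r
        parts = [sign]
        # A leading group shorter than three digits, if any.
        left = len(r) % 3
        if left:
            parts.append(r[0:left])
            parts.append('_')
            r = r[left:]
        # The remaining digits split evenly into groups of three.
        parts.append(r[0:3])
        for i in range(3, len(r), 3):
            parts.append('_')
            parts.append(r[i:i + 3])
        return ''.join(parts)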
    

    @rhettinger
    Contributor

    > It would be nice if pprint learned to insert underscores in long numbers

    +1 but I would make this optional.

    > Here is an implementation of the safe repr for numbers if helpful

    I suggest using the existing string formatting tools as a foundation

        >>> format(10**9, ',d').replace(',', '_')
        '1_000_000_000'

    @ericvsmith
    Member

    +1 also. I agree with Raymond it should be optional.

    @serhiy-storchaka
    Member

    >>> format(10**9, '_d')
    '1_000_000_000'
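
The two one-liners suggested above can be checked against each other with a small sketch. The helper names `group_with_replace` and `group_with_spec` are made up for this illustration. One difference from the hand-written `safe_repr_int` posted earlier: that function leaves numbers of up to four digits ungrouped, whereas the format specification groups them as well.

```python
def group_with_replace(n: int) -> str:
    # Thousands separator ',' swapped for '_' afterwards.
    return format(n, ',d').replace(',', '_')


def group_with_spec(n: int) -> str:
    # The '_' grouping option of the format mini-language (Python 3.6+).
    return format(n, '_d')


samples = [0, 999, 1000, -1234567, 10**9]
for n in samples:
    # Both spellings produce the same text for every sample.
    assert group_with_replace(n) == group_with_spec(n)
    # int() accepts underscores between digits, so the text round-trips.
    assert int(group_with_spec(n)) == n

print(group_with_spec(1000))      # 1_000
print(group_with_spec(-1234567))  # -1_234_567
```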

    @sblondon
    Mannequin

    sblondon mannequin commented Feb 25, 2021

    I had the same idea, though later than you, so I'm interested in such a feature.

    Felipe: do you want to add a pull request to this issue (with Serhiy Storchaka's implementation, since it's the simplest one)?

    If not, I plan to write it.
    I will also write it if there is no reply within a month.

    @felipeochoa
    Mannequin Author

    felipeochoa mannequin commented Mar 9, 2021

    All yours! I'm tied up so won't be able to submit the PR


    @sblondon
    Mannequin

    sblondon mannequin commented Mar 14, 2021

    Thank you for the update, Felipe! :)
    I have submitted a PR for this issue.

    Two remarks:

    • I changed the proposed implementation from format(integer, '_d') to '{:_d}'.format(integer) because the first form raised an exception (the format function was not defined).
    • I thought about adding the same behavior for floats too, but I didn't, because the 'f' type uses a precision of 6 digits after the decimal point [1]. Some precision could therefore be lost in the pprint() output, which could mislead users more than the underscores would help readability. A precision value could be added, but I'm not sure it's a good idea.
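
The float concern in the second remark can be illustrated with a short sketch. The sample value is arbitrary and chosen only for this illustration; it is not taken from the PR.

```python
# The two integer spellings discussed in the thread give the same text.
as_builtin = format(10**9, '_d')
as_method = '{:_d}'.format(10**9)
print(as_builtin, as_method)

# With the 'f' type, only 6 digits after the decimal point are kept by default.
value = 1234567.123456789
grouped = '{:_f}'.format(value)
print(grouped)  # 1_234_567.123457

# Parsing the formatted text back does not recover the original float.
round_trip = float(grouped.replace('_', ''))
print(round_trip == value)  # False
```

This is why grouping a float through 'f' would change the value a reader sees, unlike the integer case, where the text still denotes exactly the same number.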

    As requested, there is a new parameter to disable this new behavior ('underscore_numbers').

    [1] https://docs.python.org/3/library/string.html#format-specification-mini-language

    @rhettinger
    Contributor

    I don't think underscores can be on by default. It needs to be opt-in to be backwards compatible.

    @sblondon
    Mannequin

    sblondon mannequin commented Mar 22, 2021

    I changed the default to be backward compatible (so underscore_numbers=False).

    I think it would be better with underscore_numbers enabled by default, but I understand the need for stability. Perhaps such a breaking change could be made in the future (in version 3.12 or 4)?

    @gpshead
    Member

    gpshead commented Mar 24, 2021

    New changeset 3ba3d51 by sblondon in branch 'master':
    bpo-42914: add a pprint underscore_numbers option (GH-24864)
    3ba3d51

    @gpshead
    Member

    gpshead commented Mar 24, 2021

    Thanks for the contribution Stéphane!

    I agree that this would be a nice default. We're just being conservative in the pace of default behavior changes. Changing the default could be considered in the future after a few releases with this parameter have shipped.

    @miss-islington
    Contributor

    New changeset 4846ea9 by Wm. Keith van der Meulen in branch 'main':
    Add bpo-42914 to What's New (GH-25124)
    4846ea9

    @miss-islington
    Contributor

    New changeset 4131780 by Miss Islington (bot) in branch '3.10':
    Add bpo-42914 to What's New (GH-25124)
    4131780

    @sblondon
    Mannequin

    sblondon mannequin commented Oct 5, 2021

    Python 3.10 has now been released with the underscore_numbers parameter.
    I wonder which release could enable the parameter by default (so it would break the previous behavior):

    • the next release (3.11) is probably too soon.
    • the safest strategy is to wait until 3.9 reaches end-of-life (2025-10 according to [1]). In that case, it could be integrated in 3.14.

    Could it be accepted earlier (e.g. in 3.12 or 3.13)?

    If there is no reply, I will create a new issue and PR for inclusion in 3.14 (the safest strategy).

    [1] https://devguide.python.org/#status-of-python-branches

    @ericvsmith
    Member

    The safest thing to do is never make it the default. It would always be an opt-in behavior.

    @sblondon
    Mannequin

    sblondon mannequin commented Oct 9, 2021

    OK, I will not send a PR to change the current behavior until Python 4 (if it ever exists).

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022