Clarify numeric padding behavior in string formatting #82838

Closed
JamoBox mannequin opened this issue Oct 31, 2019 · 11 comments
Assignee: ericvsmith
Labels: 3.7 (EOL), 3.8 (only security fixes), 3.9 (only security fixes), docs, type-bug

Comments

@JamoBox
Mannequin

JamoBox mannequin commented Oct 31, 2019

BPO 38657
Nosy @mdickinson, @ericvsmith, @serhiy-storchaka, @vedgar, @miss-islington, @JamoBox
PRs
  • bpo-38657: Clarify numeric padding behaviour in string formatting #17036
  • [3.8] bpo-38657: Clarify numeric padding behaviour in string formatting (GH-17036) #18587
  • [3.7] bpo-38657: Clarify numeric padding behaviour in string formatting (GH-17036) #18588
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = 'https://github.com/ericvsmith'
    closed_at = <Date 2020-02-23.13:24:35.238>
    created_at = <Date 2019-10-31.20:36:00.421>
    labels = ['3.8', 'type-bug', '3.7', '3.9', 'docs']
    title = 'Clarify numeric padding behavior in string formatting'
    updated_at = <Date 2020-02-23.13:24:35.237>
    user = 'https://github.com/JamoBox'

    bugs.python.org fields:

    activity = <Date 2020-02-23.13:24:35.237>
    actor = 'mark.dickinson'
    assignee = 'eric.smith'
    closed = True
    closed_date = <Date 2020-02-23.13:24:35.238>
    closer = 'mark.dickinson'
    components = ['Documentation']
    creation = <Date 2019-10-31.20:36:00.421>
    creator = 'Wicken'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 38657
    keywords = ['patch']
    message_count = 11.0
    messages = ['355767', '355794', '355814', '355825', '355834', '355870', '355880', '362380', '362381', '362382', '362507']
    nosy_count = 6.0
    nosy_names = ['mark.dickinson', 'eric.smith', 'serhiy.storchaka', 'veky', 'miss-islington', 'Wicken']
    pr_nums = ['17036', '18587', '18588']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue38657'
    versions = ['Python 3.7', 'Python 3.8', 'Python 3.9']

    @JamoBox
    Mannequin Author

    JamoBox mannequin commented Oct 31, 2019

    When formatting an integer as a hexadecimal value, the '#' alternate form modifier inserts a preceding '0x'.
    If this is combined with padding modifiers, the '0x' is counted as part of the overall width. This does not feel like natural behaviour, because extra calculation is required to get the intended number of digits after the '0x'.

    Example:

    In [7]: f'{num:04x}'
    Out[7]: '0800'

    In [8]: f'{num:#04x}'
    Out[8]: '0x800'

    To get the hexadecimal representation padded to 4 digits, you have to account for the preceding 0x:

    In [10]: f'{num:#06x}'
    Out[10]: '0x0800'
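
The session above never defines `num`. As a minimal sketch, assuming `num` is 2048 (0x800), which is consistent with the outputs shown, the three cases can be reproduced like this (the variable names are illustrative):

```python
# Assumption: num is 2048 (0x800); the original comment does not define it.
num = 2048

no_prefix = f'{num:04x}'   # width 4, digits only
short = f'{num:#04x}'      # width 4, but '0x800' already has 5 characters
padded = f'{num:#06x}'     # width 6 = 2 prefix characters + 4 digits

print(no_prefix, short, padded)
```

In the second case the requested width is smaller than the formatted value, so no zero is added at all.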

    @JamoBox JamoBox mannequin added the 3.7, 3.8, and type-bug labels Oct 31, 2019
    @serhiy-storchaka
    Member

    Yes, the width is the width of the formatted value, not the number of digits.

    What is your proposal?

    @ericvsmith
    Member

    int.__format__ inherits this from %-formatting, which inherits it from C's printf.
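
A small sketch of the Python side of that lineage: the same spec gives the same result through %-formatting, `format()`, and an f-string. The value 2048 is an illustrative assumption, and C's printf is not exercised here.

```python
value = 2048  # illustrative value, 0x800

percent_style = '%#06x' % value        # printf-style formatting
format_style = format(value, '#06x')   # calls int.__format__
fstring_style = f'{value:#06x}'        # same spec inside an f-string

print(percent_style, format_style, fstring_style)
```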

    There's no way we're going to change this at this point: the breakage would be too great. So, I'm going to reject this.

    @ericvsmith ericvsmith self-assigned this Nov 1, 2019
    @ericvsmith
    Member

    Now that I re-read this, maybe it was a documentation request, not a functional change? I'd be okay with documenting the existing behavior, so I'll re-open this and change the type. Patches welcome.

    @ericvsmith ericvsmith added the docs and 3.9 labels Nov 1, 2019
    @ericvsmith ericvsmith reopened this Nov 1, 2019
    @vedgar
    Mannequin

    vedgar mannequin commented Nov 1, 2019

    The width doesn't mean "the number of bits", it means "the width of the field". In every other case too:

    • when we format negative numbers, width includes the minus sign
    • when we format decimal numbers, width includes decimal point (or comma)
    • when we format strings with !r, width includes the quotes

    So, not only would it break too much code, but it would also be inconsistent with how all other types are currently formatted.
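
The cases listed above can be checked with a short sketch. The values and widths are illustrative assumptions; in each case the total length equals the requested width, with the sign, the decimal point, or the quotes counted in it.

```python
negative = format(-5, '04d')     # minus sign counts toward width 4
decimal = format(3.5, '06.2f')   # decimal point counts toward width 6
quoted = f'{"ab"!r:>6}'          # quotes added by !r count toward width 6

print(repr(negative), repr(decimal), repr(quoted))
```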

    @JamoBox
    Mannequin Author

    JamoBox mannequin commented Nov 2, 2019

    Given the comments above, I appreciate that this is actually due to the width being the total field width rather than the number of digits. Having reviewed the documentation again, I believe the following line explains it:

    "When no explicit alignment is given, preceding the width field by a zero ('0') character enables sign-aware zero-padding for numeric types. This is equivalent to a fill character of '0' with an alignment type of '='."

    (https://docs.python.org/3.8/library/string.html)
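
The equivalence stated in that documentation line can be sketched as follows. The values are illustrative assumptions; the point is that a leading '0' and an explicit fill of '0' with '=' alignment produce the same string, with the zeros placed after the sign or the '0x' prefix.

```python
implicit = format(-42, '06d')          # leading '0' before the width
explicit = format(-42, '0=6d')         # fill '0' with '=' alignment

hex_implicit = format(2048, '#06x')    # same idea with the '0x' prefix
hex_explicit = format(2048, '0=#6x')

print(implicit, explicit, hex_implicit, hex_explicit)
```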

    I initially read "sign-aware zero-padding for numeric types" to mean the padding would not blindly prepend, and would take into account any signs and pad after (hence initially making this a bug). So maybe as suggested above we should explicitly mention the padding is the total number of characters in the field, rather than just the numbers.

    I can look into adding this soon and see what you all think.
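
For anyone who wants a fixed number of digits after the prefix, one possible workaround is to add the two prefix characters to the width. This is only a sketch: `hex_padded` is a hypothetical helper, not part of Python, and it assumes non-negative values (a minus sign would also take one character of the width).

```python
def hex_padded(value: int, digits: int) -> str:
    """Hypothetical helper: '0x' followed by at least `digits` hex digits.

    Assumes value >= 0; a minus sign would also consume field width.
    """
    # The '0x' prefix occupies 2 characters of the total field width.
    return f'{value:#0{digits + 2}x}'


print(hex_padded(2048, 4))     # 4 digits after the prefix
print(hex_padded(0x12345, 4))  # longer values are not truncated
```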

    @vedgar
    Mannequin

    vedgar mannequin commented Nov 2, 2019

    It seems that you're confusing two things that really don't have much in common.

    • (field) width is a _number_, saying how many characters (at least) the formatted output should take.
    • padding is a bool (or maybe a char), saying what should be put inside the leftover space if the default formatted output is shorter than the width.

    The padding is not the width, and the width is not the padding. Once you start to differentiate those two things, I'm convinced all your confusions will disappear.
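
The distinction can be sketched by varying the two independently. The value 2048 and the fill character '*' are illustrative assumptions: the width only sets a minimum length, while the fill decides what occupies any leftover space.

```python
value = 2048  # illustrative value, 0x800

at_least = format(value, '#3x')   # width below the output length: nothing is cut
spaces = format(value, '#8x')     # no fill given: spaces, right-aligned for numbers
stars = format(value, '*<#8x')    # explicit fill '*' with left alignment

print(repr(at_least), repr(spaces), repr(stars))
```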

    @ericvsmith ericvsmith changed the title from "String format for hexadecimal notation breaks padding with alternative form" to "Clarify numeric padding behavior in string formatting" Nov 3, 2019
    @miss-islington
    Contributor

    New changeset 424e568 by Pete Wicken in branch 'master':
    bpo-38657: Clarify numeric padding behaviour in string formatting (GH-17036)

    @miss-islington
    Contributor

    New changeset 09db1da by Miss Islington (bot) in branch '3.7':
    bpo-38657: Clarify numeric padding behaviour in string formatting (GH-17036)

    @miss-islington
    Contributor

    New changeset a207512 by Miss Islington (bot) in branch '3.8':
    bpo-38657: Clarify numeric padding behaviour in string formatting (GH-17036)

    @mdickinson
    Member

    It looks as though this has been addressed, and can be closed.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022