str.format() wrongly formats complex() numbers (Py30a2) #45929

Closed
mark-summerfield mannequin opened this issue Dec 11, 2007 · 39 comments
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement

Comments

@mark-summerfield
Mannequin

mark-summerfield mannequin commented Dec 11, 2007

BPO 1588
Nosy @gvanrossum, @mdickinson, @ericvsmith, @devdanzin, @mark-summerfield
Files
  • issue-1588-trunk.patch
  • issue-1588-py3k.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = 'https://github.com/ericvsmith'
    closed_at = <Date 2009-04-30.01:01:40.337>
    created_at = <Date 2007-12-11.13:30:51.526>
    labels = ['interpreter-core', 'type-feature']
    title = 'str.format() wrongly formats complex() numbers (Py30a2)'
    updated_at = <Date 2009-05-04.11:39:04.184>
    user = 'https://github.com/mark-summerfield'

    bugs.python.org fields:

    activity = <Date 2009-05-04.11:39:04.184>
    actor = 'mark.dickinson'
    assignee = 'eric.smith'
    closed = True
    closed_date = <Date 2009-04-30.01:01:40.337>
    closer = 'eric.smith'
    components = ['Interpreter Core']
    creation = <Date 2007-12-11.13:30:51.526>
    creator = 'mark'
    dependencies = []
    files = ['13807', '13808']
    hgrepos = []
    issue_num = 1588
    keywords = ['patch']
    message_count = 39.0
    messages = ['58428', '58447', '58448', '58483', '58484', '58496', '86640', '86651', '86652', '86656', '86679', '86680', '86681', '86682', '86683', '86686', '86717', '86718', '86719', '86720', '86721', '86722', '86724', '86725', '86726', '86727', '86731', '86732', '86755', '86766', '86772', '86827', '86829', '86836', '86964', '86972', '86973', '86976', '87116']
    nosy_count = 5.0
    nosy_names = ['gvanrossum', 'mark.dickinson', 'eric.smith', 'ajaksu2', 'mark']
    pr_nums = []
    priority = 'normal'
    resolution = 'accepted'
    stage = 'patch review'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue1588'
    versions = ['Python 3.1', 'Python 2.7']


    >>> x = complex(1, 2/3)
    >>> "{0} {0:.5}".format(x)
    '(1+0.666666666667j) (1+0.'

    The complex number is being formatted as if it were a string and simply
    truncated to 5 characters. I would expect each part of the complex
    number to be formatted according to the format specifier, i.e., in the
    case of :.5 to both have 5 digits after the decimal point.

    @mark-summerfield mark-summerfield mannequin added type-bug An unexpected behavior, bug, or error interpreter-core (Objects, Python, Grammar, and Parser dirs) labels Dec 11, 2007
    @gvanrossum
    Member

    This really is a feature request -- in Python 2.x there is no formatting
    code for complex numbers at all, and "%.5s" % complex(...) does the same
    thing.

    I agree it would be neat to have control over complex numbers using the
    same formatting language used for floats; but I note that it's easy
    enough to do this manually, e.g.

    >>> "{0.real:.5}+{0.imag:.5}j".format(z)
    '1+0.66667j'

    @gvanrossum
    Member

    Maybe this would be a good GHOP task?

    @mark-summerfield
    Mannequin Author

    mark-summerfield mannequin commented Dec 12, 2007

    On 2007-12-11, Guido van Rossum wrote:
    > Guido van Rossum added the comment:
    >
    > This really is a feature request -- in Python 2.x there is no formatting
    > code for complex numbers at all, and "%.5s" % complex(...) does the same
    > thing.

    I thought Python 3 was meant to be an _improvement_:-)

    > I agree it would be neat to have control over complex numbers using the
    > same formatting language used for floats; but I note that it's easy
    > enough to do this manually, e.g.
    >
    > >>> "{0.real:.5}+{0.imag:.5}j".format(z)
    >
    > '1+0.66667j'

    Good point, I'll use that.

    Thanks!

    @mark-summerfield
    Mannequin Author

    mark-summerfield mannequin commented Dec 12, 2007

    On 2007-12-11, Guido van Rossum wrote:
    > Guido van Rossum added the comment:
    >
    > This really is a feature request -- in Python 2.x there is no formatting
    > code for complex numbers at all, and "%.5s" % complex(...) does the same
    > thing.
    >
    > I agree it would be neat to have control over complex numbers using the
    > same formatting language used for floats; but I note that it's easy
    > enough to do this manually, e.g.
    >
    > >>> "{0.real:.5}+{0.imag:.5}j".format(z)
    >
    > '1+0.66667j'

    That's not quite right because it doesn't always handle the sign
    correctly and doesn't force float output. So I think it should be this:

    >>> "{0.real:.5f}{0.imag:+.5f}j".format(complex(1, 2/3))
    '1.00000+0.66667j'
    >>> "{0.real:.5f}{0.imag:+.5f}j".format(complex(1, -2/3))
    '1.00000-0.66667j'
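
The recipe above can be wrapped in a small helper. This is only a sketch: the name `format_complex_by_hand` and its `precision` parameter are made up for illustration and are not part of any Python API.

```python
def format_complex_by_hand(z, precision=5):
    """Format the real and imaginary parts separately (hypothetical helper)."""
    # 'f' forces fixed-point output for both parts. The '+' flag on the
    # imaginary part always emits a sign, so it also acts as the separator.
    return "{0.real:.{1}f}{0.imag:+.{1}f}j".format(z, precision)


print(format_complex_by_hand(complex(1, 2/3)))    # 1.00000+0.66667j
print(format_complex_by_hand(complex(1, -2/3)))   # 1.00000-0.66667j
```

Putting the sign flag on the imaginary part, rather than hard-coding a '+' between the two fields, is what avoids output such as '1+-0.66667j' for a negative imaginary part.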

    @gvanrossum
    Member

    > I thought Python 3 was meant to be an _improvement_:-)

    That's why I didn't close the issue but reclassified it.

    Or did you expect me to implement it overnight? :-)

    @devdanzin
    Mannequin

    devdanzin mannequin commented Apr 27, 2009

    Confirmed in py3k at rev71995.

    @ericvsmith
    Member

    I agree this is a feature request. It comes down to:

    What should the format specifier mini-language for complex numbers look
    like?

    Should it look like the existing mini-language for floats, but have the
    format specified twice, with some sort of delimiter? Or just specified
    once, and use that for both parts?

    I'm sure python-ideas could argue over it for ages, but I don't see any
    outcome that's much of an improvement over the suggested:
    "{0.real:.5f}{0.imag:+.5f}j".format(complex(1, -2/3))

    @ericvsmith ericvsmith added type-feature A feature request or enhancement and removed type-bug An unexpected behavior, bug, or error labels Apr 27, 2009
    @mdickinson
    Member

    > What should the format specifier mini-language for complex numbers look
    > like?
    > Should it look like the existing mini-language for floats, but have
    > the format specified twice, with some sort of delimiter?

    This sounds clumsy to me. I'd guess that in most uses you'd want the
    same format for both pieces.

    > Or just specified once, and use that for both parts?

    That doesn't sound unreasonable. But there might need to be some
    thinking about exactly what a '+' modifier means, or how you pad with
    zeros on the left when you've got two pieces to pad.

    It seems simplest just to tell people to format the real and imaginary
    parts by hand. As it isn't totally obvious how to do this (e.g.,
    remembering the '+' for the imaginary part), perhaps there should be a
    recipe in the docs somewhere?

    @ericvsmith
    Member

    Mark Dickinson wrote:

    >> What should the format specifier mini-language for complex numbers look
    >> like?
    >> Should it look like the existing mini-language for floats, but have
    >> the format specified twice, with some sort of delimiter?
    >
    > This sounds clumsy to me. I'd guess that in most uses you'd want the
    > same format for both pieces.

    I agree, and mostly I was just trying to spark some discussion and show
    how (absurdly) far we can take this.

    >> Or just specified once, and use that for both parts?
    >
    > That doesn't sound unreasonable. But there might need to be some
    > thinking about exactly what a '+' modifier means, or how you pad with
    > zeros on the left when you've got two pieces to pad.

    How about this:

    • we have a single specifier with the same format as floats
    • we force the sign on the imaginary part to be '+', no
      matter what was specified
    • we add a 'j' after the imaginary part
    • we ignore any width specified (and therefore any alignment
      and padding)

    > It seems simplest just to tell people to format the real and imaginary
    > parts by hand. As it isn't totally obvious how to do this (e.g.,
    > remembering the '+' for the imaginary part), perhaps there should be a
    > recipe in the docs somewhere?

    When we document the above approach, we note the way to get full control
    as mentioned in a prior message.

    I guess we should put the docs in with string formatting (since that's
    where the other builtin types are documented), although really it
    belongs in complex.__format__ by itself. But I doubt anyone would find
    it there. Maybe we could add a pointer from the string formatting docs
    to complex.__format__.
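
The four proposed rules can be sketched in pure Python. This is an illustration of the proposal as stated in this message, not the code that was eventually committed; `proposed_format` and its keyword parameters are hypothetical names, and the spec is passed as separate pieces to avoid writing a parser.

```python
def proposed_format(z, sign="-", precision=6, type_code="f"):
    """Sketch of the proposal: one float-style spec applied to both parts."""
    # The requested sign option ('+', '-' or ' ') affects only the real part.
    real = format(z.real, "{0}.{1}{2}".format(sign, precision, type_code))
    # The imaginary part always carries an explicit sign, whatever was asked.
    imag = format(z.imag, "+.{0}{1}".format(precision, type_code))
    # A trailing 'j' marks the imaginary part. No width, alignment or
    # padding is applied, matching the last bullet of the proposal.
    return real + imag + "j"


print(proposed_format(complex(1, -2/3), precision=5))      # 1.00000-0.66667j
print(proposed_format(3 + 4j, sign="+", precision=1))      # +3.0+4.0j
```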

    @ericvsmith ericvsmith self-assigned this Apr 27, 2009
    @mdickinson
    Member

    > How about this:
    >
    > • we have a single specifier with the same format as floats
    > • we force the sign on the imaginary part to be '+', no
    >   matter what was specified
    > • we add a 'j' after the imaginary part

    This sounds good to me. I assume a '+' would still affect
    the sign of the real part?

    > • we ignore any width specified (and therefore any alignment
    >   and padding)

    I don't see any problem with dealing with width, alignment
    and padding with a user-specified fill character; I think we
    should keep these if possible. It's just zero padding where
    it's not clear what should happen.

    For the bits that are disabled (e.g., zero padding), should
    there be a ValueError raised, or do those bits just get
    silently ignored?

    @ericvsmith
    Member

    > I don't see any problem with dealing with width, alignment
    > and padding with a user-specified fill character; I think we
    > should keep these if possible. It's just zero padding where
    > it's not clear what should happen.

    You're correct. It's just zero padding that would be disabled.

    > For the bits that are disabled (e.g., zero padding), should
    > there be a ValueError raised, or do those bits just get
    > silently ignored?

    I think a ValueError would be best. That way if we decide to give it some
    meaning in the future, we know it won't change any working code.

    @mdickinson
    Member

    More specifically, how about allowing widths, and the
    '<', '>' and '^' alignment specifiers, but not '=', or
    '0' for zero-padding.

    I suppose that thousands separators should be permitted
    here too? Though it's difficult to imagine anyone actually
    using them. If we allow ',' but not '0' then we avoid
    the crazy zero-padding--thousands-separators interaction.

    @ericvsmith
    Member

    > This sounds good to me. I assume a '+' would still affect
    > the sign of the real part?

    Forgot to reply to this part.

    Yes, a '+', '-', or ' ' would still affect the real part, but the
    imaginary part would always use '+'.

    @mdickinson
    Member

    > I think a ValueError would be best. That way if we decide to give it
    > some meaning in the future, we know it won't change any working code.

    Agreed. It also fits with the way that other non-numeric types seem to
    behave, as in:

    >>> format("boris", "030s")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ValueError: '=' alignment not allowed in string format specifier
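
A quick way to probe which options are rejected for complex numbers is sketched below. The helper name `rejected` is hypothetical. The results assume a CPython release that includes the complex formatting discussed in this issue; note also that, since Python 3.10, a leading '0' in a string format specifier no longer raises, so the string example above behaves differently on recent versions.

```python
def rejected(value, spec):
    """Return True if format(value, spec) raises ValueError."""
    try:
        format(value, spec)
    except ValueError:
        return True
    return False


z = 1 + 2j
print(rejected(z, "010"))   # zero padding
print(rejected(z, "=20"))   # '=' alignment
print(rejected(z, ">20"))   # ordinary alignment with a width
```

Raising instead of silently ignoring the option leaves room to give it a meaning later without changing the behavior of code that already works.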

    @ericvsmith
    Member

    > More specifically, how about allowing widths, and the
    > '<', '>' and '^' alignment specifiers, but not '=', or
    > '0' for zero-padding.

    That sounds correct.

    > I suppose that thousands separators should be permitted
    > here too? Though it's difficult to imagine anyone actually
    > using them. If we allow ',' but not '0' then we avoid
    > the crazy zero-padding--thousands-separators interaction.

    That was my thinking, too.

    @ericvsmith
    Member

    I'm also going to disallow the '%' format code. I don't think it makes
    any sense to convert a complex number to a percentage.

    @mdickinson
    Member

    > I'm also going to disallow the '%' format code.

    Sounds good to me.

    > I don't think it makes any sense to convert a complex number to a
    > percentage.

    Well, I think it's clear what the numbers would be (just scale both real
    and imaginary parts by 100 before using fixed-point formatting). The
    real issue is whether to have two trailing '%'s or one.

    Just being difficult: I completely agree that '%' should be disallowed
    for complex numbers.
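
To illustrate the two decisions (thousands separators allowed, '%' disallowed), here is a short check. It assumes a CPython release with complex formatting support; the sample value is arbitrary.

```python
z = complex(1234567.891, -9876.5)

# The ',' option is applied to the real and imaginary parts independently.
print(format(z, ",.2f"))   # 1,234,567.89-9,876.50j

# The '%' presentation type is not accepted for complex numbers.
try:
    format(z, ".2%")
except ValueError as exc:
    percent_error = exc
    print("rejected:", exc)
else:
    percent_error = None
```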

    @mdickinson
    Member

    Two things that haven't come up so far:

    (1) What about parentheses? The current complex repr and str have
    parentheses in them, for reasons that I still don't really understand.

    I'd suggest leaving them out altogether; except that I have
    the impression (perhaps wrongly) that an empty type code is
    supposed to correspond to str. And given that I don't understand
    why the parens were in there in the first place, I'm probably
    not a good person to judge whether they should stay in a
    formatted complex number.

    (2) What about zeros? The current repr and str leave out the real
    part (and the enclosing parens) if it's equal to zero. Should
    format do the same? I'd say not, except possibly again in the
    case where there's no type code.

    @ericvsmith
    Member

    Mark Dickinson wrote:

    > (1) What about parentheses? The current complex repr and str have
    > parentheses in them, for reasons that I still don't really understand.
    >
    > I'd suggest leaving them out altogether; except that I have
    > the impression (perhaps wrongly) that an empty type code is
    > supposed to correspond to str. And given that I don't understand
    > why the parens were in there in the first place, I'm probably
    > not a good person to judge whether they should stay in a
    > formatted complex number.

    The rule is that x.__format__('') is equivalent to str(x). All
    of the built-in objects have a test for a zero-length format string and
    delegate to str(). But (3).__format__('-') does not call str(), despite
    the fact that it's the identical output. That's because the format
    string isn't zero-length. Instead, this is the case of the missing
    format "presentation type".

    I couldn't find a case with any built-in objects where this really makes
    a difference (although I can't say I spent a lot of time at it). Complex
    would be the first one. But that doesn't really bother me.

    format(1+1j, '') -> '(1+1j)'
    format(1+1j, '-') -> '1+1j'

    Although I guess if we wanted to, we could say that the empty
    presentation type is equivalent to 'g', but gives you parens. This would
    fit in nicely with bpo-5858, if it's accepted. Floats do something
    similar and special case the empty presentation type: '' is like 'g',
    but with at least one digit after the decimal point.

    > (2) What about zeros? The current repr and str leave out the real
    > part (and the enclosing parens) if it's equal to zero. Should
    > format do the same? I'd say not, except possibly again in the
    > case where there's no type code.

    I agree. Again, we could say that the empty presentation type is
    different in this regard.
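
The distinction being drawn here, that a missing presentation type behaves like str() while an explicit type code does not, can be seen in a CPython release that includes complex formatting. The outputs below are from the behavior as finally shipped, which differs from the format(1+1j, '-') example proposed earlier in this message.

```python
z = 1 + 1j

# An empty format spec is equivalent to str(): parentheses are kept.
print(format(z, "") == str(z))   # True
print(format(z, ""))             # (1+1j)

# An explicit presentation type drops the parentheses.
print(format(z, "g"))            # 1+1j

# A zero real part is omitted only when no type code is given.
print(format(1j, ""))            # 1j
print(format(1j, "g"))           # 0+1j
print(format(1j, ".2f"))         # 0.00+1.00j
```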

    @mdickinson
    Member

    > Complex would be the first one. But that doesn't really bother me.

    It bothers me a little. I see '' as a special case of the empty
    presentation type, even if that's not what a strict reading of
    PEP-3101 says, so I expect '', '>' and '<20' all to format the
    number in the same way, and only differ in their treatment of
    alignment and padding. That is, adding a '>' to the start of a
    format specifier shouldn't change the formatting of the number
    itself. So from this perspective, it seems better if format(x, '')
    ends up doing the same thing as str(x) as a result of the
    choices made for the empty presentation type, rather than
    as a result of special-casing ''.

    > Although I guess if we wanted to, we could say that the empty
    > presentation type is equivalent to 'g', but gives you parens.

    This works for me.

    [about suppressing real zeros...]

    > Again, we could say that the empty presentation type is
    > different in this regard.

    Makes sense. Does treating the empty presentation type as special this
    way add much extra complication to the implementation?

    @ericvsmith
    Member

    Mark Dickinson wrote:

    >> Although I guess if we wanted to, we could say that the empty
    >> presentation type is equivalent to 'g', but gives you parens.
    >
    > This works for me.

    Me, too.

    > [about suppressing real zeros...]
    >> Again, we could say that the empty presentation type is
    >> different in this regard.
    >
    > Makes sense. Does treating the empty presentation type as special this
    > way add much extra complication to the implementation?

    No. I'm basically finished with it. Before I check it in, I'll attach a
    patch (against trunk) so you can look at how it works.

    @ericvsmith
    Member

    See the attached patch. Comments welcome.

    I'm not sure I'm doing the right thing with 'g' and appending zeros:
    >>> format(3+4j, '.2')
    '(3+4j)'
    >>> format(3+4j, '.2g')
    '3.0+4.0j'
    >>> format(3+0j, '.2')
    '(3+0j)'
    >>> format(3+0j, '.2g')
    '3.0+0.0j'
    >>> format(1j, '.2')
    '(1j)'
    >>> format(1j, '.2g')
    '0.0+1.0j'

    @mdickinson
    Member

    I'll take a look.

    The trailing zeros thing is heavily bound up with bpo-5858, of course;
    I think we need a decision on that, one way or the other. One problem
    is that I'm only proposing the bpo-5858 change for py3k, not trunk.

    @ericvsmith
    Member

    Mark Dickinson wrote:

    > The trailing zeros thing is heavily bound up with bpo-5858, of course;
    > I think we need a decision on that, one way or the other. One problem
    > is that I'm only proposing the bpo-5858 change for py3k, not trunk.

    I don't have a problem with trunk's complex.__format__ not agreeing with
    trunk's complex.__str__. If it's a big deal, we can just take
    complex.__format__ completely out of trunk with a #define. In any event,
    there's lots of time before 2.7 and not so much before 3.1, so let's
    concentrate on py3k. Which is what I should have done when starting
    this issue (but forward porting is easier for me than back porting).

    @ericvsmith
    Member

    Here's a patch against py3k, with one slight change with non-empty
    presentation types.

    @mdickinson
    Member

    With your patch, I'm getting quite strange results when using alignment
    specifiers:

    >>> z = 123+4j
    >>> format(z, '=20')
    '(                 123+                  4j)'
    >>> format(z, '^20')
    '(        123                  +4         j)'
    >>> format(z, '<20')
    '(123                 +4                  j)'
    >>> len(format(z, '<20'))
    43

    Is this intentional? I was expecting to get strings of length 20,
    with the substring '(123+4j)' positioned either in the middle
    or on the left or right.
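
The expectation described here is that width and alignment apply to the assembled string as a whole rather than to each part. A short check of that behavior, assuming a CPython release in which the fix has landed:

```python
z = 123 + 4j

for spec in ("<20", "^20", ">20"):
    text = format(z, spec)
    # Each result should be exactly 20 characters wide, with '(123+4j)'
    # placed on the left, in the middle, or on the right.
    print(repr(text), len(text))

# The same layout can be had by formatting the str() form of the number.
centered_by_hand = format(str(z), "^20")
print(centered_by_hand == format(z, "^20"))
```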

    @ericvsmith
    Member

    ...

    > Is this intentional? I was expecting to get strings of length 20,
    > with the substring '(123+4j)' positioned either in the middle
    > or on the left or right.

    No, not intentional. I'll fix it and add some tests. Thanks.

    @ericvsmith
    Member

    This is a patch against py3k, including tests in test_complex.py. It
    should deal with the padding, but let me know.

    @ericvsmith
    Member

    I also propose to disallow the '=' alignment flag. It means put the
    padding between the sign and the digits, and since there are 2 signs,
    it's not clear what this would mean.

    Remember, by using .real and .imag, you can achieve this level of
    control in the formatting, anyway.
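
As a sketch of that workaround: '=' alignment and zero padding are both available on the float parts, so each part can be padded on its own. The widths and the sample value below are arbitrary choices for illustration.

```python
z = complex(-1.5, 2)

# '=' puts the padding between the sign and the digits of the real part.
real = format(z.real, "=+10.2f")
# Zero padding works on the imaginary part once it is formatted as a float.
imag = format(z.imag, "+08.2f")

print(repr(real))          # '-     1.50'
print(repr(imag))          # '+0002.00'
print(real + imag + "j")
```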

    @ericvsmith
    Member

    I think these patches are complete. One for py3k, one for trunk. If no
    complaints, I'll apply them before this weekend's py3k beta.

    @mdickinson
    Member

    With these patches, all tests pass for me both for py3k and trunk.

    @mdickinson
    Member

    I haven't done as thorough a review as I'd like, but both
    patches look good to me. I recommend applying them.

    @ericvsmith
    Member

    Thanks, Mark.

    I'm not so worried about the code, but more so the tests. As far as the
    code goes, it's really a combination of float and string formatting. I
    copied the float formatting and refactored the string formatting so I
    could reuse it.

    But of course, another set of eyes to review it is always welcome. If
    you find anything, I'll fix it. I'm closing this issue.

    Committed in trunk in r72137, and in py3k in r72140.

    @mdickinson
    Member

    One comment on the new complex formatting. I now get (in py3k)

    >>> from math import pi, e
    >>> format(complex(pi,e), '<')
    '(3.14159+2.71828j)'
    >>> format(complex(pi,e), '')
    '(3.14159265359+2.71828182846j)'

    I understand why this is happening, but again I think that alignment
    flags shouldn't change the form of the number itself. Would it be
    reasonable to have the empty format code always use a precision of 12?

    @ericvsmith
    Member

    Is this suggestion for all types, or just complex? Because float has the
    same issue.

    >>> format(pi, '')
    '3.14159265359'
    [38243 refs]
    >>> format(pi, '>')
    '3.14159'
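
The inconsistency shown in these sessions can be checked for directly. The snippet below only compares the two results instead of assuming particular digits; on the interpreters used in this thread the comparisons are False, and to my knowledge current Python 3 releases give True for both.

```python
from math import pi, e

z = complex(pi, e)

# An alignment flag on its own should not change the digits produced.
float_consistent = format(pi, "") == format(pi, ">")
complex_consistent = format(z, "") == format(z, "<")

print(float_consistent, complex_consistent)
```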

    @mdickinson
    Member

    Hmm. That also seems wrong to me. So I guess it's a suggestion
    for float as well, which means it's not specific to this issue.
    Should I open a separate feature request?

    @ericvsmith
    Member

    > Hmm. That also seems wrong to me. So I guess it's a suggestion
    > for float as well, which means it's not specific to this issue.
    > Should I open a separate feature request?

    Yes, this is a separate issue. It comes from PEP-3101's specification of
    "like 'g' but different" for floats with no specified presentation type.

    @mdickinson
    Member

    > Yes, this is a separate issue.

    Thanks. See bpo-5920.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022