str.format() wrongly formats complex() numbers (Py30a2) #45929

mark-summerfield · 2007-12-11T13:30:52Z

BPO	1588
Nosy	@gvanrossum, @mdickinson, @ericvsmith, @devdanzin, @mark-summerfield
Files	issue-1588-trunk.patch issue-1588-py3k.patch

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

assignee = 'https://github.com/ericvsmith'
closed_at = <Date 2009-04-30.01:01:40.337>
created_at = <Date 2007-12-11.13:30:51.526>
labels = ['interpreter-core', 'type-feature']
title = 'str.format() wrongly formats complex() numbers (Py30a2)'
updated_at = <Date 2009-05-04.11:39:04.184>
user = 'https://github.com/mark-summerfield'

bugs.python.org fields:

activity = <Date 2009-05-04.11:39:04.184>
actor = 'mark.dickinson'
assignee = 'eric.smith'
closed = True
closed_date = <Date 2009-04-30.01:01:40.337>
closer = 'eric.smith'
components = ['Interpreter Core']
creation = <Date 2007-12-11.13:30:51.526>
creator = 'mark'
dependencies = []
files = ['13807', '13808']
hgrepos = []
issue_num = 1588
keywords = ['patch']
message_count = 39.0
messages = ['58428', '58447', '58448', '58483', '58484', '58496', '86640', '86651', '86652', '86656', '86679', '86680', '86681', '86682', '86683', '86686', '86717', '86718', '86719', '86720', '86721', '86722', '86724', '86725', '86726', '86727', '86731', '86732', '86755', '86766', '86772', '86827', '86829', '86836', '86964', '86972', '86973', '86976', '87116']
nosy_count = 5.0
nosy_names = ['gvanrossum', 'mark.dickinson', 'eric.smith', 'ajaksu2', 'mark']
pr_nums = []
priority = 'normal'
resolution = 'accepted'
stage = 'patch review'
status = 'closed'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue1588'
versions = ['Python 3.1', 'Python 2.7']

mark-summerfield · 2007-12-11T13:30:51Z

>>> x = complex(1, 2/3)
>>> "{0} {0:.5}".format(x)
'(1+0.666666666667j) (1+0.'

The complex number is being formatted as if it were a string and simply
truncated to 5 characters. I would expect each part of the complex
number to be formatted according to the format specifier, i.e., in the
case of :.5 to both have 5 digits after the decimal point.

gvanrossum · 2007-12-11T17:56:41Z

This really is a feature request -- in Python 2.x there is no formatting
code for complex numbers at all, and "%.5s" % complex(...) does the same
thing.

I agree it would be neat to have control over complex numbers using the
same formatting language used for floats; but I note that it's easy
enough to do this manually, e.g.

>>> "{0.real:.5}+{0.imag:.5}j".format(z)
'1+0.66667j'

gvanrossum · 2007-12-11T17:57:10Z

Maybe this would be a good GHOP task?

mark-summerfield · 2007-12-12T07:42:45Z

On 2007-12-11, Guido van Rossum wrote:
> Guido van Rossum added the comment:
>
> This really is a feature request -- in Python 2.x there is no formatting
> code for complex numbers at all, and "%.5s" % complex(...) does the same
> thing.

I thought Python 3 was meant to be an _improvement_:-)

> I agree it would be neat to have control over complex numbers using the
> same formatting language used for floats; but I note that it's easy
> enough to do this manually, e.g.
>
> >>> "{0.real:.5}+{0.imag:.5}j".format(z)
> '1+0.66667j'

Good point, I'll use that.

Thanks!

mark-summerfield · 2007-12-12T08:22:15Z

On 2007-12-11, Guido van Rossum wrote:
> Guido van Rossum added the comment:
>
> This really is a feature request -- in Python 2.x there is no formatting
> code for complex numbers at all, and "%.5s" % complex(...) does the same
> thing.
>
> I agree it would be neat to have control over complex numbers using the
> same formatting language used for floats; but I note that it's easy
> enough to do this manually, e.g.
>
> >>> "{0.real:.5}+{0.imag:.5}j".format(z)
>
> '1+0.66667j'

That's not quite right because it doesn't always handle the sign
correctly and doesn't force float output. So I think it should be this:

>>> "{0.real:.5f}{0.imag:+.5f}j".format(complex(1, 2/3))
'1.00000+0.66667j'
>>> "{0.real:.5f}{0.imag:+.5f}j".format(complex(1, -2/3))
'1.00000-0.66667j'
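
The manual recipe can be wrapped in a small helper. This is only a sketch: the name `format_parts` and its `precision` parameter are made up for illustration and are not part of any proposed API.

```python
def format_parts(z, precision=5):
    """Format the real and imaginary parts separately in fixed-point notation."""
    # The '+' flag on the imaginary part always emits a sign, so it also
    # serves as the separator between the two parts.
    return "{0.real:.{1}f}{0.imag:+.{1}f}j".format(z, precision)


print(format_parts(complex(1, 2 / 3)))   # 1.00000+0.66667j
print(format_parts(complex(1, -2 / 3)))  # 1.00000-0.66667j
```

The nested `{1}` field passes the precision in as an argument, so the same template works for any number of digits.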

gvanrossum · 2007-12-12T15:11:06Z

> I thought Python 3 was meant to be an _improvement_:-)

That's why I didn't close the issue but reclassified it.

Or did you expect me to implement it overnight? :-)

devdanzin · 2009-04-27T01:29:57Z

Confirmed in py3k at rev71995.

ericvsmith · 2009-04-27T09:10:14Z

I agree this is a feature request. It comes down to:

What should the format specifier mini-language for complex numbers look
like?

Should it look like the existing mini-language for floats, but have the
format specified twice, with some sort of delimiter? Or just specified
once, and use that for both parts?

I'm sure python-ideas could argue over it for ages, but I don't see any
outcome that's much of an improvement over the suggested:
"{0.real:.5f}{0.imag:+.5f}j".format(complex(1, -2/3))

mdickinson · 2009-04-27T09:26:20Z

> What should the format specifier mini-language for complex numbers look
> like?
> Should it look like the existing mini-language for floats, but have
> the format specified twice, with some sort of delimiter?

This sounds clumsy to me. I'd guess that in most uses you'd want the
same format for both pieces.

> Or just specified once, and use that for both parts?

That doesn't sound unreasonable. But there might need to be some
thinking about exactly what a '+' modifier means, or how you pad with
zeros on the left when you've got two pieces to pad.

It seems simplest just to tell people to format the real and imaginary
parts by hand. As it isn't totally obvious how to do this (e.g.,
remembering the '+' for the imaginary part), perhaps there should be a
recipe in the docs somewhere?

ericvsmith · 2009-04-27T11:32:07Z

Mark Dickinson wrote:

> > What should the format specifier mini-language for complex numbers look
> > like?
> > Should it look like the existing mini-language for floats, but have
> > the format specified twice, with some sort of delimiter?
>
> This sounds clumsy to me. I'd guess that in most uses you'd want the
> same format for both pieces.

I agree, and mostly I was just trying to spark some discussion and show
how (absurdly) far we can take this.

> > Or just specified once, and use that for both parts?
>
> That doesn't sound unreasonable. But there might need to be some
> thinking about exactly what a '+' modifier means, or how you pad with
> zeros on the left when you've got two pieces to pad.

How about this:

- we have a single specifier with the same format as floats
- we force the sign on the imaginary part to be '+', no
  matter what was specified
- we add a 'j' after the imaginary part
- we ignore any width specified (and therefore any alignment
  and padding)

> It seems simplest just to tell people to format the real and imaginary
> parts by hand. As it isn't totally obvious how to do this (e.g.,
> remembering the '+' for the imaginary part), perhaps there should be a
> recipe in the docs somewhere?

When we document the above approach, we note the way to get full control
as mentioned in a prior message.

I guess we should put the docs in with string formatting (since that's
where the other builtin types are documented), although really it
belongs in complex.__format__ by itself. But I doubt anyone would find
it there. Maybe we could add a pointer from the string formatting to
complex.__format__.

mdickinson · 2009-04-27T16:51:51Z

> How about this:
>
> - we have a single specifier with the same format as floats
> - we force the sign on the imaginary part to be '+', no
>   matter what was specified
> - we add a 'j' after the imaginary part

This sounds good to me. I assume a '+' would still affect
the sign of the real part?

> - we ignore any width specified (and therefore any alignment
>   and padding)

I don't see any problem with dealing with width, alignment
and padding with a user-specified fill character; I think we
should keep these if possible. It's just zero padding where
it's not clear what should happen.

For the bits that are disabled (e.g., zero padding), should
there be a ValueError raised, or do those bits just get
silently ignored?

ericvsmith · 2009-04-27T16:58:30Z

> I don't see any problem with dealing with width, alignment
> and padding with a user-specified fill character; I think we
> should keep these if possible. It's just zero padding where
> it's not clear what should happen.

You're correct. It's just zero padding that would be disabled.

> For the bits that are disabled (e.g., zero padding), should
> there be a ValueError raised, or do those bits just get
> silently ignored?

I think a ValueError would be best. That way if we decide to give it some
meaning in the future, we know it won't change any working code.

mdickinson · 2009-04-27T17:01:43Z

More specifically, how about allowing widths, and the
'<', '>' and '^' alignment specifiers, but not '=', or
'0' for zero-padding.

I suppose that thousands separators should be permitted
here too? Though it's difficult to imagine anyone actually
using them. If we allow ',' but not '0' then we avoid
the crazy zero-padding--thousands-separators interaction.

ericvsmith · 2009-04-27T17:03:18Z

> This sounds good to me. I assume a '+' would still affect
> the sign of the real part?

Forgot to reply to this part.

Yes, a '+', '-', or ' ' would still affect the real part, but the
imaginary part would always use '+'.

mdickinson · 2009-04-27T17:04:14Z

> I think a ValueError would be best. That way if we decide to give it
> some meaning in the future, we know it won't change any working code.

Agreed. It also fits with the way that other non-numeric types seem to
behave, as in:

>>> format("boris", "030s")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: '=' alignment not allowed in string format specifier

ericvsmith · 2009-04-27T17:06:59Z

> More specifically, how about allowing widths, and the
> '<', '>' and '^' alignment specifiers, but not '=', or
> '0' for zero-padding.

That sounds correct.

> I suppose that thousands separators should be permitted
> here too? Though it's difficult to imagine anyone actually
> using them. If we allow ',' but not '0' then we avoid
> the crazy zero-padding--thousands-separators interaction.

That was my thinking, too.

ericvsmith · 2009-04-28T09:23:35Z

I'm also going to disallow the '%' format code. I don't think it makes
any sense to convert a complex number to a percentage.

mdickinson · 2009-04-28T09:32:19Z

> I'm also going to disallow the '%' format code.

Sounds good to me.

> I don't think it makes any sense to convert a complex number to a
> percentage.

Well, I think it's clear what the numbers would be (just scale both real
and imaginary parts by 100 before using fixed-point formatting). The
real issue is whether to have two trailing '%'s or one.

Just being difficult: I completely agree that '%' should be disallowed
for complex numbers.
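
For readers following along, the rules agreed so far (one float-style specifier for both parts, a forced sign on the imaginary part, a trailing 'j', width and '<', '>', '^' alignment applied to the whole result, and ValueError for zero padding, '=' and '%') can be sketched in pure Python. This is an illustration only, not the attached patch: `format_complex` and `_SPEC` are hypothetical names, the regular expression covers only a simplified subset of the mini-language, and the special handling of an empty presentation type (parentheses, dropping a zero real part) is not modelled; an omitted type simply falls back to 'g'.

```python
import re

# Simplified float mini-language: [[fill]align][sign][0][width][,][.precision][type]
_SPEC = re.compile(
    r"(?:(?P<fill>.)?(?P<align>[<>^=]))?"
    r"(?P<sign>[-+ ])?"
    r"(?P<zero>0)?"
    r"(?P<width>\d+)?"
    r"(?P<comma>,)?"
    r"(?:\.(?P<precision>\d+))?"
    r"(?P<type>[eEfFgGn%])?"
)


def format_complex(z, spec):
    """Hypothetical sketch of the rules proposed in this thread."""
    m = _SPEC.fullmatch(spec)
    if m is None:
        raise ValueError("invalid format specifier: %r" % spec)
    if m["zero"] or m["align"] == "=":
        raise ValueError("zero padding and '=' alignment are not allowed")
    if m["type"] == "%":
        raise ValueError("'%' is not allowed for complex numbers")

    # The part of the spec shared by both components: no fill, align or width.
    precision = "." + m["precision"] if m["precision"] else ""
    tail = (m["comma"] or "") + precision + (m["type"] or "g")

    real = format(z.real, (m["sign"] or "") + tail)  # user's sign option
    imag = format(z.imag, "+" + tail)                # sign is always forced
    body = real + imag + "j"

    # Fill, alignment and width apply once, to the whole string.
    outer = (m["fill"] or "") + (m["align"] or ">") + (m["width"] or "")
    return format(body, outer)


print(format_complex(complex(1, -2 / 3), ".5f"))  # 1.00000-0.66667j
print(format_complex(123 + 4j, "*^20g"))          # *******123+4j*******
```

Raising ValueError for the disabled options, rather than ignoring them, leaves room to give them a meaning later without changing the behaviour of working code.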

mdickinson · 2009-04-28T09:39:38Z

Two things that haven't come up so far:

(1) What about parentheses? The current complex repr and str have
parentheses in them, for reasons that I still don't really understand.

I'd suggest leaving them out altogether; except that I have
the impression (perhaps wrongly) that an empty type code is
supposed to correspond to str. And given that I don't understand
why the parens were in there in the first place, I'm probably
not a good person to judge whether they should stay in a
formatted complex number.

(2) What about zeros? The current repr and str leave out the real
part (and the enclosing parens) if it's equal to zero. Should
format do the same? I'd say not, except possibly again in the
case where there's no type code.

ericvsmith · 2009-04-28T10:01:57Z

Mark Dickinson wrote:

> (1) What about parentheses? The current complex repr and str have
> parentheses in them, for reasons that I still don't really understand.
>
> I'd suggest leaving them out altogether; except that I have
> the impression (perhaps wrongly) that an empty type code is
> supposed to correspond to str. And given that I don't understand
> why the parens were in there in the first place, I'm probably
> not a good person to judge whether they should stay in a
> formatted complex number.

The rule is that x.__format__('') is equivalent to str(x). All
of the built-in objects have a test for a zero-length format string and
delegate to str(). But (3).__format__('-') does not call str(), despite
the fact that it's the identical output. That's because the format
string isn't zero-length. Instead, this is the case of the missing
format "presentation type".

I couldn't find a case with any built-in objects where this really makes
a difference (although I can't say I spent a lot of time at it). Complex
would be the first one. But that doesn't really bother me.

format(1+1j, '') -> '(1+1j)'
format(1+1j, '-') -> '1+1j'

Although I guess if we wanted to, we could say that the empty
presentation type is equivalent to 'g', but gives you parens. This would
fit in nicely with bpo-5858, if it's accepted. Floats do something
similar and special case the empty presentation type: '' is like 'g',
but with at least one digit after the decimal point.

> (2) What about zeros? The current repr and str leave out the real
> part (and the enclosing parens) if it's equal to zero. Should
> format do the same? I'd say not, except possibly again in the
> case where there's no type code.

I agree. Again, we could say that the empty presentation type is
different in this regard.
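
As a point of comparison, the following snippet shows how the empty format string and an explicit presentation type differ for complex numbers. It reflects the behaviour of recent CPython 3 releases as far as I know, which is an assumption relative to the patch being discussed here.

```python
z = 1 + 1j

empty = format(z, "")         # delegates to str(z): parentheses kept
typed = format(z, "g")        # explicit type: no parentheses
zero_real = format(1j, "")    # str() drops a zero real part
zero_typed = format(1j, ".2f")  # explicit type keeps the real part

print(empty, typed, zero_real, zero_typed)
```

With an explicit type code the real part is always shown and no parentheses are added, while the empty specifier matches str().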

mdickinson · 2009-04-28T10:26:39Z

> Complex would be the first one. But that doesn't really bother me.

It bothers me a little. I see '' as a special case of the empty
presentation type, even if that's not what a strict reading of
PEP-3101 says, so I expect '', '>', '<20' all to format the
number in the same way, and only differ in their treatment of
alignment and padding. That is, adding a '>' to the start of a
format specifier shouldn't change the formatting of the number
itself. So from this perspective, it seems better if format(x, '')
ends up doing the same thing as str(x) as a result of the
choices made for the empty presentation type, rather than
as a result of special-casing ''.

> Although I guess if we wanted to, we could say that the empty
> presentation type is equivalent to 'g', but gives you parens.

This works for me.

[about suppressing real zeros...]

> Again, we could say that the empty presentation type is
> different in this regard.

Makes sense. Does treating the empty presentation type as special this
way add much extra complication to the implementation?

ericvsmith · 2009-04-28T10:34:09Z

Mark Dickinson wrote:

> > Although I guess if we wanted to, we could say that the empty
> > presentation type is equivalent to 'g', but gives you parens.
>
> This works for me.

Me, too.

> [about suppressing real zeros...]
> > Again, we could say that the empty presentation type is
> > different in this regard.
>
> Makes sense. Does treating the empty presentation type as special this
> way add much extra complication to the implementation?

No. I'm basically finished with it. Before I check it in, I'll attach a
patch (against trunk) so you can look at how it works.

ericvsmith · 2009-04-28T10:56:11Z

See the attached patch. Comments welcome.

I'm not sure I'm doing the right thing with 'g' and appending zeros:
>>> format(3+4j, '.2')
'(3+4j)'
>>> format(3+4j, '.2g')
'3.0+4.0j'
>>> format(3+0j, '.2')
'(3+0j)'
>>> format(3+0j, '.2g')
'3.0+0.0j'
>>> format(1j, '.2')
'(1j)'
>>> format(1j, '.2g')
'0.0+1.0j'

mdickinson · 2009-04-28T11:06:39Z

I'll take a look.

The trailing zeros thing is heavily bound up with bpo-5858, of course;
I think we need a decision on that, one way or the other. One problem
is that I'm only proposing the bpo-5858 change for py3k, not trunk.

ericvsmith · 2009-04-28T11:15:43Z

Mark Dickinson wrote:

> The trailing zeros thing is heavily bound up with bpo-5858, of course;
> I think we need a decision on that, one way or the other. One problem
> is that I'm only proposing the bpo-5858 change for py3k, not trunk.

I don't have a problem with trunk's complex.__format__ not agreeing with
trunk's complex.__str__. If it's a big deal, we can just take
complex.__format__ completely out of trunk with a #define. In any event,
there's lots of time before 2.7 and not so much before 3.1, so let's
concentrate on py3k. Which is what I should have done when starting
this issue (but forward porting is easier for me than back porting).

ericvsmith · 2009-04-28T11:36:39Z

Here's a patch against py3k, with one slight change with non-empty
presentation types.

mdickinson · 2009-04-28T12:49:04Z

With your patch, I'm getting quite strange results when using alignment
specifiers:

>>> z = 123+4j
>>> format(z, '=20')
'(                 123+                  4j)'
>>> format(z, '^20')
'(        123                  +4         j)'
>>> format(z, '<20')
'(123                 +4                  j)'
>>> len(format(z, '<20'))
43

Is this intentional? I was expecting to get strings of length 20,
with the substring '(123+4j)' positioned either in the middle
or on the left or right.

ericvsmith · 2009-04-28T12:51:48Z

...

> Is this intentional? I was expecting to get strings of length 20,
> with the substring '(123+4j)' positioned either in the middle
> or on the left or right.

No, not intentional. I'll fix it and add some tests. Thanks.
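
The expected padding behaviour can be checked with a short loop. The outputs depend on the interpreter; on recent CPython 3 releases (an assumption, not the patched build discussed above) each result is exactly 20 characters wide with '(123+4j)' placed as a single unit.

```python
z = 123 + 4j

results = {}
for spec in ("<20", "^20", ">20"):
    out = format(z, spec)
    results[spec] = out
    # Show the padded string and its total width.
    print(repr(out), len(out))
```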

ericvsmith · 2009-04-28T17:33:45Z

This is a patch against py3k, including tests in test_complex.py. It
should deal with the padding, but let me know.

ericvsmith · 2009-04-28T20:37:58Z

I also propose to disallow the '=' alignment flag. It means put the
padding between the sign and the digits, and since there are 2 signs,
it's not clear what this would mean.

Remember, by using .real and .imag, you can achieve this level of
control in the formatting, anyway.
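
A sketch of that fallback: '=' stays valid for each float part on its own, so sign-aware padding is still available by formatting .real and .imag separately. The values below are arbitrary, and whether the built-in call raises depends on the interpreter version, so the snippet reports what happens rather than assuming it.

```python
z = complex(-1.5, 2.25)

# Built-in complex formatting with '=': proposed above to be rejected.
try:
    builtin = format(z, "=20.2f")
except ValueError as exc:
    builtin = "ValueError: {}".format(exc)
print(builtin)

# Manual control: pad between the sign and the digits of each part.
manual = "{0.real:=+10.2f}{0.imag:=+10.2f}j".format(z)
print(repr(manual))
```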

ericvsmith · 2009-04-28T22:47:34Z

I think these patches are complete. One for py3k, one for trunk. If no
complaints, I'll apply them before this weekend's py3k beta.

mdickinson · 2009-04-29T22:05:36Z

With these patches, all tests pass for me both for py3k and trunk.

mdickinson · 2009-04-29T22:11:37Z

I haven't done as thorough a review as I'd like, but both
patches look good to me. I recommend applying them.

ericvsmith · 2009-04-30T01:01:39Z

Thanks, Mark.

I'm not so worried about the code, but more so the tests. As far as the
code goes, it's really a combination of float and string formatting. I
copied the float formatting and refactored the string formatting so I
could reuse it.

But of course, another set of eyes to review it is always welcome. If
you find anything, I'll fix it. I'm closing this issue.

Committed in trunk in r72137, and in py3k in r72140.

mdickinson · 2009-05-02T18:34:48Z

One comment on the new complex formatting. I now get (in py3k)

>>> from math import pi, e
>>> format(complex(pi,e), '<')
'(3.14159+2.71828j)'
>>> format(complex(pi,e), '')
'(3.14159265359+2.71828182846j)'

I understand why this is happening, but again I think that alignment
flags shouldn't change the form of the number itself. Would it be
reasonable to have the empty format code always use a precision of 12?

ericvsmith · 2009-05-02T19:33:23Z

Is this suggestion for all types, or just complex? Because float has the
same issue.

>>> format(pi, '')
'3.14159265359'
[38243 refs]
>>> format(pi, '>')
'3.14159'

mdickinson · 2009-05-02T19:37:45Z

Hmm. That also seems wrong to me. So I guess it's a suggestion
for float as well, which means it's not specific to this issue.
Should I open a separate feature request?

ericvsmith · 2009-05-02T19:57:52Z

> Hmm. That also seems wrong to me. So I guess it's a suggestion
> for float as well, which means it's not specific to this issue.
> Should I open a separate feature request?

Yes, this is a separate issue. It comes from PEP-3101's specification of
"like 'g' but different" for floats with no specified presentation type.

mdickinson · 2009-05-04T11:39:04Z

> Yes, this is a separate issue.

Thanks. See bpo-5920.
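
For anyone checking this later: on recent CPython 3 releases, as far as I know, adding an alignment flag no longer changes the digits produced for floats or complex numbers. Whether that is a direct result of bpo-5920 is an assumption on my part; the snippet only compares the outputs.

```python
from math import pi, e

z = complex(pi, e)

# Empty specifier versus alignment-only specifier, for float and complex.
float_pair = (format(pi, ""), format(pi, ">"))
complex_pair = (format(z, ""), format(z, "<"))

print(float_pair)
print(complex_pair)
```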

mark-summerfield mannequin added the type-bug and interpreter-core labels Dec 11, 2007

ericvsmith added the type-feature label and removed the type-bug label Apr 27, 2009

ericvsmith self-assigned this Apr 27, 2009

ericvsmith closed this as completed Apr 30, 2009

ezio-melotti transferred this issue from another repository Apr 10, 2022
