complex constructor doesn't accept string with nan and inf #46374

tiran · 2008-02-15T08:15:16Z

BPO	2121
Nosy	@rhettinger, @mdickinson, @tiran
Files	nan_parse.patch

^{Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.}

Show more details

GitHub fields:

assignee = 'https://github.com/mdickinson'
closed_at = <Date 2009-01-20.21:40:12.738>
created_at = <Date 2008-02-15.08:15:16.030>
labels = ['interpreter-core', 'type-feature']
title = "complex constructor doesn't accept string with nan and inf"
updated_at = <Date 2009-01-20.21:40:12.736>
user = 'https://github.com/tiran'

bugs.python.org fields:

activity = <Date 2009-01-20.21:40:12.736>
actor = 'mark.dickinson'
assignee = 'mark.dickinson'
closed = True
closed_date = <Date 2009-01-20.21:40:12.738>
closer = 'mark.dickinson'
components = ['Interpreter Core']
creation = <Date 2008-02-15.08:15:16.030>
creator = 'christian.heimes'
dependencies = []
files = ['12504']
hgrepos = []
issue_num = 2121
keywords = ['patch']
message_count = 21.0
messages = ['62422', '63169', '78592', '78593', '78594', '78595', '78596', '78597', '78598', '78824', '78827', '78833', '79250', '79251', '79257', '79261', '79266', '79268', '79272', '79273', '80284']
nosy_count = 4.0
nosy_names = ['rhettinger', 'mark.dickinson', 'christian.heimes', 'cdavid']
pr_nums = []
priority = 'low'
resolution = 'wont fix'
stage = None
status = 'closed'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue2121'
versions = ['Python 3.1', 'Python 2.7']

tiran · 2008-02-15T08:15:15Z

This is a reminder.

The issue makes the complex(repr(...)) round-trip impossible when either
the real or imag part is nan or infinite.

mdickinson · 2008-03-01T21:38:48Z

Signed zeros cause difficulties with round-tripping, too:

>>> z = complex(-0.0, -0.0); w = complex(repr(z))
>>> z
-0j
>>> w
-0j
>>> z.real
-0.0
>>> w.real
0.0

cdavid · 2008-12-31T07:41:50Z

I started a patch against the trunk to handle nan/inf/infinite (I have
not yet tackled the problem of negative zero).

The patch is a bit big, because I found the function quite difficult to
follow, so I refactored it a bit first (replacing the state machine with
the big switch with a sequential parsing). One potential problem is that
I do some computation with inf to make signed infinite, I don't know if
it is safe. I don't know is setting a variable to Nan is safe either.

Although I tested the function in a simple main under valgrind for
various legit and bogus input, there is no tests in this patch. I would
like to add some, but I am not familiar with python testing, so I would
be glad to receive some indications on how to do it properly.

cdavid · 2008-12-31T07:53:18Z

Of course, I notice two bugs just after sending the patch... New patch
to fix those two issues (no check for closing bracket if opening ones
are there and a bug when only imaginary part is given).

rhettinger · 2008-12-31T09:05:25Z

-1 on this feature request. IMO, it adds no value to any real world
applications and isn't worth the added code complexity, the on-going
maintenance burden, and the risk of introducing new bugs.

cdavid · 2008-12-31T09:14:00Z

I disagree the feature is not needed, for several reasons:

complex(repr(..)) roundtrip should work, whatever the value of complex is
it is supported for float, so why not for complex ?
I believe on the contrary it solves a very real problem: incidently,
I got interested in this patch while working on numpy, and it is
certainly useful to be able to parse nan and inf (for example when we
create arrays from strings). Nan may be seen as non useful for so called
real usage of python, but for scientific people, it is a crucial to have
proper support of nan (which may mean missing data depending on the
context) and inf.
it does not add complexity: I would argue that independantly of
nan/inf support, my patch makes the function simpler to follow (no state
machine with done/sw_error/etc... state).

rhettinger · 2008-12-31T10:05:15Z

complex(repr(..)) roundtrip should work,

Nice-to-have but not a requirement that the entire input domain be
supported.

it is supported for float, so why not for complex ?

It made sense for floats because of prevalence of use cases and because
we wanted to match IEEE-754R as much as possible. Unfortunately, the
support for Infs and NaNs has already negatively impacted the float
world by making the math module harder to maintain and test. I would
not like to see that extended to cmath or complex() unless compelling
real-world use cases arise.

... for scientific people, it is a crucial to have proper support
of nan (which may mean missing data depending on the context) and inf.

Mark, does Inf have a standard interpretation for complex numbers? Do
all infinities meet or do they radiate, each with their own phase angle?

Also, do you want to stick with the 754 interpretation of NaNs as the
result of invalid operations or are you comfortable with the
MatLab/Octave notion of using NaNs to indicate missing values (something
they do as an optimization because it is the only way to flag a
non-numeric value in a floating point register or C double)?

it does not add complexity:

That depends on whether handling of NaNs and Infs creeps into cmath.

Mainly, I'm just questioning whether there exist compelling use cases
for parsing NaNs and Infs in the context of complex numbers.

cdavid · 2008-12-31T10:38:23Z

Nice-to-have but not a requirement that the entire input domain be
supported.

Ok.

It made sense for floats because of prevalence of use cases and
because we wanted to match IEEE-754R as much as possible.

But why shouldn't this apply to complex numbers ? I am biased because I
mainly use python for scientific work, but complex numbers are not more
special than float in my mind.

Unfortunately, the support for Infs and NaNs has already negatively
impacted the float world by making the math module harder to maintain
and test.

Yes, it is difficult to handle nan and inf, there are a lot of corner
cases. But I fail to see how this applies here: my patch is essentially
a rewrote of the parsing, and the code to handle nan/inf is only 7
lines. This is independent of the handling of how nan/inf is handled by
math operations on it.

Mark, does Inf have a standard interpretation for complex numbers?
Do all infinities meet or do they radiate, each with their own phase >
angle?

Hm, not sure what you mean here by standard interpretation, but
infinities "do not meet", in the sense that (x,inf) and (inf,x) for
example can never been 'near' from each other (in the distance sense),
except if x is inf. It does not make more sense to talk about phase of
complex numbers with inf than for 0. But again, I don't see how this is
relevant to the proposed feature.

Mainly, I'm just questioning whether there exist compelling use cases
for parsing NaNs and Infs in the context of complex numbers.

For any task where parsing complex makes sense. Since many numerical
operations may end up with nan or inf, this is relatively common I would
say.

It this can make the review easier, I can split the patch in two: first
the refactoring (+ tests once someone tells me how to), and then
inf/nan handling (with additional tests).

cdavid · 2008-12-31T11:34:53Z

Ok, I found out how to make tests, and I found some problems while using
this on numpy. A third version of the patch, with unit tests: all tests
in test_complex.py pass.

mdickinson · 2009-01-02T15:57:20Z

[Raymond]

I would
not like to see that extended to cmath or complex() unless compelling
real-world use cases arise.

Hmm. Have you looked at the cmath module recently? You may be in for a
nasty surprise...

Mark, does Inf have a standard interpretation for complex numbers? Do
all infinities meet or do they radiate, each with their own phase
angle?

Shrug. Mathematically, by far the most common and useful model is the
complex plane plus a single extra point at infinity. But when complexes
are implemented as pairs of floats things look a little strange. Kahan,
in his 'branch cuts' paper identifies 9 'different' complex infinities
in this model, and C99 Annex G specifies exactly how functions should
behave for various different infinities.

It's never really been clear to me how useful it is to be able to
distinguish these infinities. But the cmath module *does* make an
effort to return the 'correct' (according to C99 Annex G) values for
inputs with a special real or imaginary component. On balance, I'd
support making complex('nan + nan*j') do the right thing.

(Though I can't help feeling that the repr of complex(nan, nan)
should have been 'nan + nan*1j' rather than the current 'nan + nan*j'.)

Also, do you want to stick with the 754 interpretation of NaNs as the
result of invalid operations

I've never much liked the use of NaN to represent missing values, and
I don't think Python should embrace that usage.

mdickinson · 2009-01-02T16:08:18Z

cdavid, in your application, how hard is it to work around the problem by
simply printing and parsing pairs of floats rather than complex numbers?

E.g.,

<get real part and imaginary part from string>
z = complex(float(real_part), float(imag_part))

cdavid · 2009-01-02T16:26:11Z

It is not really for an application, but for numpy. Of course, one can
always get around the problem - but since this is really a limitation
which can be easily fixed, why not fixing the problem :) ?

mdickinson · 2009-01-06T11:56:27Z

[Mark]

inputs with a special real or imaginary component. On balance, I'd
support making complex('nan + nan*j') do the right thing.

Having thought about this a bit more, I take this back. Some reasons are
given below.

[David]

complex(repr(..)) roundtrip should work, whatever the value of complex
is

I disagree. *Why* do you think it should work? It fails for many other
types:

>>> def roundtrip(x): return type(x)(repr(x))
... 
>>> roundtrip("abc")
"'abc'"
>>> roundtrip([1,2,3])
['[', '1', ',', ' ', '2', ',', ' ', '3', ']']
>>> roundtrip((1,))
('(', '1', ',', ')')
>>> roundtrip(Fraction(1, 2))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in roundtrip
  File "/Users/dickinsm/python_source/trunk/Lib/fractions.py", line 73, in 
__new__
    raise ValueError('Invalid literal for Fraction: %r' % input)
ValueError: Invalid literal for Fraction: 'Fraction(1, 2)'
>>> roundtrip(Decimal(1))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in roundtrip
  File "/Users/dickinsm/python_source/trunk/Lib/decimal.py", line 545, in 
__new__
    "Invalid literal for Decimal: %r" % value)
  File "/Users/dickinsm/python_source/trunk/Lib/decimal.py", line 3721, in 
_raise_error
    raise error(explanation)
decimal.InvalidOperation: Invalid literal for Decimal: "Decimal('1')"

In general, type(repr(x)) == x only works for simple types (ints, floats),
not compound types like lists, strings, tuples, ... I think it's
reasonable to regard complex as such a compound type.

There *is* a general Python guideline that eval(repr(x)) == x should work
where possible. Note that this fails even for floats:

>>> eval(repr(float('nan')))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<string>", line 1, in <module>
NameError: name 'nan' is not defined

The *real* problem here is that repr(complex) is a mess when it comes to
niceties like signed zeros, infinities and nans:

>>> complex(-0., 0.)  # throws away sign of 0 on real part
0j
>>> eval(repr(complex(2., -0.)))  # throws away sign on imag. part
(2+0j)
>>> nan = float('nan')
>>> complex(0, nan)
nan*j
>>> nan*j  # can't even eval this...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'j' is not defined
>>> nan*1j # not much better: real part should be 0, not nan
(nan+nan*j)
>>> inf = float('inf')
>>> inf*1j  # ouch: nan in real part!
(nan+inf*j)

I think the *right* solution is to define repr of a complex number
z to be:

"complex(%r, %r)" % (z.real, z.imag)

but that's not possible before the mythical, compatibility-breaking,
Python 4.0. And even then your desired complex(repr(z)) roundtripping
wouldn't work, though eval(repr(z)) would.

While we're daydreaming, another possibility is to follow C99's lead, and
introduce a type of 'imaginary' numbers, and make 1j be an imaginary
literal rather than a complex literal. That solves some of the above
eval(repr(.)) problems. Again, I don't see how this could happen before
4.0, and I'm not even sure that it's desirable.

In the circumstances, repr does the best it can, which is to output a
whole expression which more-or-less describes the given complex number,
and will usually eval back to the right thing. I think it's not the
business of the complex() constructor to parse such expressions.

To summarize: it's a mess. I'd recommend that you and/or numpy don't use
repr for roundtrip-capable conversions to string. Instead, output the
real and imaginary parts separately, and use float-from-string and the
complex-from-pair-of-floats constructors to reconstruct.

mdickinson · 2009-01-06T12:05:02Z

Christian, any comments? Is it okay to close this as a 'won't fix'?

cdavid · 2009-01-06T13:16:42Z

I disagree. *Why* do you think it should work? It fails for many other
types:

I don't understand the rationale: why not making something work better
if possible ? Also, I don't understand the comparison with Decimal or
Fraction; there is a big difference between making the roundtrip not
possible at all for a type, and making it possible only for some values.

I think the *right* solution is to define repr of a complex number
z to be:

Why ? I don't understand those examples: maybe I am missing your
argument, but I understand it as we should not fix one bug because there
are another bugs related to inf and nan in complex. For me, something like:

a = complex(0)
a *= float('inf') * 1j
-> a = nan+nanj

Is really not desirable. That's actually a more serious bug than this
one. Are you saying that python does not care about inf/nan for complex
numbers ? If so, be it, but it would a bit unfortunate.

tiran · 2009-01-06T13:30:07Z

Definitely! I don't see a reason the make the code more complex.

David, we are against the patch because we have to keep the code
maintainable. The code is already long and spreads over lots of lines.
Your use case isn't common enough to warrant even more complicated
parsing code.

mdickinson · 2009-01-06T15:44:29Z

As Christian says, it's about not increasing code complexity without a
good reason. For me, it's also about mental complexity: the set of valid
inputs to the complex constructor should be small, and should be easy to
understand and to describe.

The addition of the '*1j' part of the expression bothers me particularly,
as does allowing complex('j') to be valid.

I have a compromise proposal:

For nans and infs, get rid of the * on output: change repr so that
e.g., repr(complex(0, nan)) is simply 'nanj', and repr(complex(inf,-inf))
is '(inf-infj)'. (Yes, I know it looks ugly, but bear with me.)
On input, allow strings of the form "complex-string" or "(complex-
string)" where (in pseudo BNF):

sign ::= '+' | '-'
real-part ::= float-string
imag-part ::= float-string 'j'
complex-string ::= real-part [sign imag-part] | imag-part

and float-string is any string currently accepted by float
(excluding leading or trailing whitespace). Nothing else would be
permitted. (Well, okay, we still have to allow whitespace around the
complex string, both inside and outside the parentheses.)

I think this would simplify the parsing, and remove need for special
casing of nans and infs. It might even allow the code to become simpler,
by sharing some of it with stuff in floatobject.c.

The above would allow double signs: e.g. '2+-1j', with the first sign
being the one explicitly described above and the second being part of the
float-string. I don't think this is a terrible thing, if it simplifies
parsing. It might even be helpful. For example: "%r+%rj" % (z.real,
z.imag) would always be valid input.

The current 'nan*j' is certainly more attractive than 'nanj', but it's
just too suggestive of the actual expression float('nan')*1j, which it
turns out doesn't produce the same thing.

If you could come up with a patch that does something like this I'd take a
look.

By the way, "0 + inf*1j" giving nan + inf*j isn't really a bug; it's
pretty much unavoidable: 1j is really complex(0, 1), and when multiplying
by inf we get complex(inf*0, inf*1). With the usual IEEE 754 semantics
inf*0 is nan.

I think it's exactly to fix this sort of behaviour that C99 introduced its
imaginary type (see Annex G of the standard). It doesn't seem to have
caught on, though: I don't think gcc implements this, and I'd be very
surprised if Visual Studio does.

cdavid · 2009-01-06T16:05:00Z

@ Mark

Concerning float('inf') * 1j: you're right, my rambling did not make any
sense, sorry.

I agree that adding complexity may be a good reason to warrant an
arbitrary feature; actually, I did not manage to handle nan/inf at first
because of the complexity :) But as I mentioned in a previous comment, I
think my patch simplifies the parsing as well. I suggested to split the
patch in to: one whithout any new feature (no handling of nan/inf), and
then anoter handling nan/inf on the top of it. Would that be acceptable ?

Also, how should I proceed to share implementation between floatobject.c
and complexobject.c ? Would creating a new file for the common code be
acceptable ?

rhettinger · 2009-01-06T16:28:37Z

David, at this point, I recommend dropping this one. It has become a
time waster. Possibly, there is a refactoring of the code that makes
parsing a bit simpler or shares some common code but it is not worth
destabilizing the battle-tested code and risking the introduction of
subtle new parsing errors.

rhettinger · 2009-01-06T16:35:43Z

Mark, I'm somewhat uncomfortable with your proposal also. It changes
what the parser will accept and the potential benefits are almost nil.
Also, putting NaNs and Infs in complex numbers is probably not something
that should be pursued. I see no utility in pairs like (NaN, 3.0) or
(-2.0, Inf). The whole request is use case challenged. AFAICT, the
current implementation suffice for real cases and there is no compelling
need to rearrange and redesign this code.

mdickinson · 2009-01-20T21:40:13Z

Closing this one as won't fix.

tiran added the interpreter-core (Objects, Python, Grammar, and Parser dirs) label Feb 15, 2008

rhettinger added the type-feature A feature request or enhancement label Dec 31, 2008

rhettinger assigned mdickinson Dec 31, 2008

mdickinson closed this as completed Jan 20, 2009

ezio-melotti transferred this issue from another repository Apr 10, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

complex constructor doesn't accept string with nan and inf #46374

complex constructor doesn't accept string with nan and inf #46374

tiran commented Feb 15, 2008

tiran commented Feb 15, 2008

mdickinson commented Mar 1, 2008

cdavid mannequin commented Dec 31, 2008

cdavid mannequin commented Dec 31, 2008

rhettinger commented Dec 31, 2008

cdavid mannequin commented Dec 31, 2008

rhettinger commented Dec 31, 2008

cdavid mannequin commented Dec 31, 2008

cdavid mannequin commented Dec 31, 2008

mdickinson commented Jan 2, 2009

mdickinson commented Jan 2, 2009

cdavid mannequin commented Jan 2, 2009

mdickinson commented Jan 6, 2009

mdickinson commented Jan 6, 2009

cdavid mannequin commented Jan 6, 2009

tiran commented Jan 6, 2009

mdickinson commented Jan 6, 2009

cdavid mannequin commented Jan 6, 2009

rhettinger commented Jan 6, 2009

rhettinger commented Jan 6, 2009

mdickinson commented Jan 20, 2009

complex constructor doesn't accept string with nan and inf #46374

complex constructor doesn't accept string with nan and inf #46374

Comments

tiran commented Feb 15, 2008

tiran commented Feb 15, 2008

mdickinson commented Mar 1, 2008

cdavid mannequin commented Dec 31, 2008

cdavid mannequin commented Dec 31, 2008

rhettinger commented Dec 31, 2008

cdavid mannequin commented Dec 31, 2008

rhettinger commented Dec 31, 2008

cdavid mannequin commented Dec 31, 2008

cdavid mannequin commented Dec 31, 2008

mdickinson commented Jan 2, 2009

mdickinson commented Jan 2, 2009

cdavid mannequin commented Jan 2, 2009

mdickinson commented Jan 6, 2009

mdickinson commented Jan 6, 2009

cdavid mannequin commented Jan 6, 2009

tiran commented Jan 6, 2009

mdickinson commented Jan 6, 2009

cdavid mannequin commented Jan 6, 2009

rhettinger commented Jan 6, 2009

rhettinger commented Jan 6, 2009

mdickinson commented Jan 20, 2009