Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

complex constructor doesn't accept string with nan and inf #46374

Closed
tiran opened this issue Feb 15, 2008 · 21 comments
Closed

complex constructor doesn't accept string with nan and inf #46374

tiran opened this issue Feb 15, 2008 · 21 comments
Assignees
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement

Comments

@tiran
Copy link
Member

tiran commented Feb 15, 2008

BPO 2121
Nosy @rhettinger, @mdickinson, @tiran
Files
  • nan_parse.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/mdickinson'
    closed_at = <Date 2009-01-20.21:40:12.738>
    created_at = <Date 2008-02-15.08:15:16.030>
    labels = ['interpreter-core', 'type-feature']
    title = "complex constructor doesn't accept string with nan and inf"
    updated_at = <Date 2009-01-20.21:40:12.736>
    user = 'https://github.com/tiran'

    bugs.python.org fields:

    activity = <Date 2009-01-20.21:40:12.736>
    actor = 'mark.dickinson'
    assignee = 'mark.dickinson'
    closed = True
    closed_date = <Date 2009-01-20.21:40:12.738>
    closer = 'mark.dickinson'
    components = ['Interpreter Core']
    creation = <Date 2008-02-15.08:15:16.030>
    creator = 'christian.heimes'
    dependencies = []
    files = ['12504']
    hgrepos = []
    issue_num = 2121
    keywords = ['patch']
    message_count = 21.0
    messages = ['62422', '63169', '78592', '78593', '78594', '78595', '78596', '78597', '78598', '78824', '78827', '78833', '79250', '79251', '79257', '79261', '79266', '79268', '79272', '79273', '80284']
    nosy_count = 4.0
    nosy_names = ['rhettinger', 'mark.dickinson', 'christian.heimes', 'cdavid']
    pr_nums = []
    priority = 'low'
    resolution = 'wont fix'
    stage = None
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue2121'
    versions = ['Python 3.1', 'Python 2.7']

    @tiran
    Copy link
    Member Author

    tiran commented Feb 15, 2008

    This is a reminder.

    The issue makes the complex(repr(...)) round-trip impossible when either
    the real or imag part is nan or infinite.

    @tiran tiran added the interpreter-core (Objects, Python, Grammar, and Parser dirs) label Feb 15, 2008
    @mdickinson
    Copy link
    Member

    Signed zeros cause difficulties with round-tripping, too:

    >>> z = complex(-0.0, -0.0); w = complex(repr(z))
    >>> z
    -0j
    >>> w
    -0j
    >>> z.real
    -0.0
    >>> w.real
    0.0

    @cdavid
    Copy link
    Mannequin

    cdavid mannequin commented Dec 31, 2008

    I started a patch against the trunk to handle nan/inf/infinite (I have
    not yet tackled the problem of negative zero).

    The patch is a bit big, because I found the function quite difficult to
    follow, so I refactored it a bit first (replacing the state machine with
    the big switch with a sequential parsing). One potential problem is that
    I do some computation with inf to make signed infinite, I don't know if
    it is safe. I don't know is setting a variable to Nan is safe either.

    Although I tested the function in a simple main under valgrind for
    various legit and bogus input, there is no tests in this patch. I would
    like to add some, but I am not familiar with python testing, so I would
    be glad to receive some indications on how to do it properly.

    @cdavid
    Copy link
    Mannequin

    cdavid mannequin commented Dec 31, 2008

    Of course, I notice two bugs just after sending the patch... New patch
    to fix those two issues (no check for closing bracket if opening ones
    are there and a bug when only imaginary part is given).

    @rhettinger
    Copy link
    Contributor

    -1 on this feature request. IMO, it adds no value to any real world
    applications and isn't worth the added code complexity, the on-going
    maintenance burden, and the risk of introducing new bugs.

    @rhettinger rhettinger added the type-feature A feature request or enhancement label Dec 31, 2008
    @cdavid
    Copy link
    Mannequin

    cdavid mannequin commented Dec 31, 2008

    I disagree the feature is not needed, for several reasons:

    • complex(repr(..)) roundtrip should work, whatever the value of complex is
    • it is supported for float, so why not for complex ?
    • I believe on the contrary it solves a very real problem: incidently,
      I got interested in this patch while working on numpy, and it is
      certainly useful to be able to parse nan and inf (for example when we
      create arrays from strings). Nan may be seen as non useful for so called
      real usage of python, but for scientific people, it is a crucial to have
      proper support of nan (which may mean missing data depending on the
      context) and inf.
    • it does not add complexity: I would argue that independantly of
      nan/inf support, my patch makes the function simpler to follow (no state
      machine with done/sw_error/etc... state).

    @rhettinger
    Copy link
    Contributor

    complex(repr(..)) roundtrip should work,

    Nice-to-have but not a requirement that the entire input domain be
    supported.

    it is supported for float, so why not for complex ?

    It made sense for floats because of prevalence of use cases and because
    we wanted to match IEEE-754R as much as possible. Unfortunately, the
    support for Infs and NaNs has already negatively impacted the float
    world by making the math module harder to maintain and test. I would
    not like to see that extended to cmath or complex() unless compelling
    real-world use cases arise.

    ... for scientific people, it is a crucial to have proper support
    of nan (which may mean missing data depending on the context) and inf.

    Mark, does Inf have a standard interpretation for complex numbers? Do
    all infinities meet or do they radiate, each with their own phase angle?

    Also, do you want to stick with the 754 interpretation of NaNs as the
    result of invalid operations or are you comfortable with the
    MatLab/Octave notion of using NaNs to indicate missing values (something
    they do as an optimization because it is the only way to flag a
    non-numeric value in a floating point register or C double)?

    it does not add complexity:

    That depends on whether handling of NaNs and Infs creeps into cmath.

    Mainly, I'm just questioning whether there exist compelling use cases
    for parsing NaNs and Infs in the context of complex numbers.

    @cdavid
    Copy link
    Mannequin

    cdavid mannequin commented Dec 31, 2008

    Nice-to-have but not a requirement that the entire input domain be
    supported.

    Ok.

    It made sense for floats because of prevalence of use cases and
    because we wanted to match IEEE-754R as much as possible.

    But why shouldn't this apply to complex numbers ? I am biased because I
    mainly use python for scientific work, but complex numbers are not more
    special than float in my mind.

    Unfortunately, the support for Infs and NaNs has already negatively
    impacted the float world by making the math module harder to maintain
    and test.

    Yes, it is difficult to handle nan and inf, there are a lot of corner
    cases. But I fail to see how this applies here: my patch is essentially
    a rewrote of the parsing, and the code to handle nan/inf is only 7
    lines. This is independent of the handling of how nan/inf is handled by
    math operations on it.

    Mark, does Inf have a standard interpretation for complex numbers?
    Do all infinities meet or do they radiate, each with their own phase >
    angle?

    Hm, not sure what you mean here by standard interpretation, but
    infinities "do not meet", in the sense that (x,inf) and (inf,x) for
    example can never been 'near' from each other (in the distance sense),
    except if x is inf. It does not make more sense to talk about phase of
    complex numbers with inf than for 0. But again, I don't see how this is
    relevant to the proposed feature.

    Mainly, I'm just questioning whether there exist compelling use cases
    for parsing NaNs and Infs in the context of complex numbers.

    For any task where parsing complex makes sense. Since many numerical
    operations may end up with nan or inf, this is relatively common I would
    say.

    It this can make the review easier, I can split the patch in two: first
    the refactoring (+ tests once someone tells me how to), and then
    inf/nan handling (with additional tests).

    @cdavid
    Copy link
    Mannequin

    cdavid mannequin commented Dec 31, 2008

    Ok, I found out how to make tests, and I found some problems while using
    this on numpy. A third version of the patch, with unit tests: all tests
    in test_complex.py pass.

    @mdickinson
    Copy link
    Member

    [Raymond]

    I would
    not like to see that extended to cmath or complex() unless compelling
    real-world use cases arise.

    Hmm. Have you looked at the cmath module recently? You may be in for a
    nasty surprise...

    Mark, does Inf have a standard interpretation for complex numbers? Do
    all infinities meet or do they radiate, each with their own phase
    angle?

    Shrug. Mathematically, by far the most common and useful model is the
    complex plane plus a single extra point at infinity. But when complexes
    are implemented as pairs of floats things look a little strange. Kahan,
    in his 'branch cuts' paper identifies 9 'different' complex infinities
    in this model, and C99 Annex G specifies exactly how functions should
    behave for various different infinities.

    It's never really been clear to me how useful it is to be able to
    distinguish these infinities. But the cmath module *does* make an
    effort to return the 'correct' (according to C99 Annex G) values for
    inputs with a special real or imaginary component. On balance, I'd
    support making complex('nan + nan*j') do the right thing.

    (Though I can't help feeling that the repr of complex(nan, nan)
    should have been 'nan + nan*1j' rather than the current 'nan + nan*j'.)

    Also, do you want to stick with the 754 interpretation of NaNs as the
    result of invalid operations

    I've never much liked the use of NaN to represent missing values, and
    I don't think Python should embrace that usage.

    @mdickinson
    Copy link
    Member

    cdavid, in your application, how hard is it to work around the problem by
    simply printing and parsing pairs of floats rather than complex numbers?

    E.g.,

    <get real part and imaginary part from string>
    z = complex(float(real_part), float(imag_part))

    @cdavid
    Copy link
    Mannequin

    cdavid mannequin commented Jan 2, 2009

    It is not really for an application, but for numpy. Of course, one can
    always get around the problem - but since this is really a limitation
    which can be easily fixed, why not fixing the problem :) ?

    @mdickinson
    Copy link
    Member

    [Mark]

    inputs with a special real or imaginary component. On balance, I'd
    support making complex('nan + nan*j') do the right thing.

    Having thought about this a bit more, I take this back. Some reasons are
    given below.

    [David]

    • complex(repr(..)) roundtrip should work, whatever the value of complex
      is

    I disagree. *Why* do you think it should work? It fails for many other
    types:

    >>> def roundtrip(x): return type(x)(repr(x))
    ... 
    >>> roundtrip("abc")
    "'abc'"
    >>> roundtrip([1,2,3])
    ['[', '1', ',', ' ', '2', ',', ' ', '3', ']']
    >>> roundtrip((1,))
    ('(', '1', ',', ')')
    >>> roundtrip(Fraction(1, 2))
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<stdin>", line 1, in roundtrip
      File "/Users/dickinsm/python_source/trunk/Lib/fractions.py", line 73, in 
    __new__
        raise ValueError('Invalid literal for Fraction: %r' % input)
    ValueError: Invalid literal for Fraction: 'Fraction(1, 2)'
    >>> roundtrip(Decimal(1))
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<stdin>", line 1, in roundtrip
      File "/Users/dickinsm/python_source/trunk/Lib/decimal.py", line 545, in 
    __new__
        "Invalid literal for Decimal: %r" % value)
      File "/Users/dickinsm/python_source/trunk/Lib/decimal.py", line 3721, in 
    _raise_error
        raise error(explanation)
    decimal.InvalidOperation: Invalid literal for Decimal: "Decimal('1')"

    In general, type(repr(x)) == x only works for simple types (ints, floats),
    not compound types like lists, strings, tuples, ... I think it's
    reasonable to regard complex as such a compound type.

    There *is* a general Python guideline that eval(repr(x)) == x should work
    where possible. Note that this fails even for floats:

    >>> eval(repr(float('nan')))
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<string>", line 1, in <module>
    NameError: name 'nan' is not defined

    The *real* problem here is that repr(complex) is a mess when it comes to
    niceties like signed zeros, infinities and nans:

    >>> complex(-0., 0.)  # throws away sign of 0 on real part
    0j
    >>> eval(repr(complex(2., -0.)))  # throws away sign on imag. part
    (2+0j)
    >>> nan = float('nan')
    >>> complex(0, nan)
    nan*j
    >>> nan*j  # can't even eval this...
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    NameError: name 'j' is not defined
    >>> nan*1j # not much better: real part should be 0, not nan
    (nan+nan*j)
    >>> inf = float('inf')
    >>> inf*1j  # ouch: nan in real part!
    (nan+inf*j)

    I think the *right* solution is to define repr of a complex number
    z to be:

    "complex(%r, %r)" % (z.real, z.imag)

    but that's not possible before the mythical, compatibility-breaking,
    Python 4.0. And even then your desired complex(repr(z)) roundtripping
    wouldn't work, though eval(repr(z)) would.

    While we're daydreaming, another possibility is to follow C99's lead, and
    introduce a type of 'imaginary' numbers, and make 1j be an imaginary
    literal rather than a complex literal. That solves some of the above
    eval(repr(.)) problems. Again, I don't see how this could happen before
    4.0, and I'm not even sure that it's desirable.

    In the circumstances, repr does the best it can, which is to output a
    whole expression which more-or-less describes the given complex number,
    and will usually eval back to the right thing. I think it's not the
    business of the complex() constructor to parse such expressions.

    To summarize: it's a mess. I'd recommend that you and/or numpy don't use
    repr for roundtrip-capable conversions to string. Instead, output the
    real and imaginary parts separately, and use float-from-string and the
    complex-from-pair-of-floats constructors to reconstruct.

    @mdickinson
    Copy link
    Member

    Christian, any comments? Is it okay to close this as a 'won't fix'?

    @cdavid
    Copy link
    Mannequin

    cdavid mannequin commented Jan 6, 2009

    I disagree. *Why* do you think it should work? It fails for many other
    types:

    I don't understand the rationale: why not making something work better
    if possible ? Also, I don't understand the comparison with Decimal or
    Fraction; there is a big difference between making the roundtrip not
    possible at all for a type, and making it possible only for some values.

    I think the *right* solution is to define repr of a complex number
    z to be:

    Why ? I don't understand those examples: maybe I am missing your
    argument, but I understand it as we should not fix one bug because there
    are another bugs related to inf and nan in complex. For me, something like:

    a = complex(0)
    a *= float('inf') * 1j
    -> a = nan+nanj

    Is really not desirable. That's actually a more serious bug than this
    one. Are you saying that python does not care about inf/nan for complex
    numbers ? If so, be it, but it would a bit unfortunate.

    @tiran
    Copy link
    Member Author

    tiran commented Jan 6, 2009

    Definitely! I don't see a reason the make the code more complex.

    David, we are against the patch because we have to keep the code
    maintainable. The code is already long and spreads over lots of lines.
    Your use case isn't common enough to warrant even more complicated
    parsing code.

    @mdickinson
    Copy link
    Member

    As Christian says, it's about not increasing code complexity without a
    good reason. For me, it's also about mental complexity: the set of valid
    inputs to the complex constructor should be small, and should be easy to
    understand and to describe.

    The addition of the '*1j' part of the expression bothers me particularly,
    as does allowing complex('j') to be valid.

    I have a compromise proposal:

    1. For nans and infs, get rid of the * on output: change repr so that
      e.g., repr(complex(0, nan)) is simply 'nanj', and repr(complex(inf,-inf))
      is '(inf-infj)'. (Yes, I know it looks ugly, but bear with me.)

    2. On input, allow strings of the form "complex-string" or "(complex-
      string)" where (in pseudo BNF):

    sign ::= '+' | '-'
    real-part ::= float-string
    imag-part ::= float-string 'j'
    complex-string ::= real-part [sign imag-part] | imag-part

    and float-string is any string currently accepted by float
    (excluding leading or trailing whitespace). Nothing else would be
    permitted. (Well, okay, we still have to allow whitespace around the
    complex string, both inside and outside the parentheses.)

    I think this would simplify the parsing, and remove need for special
    casing of nans and infs. It might even allow the code to become simpler,
    by sharing some of it with stuff in floatobject.c.

    The above would allow double signs: e.g. '2+-1j', with the first sign
    being the one explicitly described above and the second being part of the
    float-string. I don't think this is a terrible thing, if it simplifies
    parsing. It might even be helpful. For example: "%r+%rj" % (z.real,
    z.imag) would always be valid input.

    The current 'nan*j' is certainly more attractive than 'nanj', but it's
    just too suggestive of the actual expression float('nan')*1j, which it
    turns out doesn't produce the same thing.

    If you could come up with a patch that does something like this I'd take a
    look.

    By the way, "0 + inf*1j" giving nan + inf*j isn't really a bug; it's
    pretty much unavoidable: 1j is really complex(0, 1), and when multiplying
    by inf we get complex(inf*0, inf*1). With the usual IEEE 754 semantics
    inf*0 is nan.

    I think it's exactly to fix this sort of behaviour that C99 introduced its
    imaginary type (see Annex G of the standard). It doesn't seem to have
    caught on, though: I don't think gcc implements this, and I'd be very
    surprised if Visual Studio does.

    @cdavid
    Copy link
    Mannequin

    cdavid mannequin commented Jan 6, 2009

    @ Mark

    Concerning float('inf') * 1j: you're right, my rambling did not make any
    sense, sorry.

    I agree that adding complexity may be a good reason to warrant an
    arbitrary feature; actually, I did not manage to handle nan/inf at first
    because of the complexity :) But as I mentioned in a previous comment, I
    think my patch simplifies the parsing as well. I suggested to split the
    patch in to: one whithout any new feature (no handling of nan/inf), and
    then anoter handling nan/inf on the top of it. Would that be acceptable ?

    Also, how should I proceed to share implementation between floatobject.c
    and complexobject.c ? Would creating a new file for the common code be
    acceptable ?

    @rhettinger
    Copy link
    Contributor

    David, at this point, I recommend dropping this one. It has become a
    time waster. Possibly, there is a refactoring of the code that makes
    parsing a bit simpler or shares some common code but it is not worth
    destabilizing the battle-tested code and risking the introduction of
    subtle new parsing errors.

    @rhettinger
    Copy link
    Contributor

    Mark, I'm somewhat uncomfortable with your proposal also. It changes
    what the parser will accept and the potential benefits are almost nil.
    Also, putting NaNs and Infs in complex numbers is probably not something
    that should be pursued. I see no utility in pairs like (NaN, 3.0) or
    (-2.0, Inf). The whole request is use case challenged. AFAICT, the
    current implementation suffice for real cases and there is no compelling
    need to rearrange and redesign this code.

    @mdickinson
    Copy link
    Member

    Closing this one as won't fix.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    3 participants