msg113661 - (view) |
Author: Jervis Whitley (jdwhitley) |
Date: 2010-08-12 07:12 |
complex() raises ValueError when parsing a string argument that contains both a real and an imaginary part, where at least one of the two parts is a decimal.
To reproduce:
>>> complex("1.1 + 2.1j")
ValueError: complex() arg is a malformed string
>>> complex("2.1j")
2.1j
>>> complex("1.1 + 2j")
ValueError: complex() arg is a malformed string
>>> complex("1 + 2.1j")
ValueError: complex() arg is a malformed string
Expected results:
>>> complex("1.1 + 2.1j")
(1.1+2.1j)
>>> complex("2.1j")
2.1j
>>> complex("1.1 + 2j")
(1.1+2j)
>>> complex("1 + 2.1j")
(1+2.1j)
This affects all versions up to Python 3.1.2. I haven't tested any of the development builds.
Tests were conducted on a 32-bit Windows XP machine.
|
msg113664 - (view) |
Author: Mark Dickinson (mark.dickinson) * |
Date: 2010-08-12 07:52 |
The problem here is the spaces in the input string:
newton:~ dickinsm$ python2.7
Python 2.7 (r27:82500, Jul 13 2010, 14:10:05)
[GCC 4.2.1 (Apple Inc. build 5659)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> complex("1.1+2.1j")
(1.1+2.1j)
The current behaviour is by design, so I'm changing to feature request. It may make sense to consider allowing whitespace around the central '+' or '-', though this would mildly complicate the parsing.
I'd be +0 on this change.
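A minimal sketch of the point above, assuming a current CPython 3 interpreter. `parses` is a hypothetical helper introduced only for illustration; it compares the same digits with and without spaces around the central '+':

```python
def parses(text):
    """Return True if complex() accepts text, False if it raises ValueError."""
    try:
        complex(text)
    except ValueError:
        return False
    return True


# Same number, with and without whitespace around the central operator.
for candidate in ("1.1+2.1j", "1.1 + 2.1j", "1.1 +2.1j"):
    print(repr(candidate), parses(candidate))
```

Only the first candidate is expected to parse; the other two differ from it only by whitespace next to the '+'.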
|
msg113665 - (view) |
Author: Mark Dickinson (mark.dickinson) * |
Date: 2010-08-12 07:54 |
Note also that spaces are already allowed immediately inside the parentheses in a string argument to the complex constructor (in Python 2.6 and later):
>>> complex("( 1.1+2.1j )")
(1.1000000000000001+2.1000000000000001j)
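As a rough illustration of where whitespace is already tolerated, the following sketch assumes a current CPython 3 interpreter (the printed repr will be shorter than the Python 2.6 output shown above). The variable names are illustrative only:

```python
# Whitespace around the whole string and just inside the parentheses
# is accepted by complex().
accepted = [" 1.1+2.1j ", "( 1.1+2.1j )", " ( 1.1+2.1j ) "]
values = [complex(text) for text in accepted]
print(values)

# Whitespace around the central operator is still rejected,
# even inside parentheses.
try:
    complex("( 1.1 + 2.1j )")
    inner_spaces_ok = True
except ValueError:
    inner_spaces_ok = False
print("spaces around '+' accepted:", inner_spaces_ok)
```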
|
msg113676 - (view) |
Author: Mark Dickinson (mark.dickinson) * |
Date: 2010-08-12 12:47 |
I'm wondering whether the moratorium (PEP 3003) applies to this; from a close reading I'd say it does. At any rate, it seems like an inessential enhancement, so I'd be happy to delay this until 3.3.
|
msg113689 - (view) |
Author: Alexander Belopolsky (belopolsky) * |
Date: 2010-08-12 17:18 |
Current behavior is also consistent with that of fractions:
>>> Fraction("1/2")
Fraction(1, 2)
>>> Fraction("1 / 2")
Traceback (most recent call last):
..
ValueError: Invalid literal for Fraction: '1 / 2'
I am -1 on this RFE. At most, this can be clarified in the docs.
Allowing whitespace involves too much uncertainty. Would you allow any whitespace or just chr(0x20)? End-of-line or tab in complex numbers is most likely a typo and should be flagged. What about more exotic Unicode whitespace such as chr(0x00A0) or chr(0x2009)? Allow one or any number of whitespace characters?
Users who need a more powerful parser can use eval() or simply remove spaces from their strings before converting them to numbers.
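The "remove spaces first" workaround might look like the sketch below. `complex_ignoring_spaces` is a hypothetical name, and dropping every whitespace character is an assumption about what the caller wants, not something complex() does itself:

```python
def complex_ignoring_spaces(text):
    """Hypothetical helper: drop all whitespace, then let complex() parse."""
    # str.split() with no argument splits on any run of whitespace,
    # so joining the pieces removes spaces, tabs and newlines alike.
    return complex("".join(text.split()))


print(complex_ignoring_spaces("1.1 + 2.1j"))
print(complex_ignoring_spaces("1 + 2.1j"))
```

This is deliberately permissive: it also glues together inputs that are probably typos, for example "1 2j" is read as 12j. Callers who care about that should validate the string before stripping it.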
|
msg113692 - (view) |
Author: Alexander Belopolsky (belopolsky) * |
Date: 2010-08-12 17:36 |
I did some experimentation and found some inconsistency between int and complex:
>>> int('\xA11')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'utf8' codec can't decode byte 0xa1 in position 0: invalid start byte
but
>>> complex('\xA11')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: complex() arg is a malformed string
The int behavior is probably a bug that should be reported separately.
|
msg113693 - (view) |
Author: Mark Dickinson (mark.dickinson) * |
Date: 2010-08-12 17:50 |
I don't think determining *which* whitespace is allowed is a problem; just use whatever's already being used for the whitespace that's already allowed (around the whole complex input, for example, or between the optional parentheses and the number).
Please open a separate bug report for the UnicodeDecodeError. Though I have a suspicion/vague recollection that this has already come up somewhere in the tracker...
|
msg113721 - (view) |
Author: Jervis Whitley (jdwhitley) |
Date: 2010-08-12 23:08 |
It hadn't occurred to me to try this without spaces. Thank you for pointing this out. Agreed that the enhancement is not essential.
|
msg115161 - (view) |
Author: Mark Dickinson (mark.dickinson) * |
Date: 2010-08-28 17:38 |
Unassigning myself from this one, though I'll review a patch if anyone wants to write one.
After thinking about it a bit, I'm -0 on allowing the extra whitespace. The main issue for me is that it opens up a can of worms about what should and shouldn't be allowed. Which of the following should be allowed:
(a) complex("0.1 + 3j")
(b) complex("+ 3j")
(c) complex("+ 3")
(d) float("- 3")
(e) int("+ 3")
(f) complex("+4.0 + -5.0j")
Any patch would presumably allow (a). (b) looks like it *should* be allowed, too, but then by analogy so does (c). But for consistency, (d) and (e) would then have to be allowed, and I *really* don't want to go that far; in particular, there are rules about what's allowed as a floating-point string that are fairly consistently applied throughout Python (e.g., in the float, Decimal and Fraction constructors); these rules also agree with accepted standards (e.g., C99, IEEE 754), which clearly don't allow a space between the optional sign and the body of the float.
So unless anyone particularly wants to pursue this, I'd suggest closing as "won't fix".
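For reference, the six cases can be checked against the existing constructors with a short script. This is a sketch assuming a current CPython 3 interpreter; `accepted` and `results` are illustrative names:

```python
# Cases (a) to (f) from the list above: (constructor, input string).
cases = {
    "a": (complex, "0.1 + 3j"),
    "b": (complex, "+ 3j"),
    "c": (complex, "+ 3"),
    "d": (float, "- 3"),
    "e": (int, "+ 3"),
    "f": (complex, "+4.0 + -5.0j"),
}


def accepted(constructor, text):
    """Return True if constructor(text) succeeds, False on ValueError."""
    try:
        constructor(text)
    except ValueError:
        return False
    return True


results = {label: accepted(ctor, text) for label, (ctor, text) in cases.items()}
print(results)
```

With the behaviour described in this issue, every case is rejected, so the question is purely which of them a patch should start accepting.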
|
msg115163 - (view) |
Author: Mark Dickinson (mark.dickinson) * |
Date: 2010-08-28 17:54 |
If someone does want to produce a patch, here's the grammar that I suggest, in pseudo BNF form. This would be reasonably simple to implement, and is also simple to describe.
whitespace = ' ' | '\t' | '\n' | '\v' | '\f'  # include other non-ASCII whitespace?
binop = [whitespace] ('+' | '-') [whitespace]
imag_marker = 'j' | 'J'
complex_string = float_string binop float_string imag_marker
               | float_string imag_marker
               | float_string
padded_complex_string = [whitespace] complex_string [whitespace]
complex_constructor_input = padded_complex_string
                          | [whitespace] '(' padded_complex_string ')' [whitespace]
where float_string is any string that (a) doesn't contain leading or trailing whitespace, and (b) is accepted by the current float constructor.
This would allow (a) and (f) in the previous message, but not (b) or (c).
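A pure-Python sketch of this grammar is given below. It is not the proposed C patch; `parse_complex` and `_float_string` are hypothetical names, and the sketch reads `[whitespace]` as a run of zero or more of the listed ASCII characters, which is an interpretation of the pseudo BNF rather than something it states:

```python
_WS = " \t\n\v\f"  # the ASCII whitespace listed in the grammar


def _float_string(s):
    """Return float(s) if s is a float_string per the grammar, else None."""
    # (a) no leading or trailing whitespace, (b) accepted by float().
    if not s or s != s.strip():
        return None
    try:
        return float(s)
    except ValueError:
        return None


def parse_complex(text):
    """Parse text according to the suggested grammar (illustrative only)."""
    s = text.strip(_WS)
    if len(s) >= 2 and s[0] == "(" and s[-1] == ")":
        s = s[1:-1].strip(_WS)
    if s[-1:] in ("j", "J"):
        body = s[:-1]
        # Try each '+' or '-' after the first character as the central binop.
        # Signs inside exponents ("1e+5") fail because "1e" is not a float.
        for i in range(1, len(body)):
            if body[i] in "+-":
                real = _float_string(body[:i].rstrip(_WS))
                imag = _float_string(body[i + 1:].lstrip(_WS))
                if real is not None and imag is not None:
                    return complex(real, -imag if body[i] == "-" else imag)
        imag = _float_string(body)  # float_string imag_marker
        if imag is not None:
            return complex(0.0, imag)
    else:
        real = _float_string(s)  # bare float_string
        if real is not None:
            return complex(real, 0.0)
    raise ValueError("malformed complex string: %r" % (text,))


print(parse_complex("0.1 + 3j"))      # case (a)
print(parse_complex("+4.0 + -5.0j"))  # case (f)
```

Cases (b) "+ 3j" and (c) "+ 3" raise ValueError here, because float() itself rejects a space between the sign and the digits.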
|
msg115174 - (view) |
Author: Jervis Whitley (jdwhitley) |
Date: 2010-08-29 01:41 |
I can write a documentation patch for this:
http://docs.python.org/library/functions.html?highlight=complex#complex
to highlight the expected format of the string argument.
As others have pointed out here, there are a number of other options available to correctly parse the complex string argument:
* using eval where appropriate; and
* preprocessing to remove whitespace.
I think that the current options are sufficient that a patch to apply new behaviour isn't required.
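Where eval is considered too risky because it executes arbitrary code, the standard library's ast.literal_eval is one possible substitute. It is not mentioned in this thread, so treat the sketch below as an assumption about what fits a given use case; it accepts only Python literals, including the "number + imaginary" form:

```python
import ast

# literal_eval parses the text as a Python literal expression, so
# whitespace around the central '+' follows ordinary Python syntax.
value = ast.literal_eval("1.1 + 2.1j")
print(value)

# Arbitrary expressions are refused rather than executed.
try:
    ast.literal_eval("__import__('os').getcwd()")
    refused = False
except ValueError:
    refused = True
print("non-literal refused:", refused)
```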
|
msg115255 - (view) |
Author: Jervis Whitley (jdwhitley) |
Date: 2010-08-31 09:27 |
Here is a patch to document string argument requirements.
|
msg121895 - (view) |
Author: Mark Dickinson (mark.dickinson) * |
Date: 2010-11-21 10:10 |
Here's a patch (targeting 3.3) for allowing whitespace around the central binary operator; it implements the grammar suggested in msg115163.
|
msg155319 - (view) |
Author: Mark Dickinson (mark.dickinson) * |
Date: 2012-03-10 16:08 |
Reclassifying as a doc issue; I don't think my proposed change is worth it. I'll submit some form of Jervis's docfix shortly.
|
msg155320 - (view) |
Author: Roundup Robot (python-dev) |
Date: 2012-03-10 16:15 |
New changeset 5a3c89337b50 by Mark Dickinson in branch '2.7':
Closes #9574: Note that complex constructor doesn't allow whitespace around central operator.
http://hg.python.org/cpython/rev/5a3c89337b50
New changeset a5b073b1cfea by Mark Dickinson in branch '3.2':
Closes #9574: Note that complex constructor doesn't allow whitespace around central operator.
http://hg.python.org/cpython/rev/a5b073b1cfea
New changeset 2f48415e917c by Mark Dickinson in branch 'default':
merge 3.2 (#9574)
http://hg.python.org/cpython/rev/2f48415e917c
|
|
Date | User | Action | Args |
2022-04-11 14:57:05 | admin | set | github: 53783 |
2012-03-10 16:15:39 | python-dev | set | status: open -> closed; nosy: + python-dev; messages: + msg155320; resolution: fixed; stage: patch review -> resolved |
2012-03-10 16:08:24 | mark.dickinson | set | priority: normal -> low; messages: + msg155319; components: + Documentation, - Interpreter Core |
2011-10-20 13:44:12 | mark.dickinson | set | stage: needs patch -> patch review |
2010-11-21 10:10:23 | mark.dickinson | set | files: + issue9574.patch; assignee: mark.dickinson; messages: + msg121895 |
2010-08-31 09:27:33 | jdwhitley | set | files: + complex_doc.diff; keywords: + patch; messages: + msg115255 |
2010-08-29 01:41:21 | jdwhitley | set | messages: + msg115174 |
2010-08-28 20:37:58 | mark.dickinson | set | title: complex does not parse strings containing decimals -> allow whitespace around central '+' in complex constructor |
2010-08-28 17:54:49 | mark.dickinson | set | messages: + msg115163 |
2010-08-28 17:38:09 | mark.dickinson | set | assignee: mark.dickinson -> (no value); messages: + msg115161 |
2010-08-12 23:08:20 | jdwhitley | set | messages: + msg113721 |
2010-08-12 17:50:16 | mark.dickinson | set | messages: + msg113693 |
2010-08-12 17:36:14 | belopolsky | set | messages: + msg113692 |
2010-08-12 17:18:49 | belopolsky | set | nosy: + belopolsky; messages: + msg113689 |
2010-08-12 12:47:12 | mark.dickinson | set | messages: + msg113676; versions: + Python 3.3, - Python 3.2 |
2010-08-12 08:00:32 | mark.dickinson | set | stage: needs patch; versions: + Python 3.2, - Python 2.6, Python 2.5, Python 3.1, Python 2.7 |
2010-08-12 07:54:53 | mark.dickinson | set | messages: + msg113665 |
2010-08-12 07:52:24 | mark.dickinson | set | nosy: + mark.dickinson; messages: + msg113664; assignee: mark.dickinson; type: behavior -> enhancement |
2010-08-12 07:12:41 | jdwhitley | create | |