New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
allow whitespace around central '+' in complex constructor #53783
Comments
complex() raises ValueError when parsing a string argument containing both real and imaginary where one of the real or imaginary is a decimal. To reproduce: >>> complex("1.1 + 2.1j")
ValueError: complex() arg is a malformed string
>>> complex("2.1j")
2.1j
>>> complex("1.1 + 2j")
ValueError: complex() arg is a malformed string
>>> complex("1 + 2.1j")
ValueError: complex() arg is a malformed string Expected results: >>> complex("1.1 + 2.1j")
(1.1 + 2.1j)
>>> complex("2.1j")
2.1j
>>> complex("1.1 + 2j")
(1.1 + 2j)
>>> complex("1 + 2.1j")
(1 + 2.1j) This affects all versions up to Python 3.1.2. I haven't tested any of the development builds. Tests were conducted on a Windows XP 32 bit machine. |
The problem here is the spaces in the input string: newton:~ dickinsm$ python2.7
Python 2.7 (r27:82500, Jul 13 2010, 14:10:05)
[GCC 4.2.1 (Apple Inc. build 5659)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> complex("1.1+2.1j")
(1.1+2.1j) The current behaviour is by design, so I'm changing to feature request. It may make sense to consider allowing whitespace around the central '+' or '-', though this would mildly complicate the parsing. I'd be +0 on this change. |
Note also that spaces are already allowed immediately inside the parentheses in a string argument to the complex constructor (in python 2.6 and later): >>> complex("( 1.1+2.1j )")
(1.1000000000000001+2.1000000000000001j) |
I'm wondering whether the moratorium (PEP-3003) applies to this; from a close reading I'd say it does. At any rate, it seems like an inessential enhancement, so I'd be happy to delay this until 3.3. |
Current behavior is also consistent with that of fractions: >>> Fraction("1/2")
Fraction(1, 2)
>>> Fraction("1 / 2")
Traceback (most recent call last):
..
ValueError: Invalid literal for Fraction: '1 / 2' I am -1 on this RFE. At most, this can be clarified in the docs. Allowing whitespace involves too much uncertainly. Would you allow any white space or just chr(0x20)? End-of-line or tab in complex numbers is most likely a typo and should be flagged. What about more exotic unicode whitespace such as chr(0x00A0) or chr(0x2009)? Allow one or any number of whitespace characters? Users who need a more powerful parser can use eval() or simply remove spaces from their strings before converting them to numbers. |
I did some experimentation and found some inconsistency between int and complex: >>> int('\xA11')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'utf8' codec can't decode byte 0xa1 in position 0: invalid start byte
but
>>> complex('\xA11')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: complex() arg is a malformed string The int behavior is probably a bug that should be reported separately. |
I don't think determining *which* whitespace is allowed is a problem; just use whatever's already being used for the whitespace that's already allowed (around the whole complex input, for example, or between the optional parentheses and the number). Please open a separate bug report for the UnicodeDecodeError. Though I have a suspicion/vague recollection that this has already come up somewhere in the tracker... |
It hadn't occurred to me to try this without spaces. Thank you for pointing this out. Agreed that the enhancement is not essential. |
Unassigning myself from this one, though I'll review a patch if anyone wants to write one. After thinking about it a bit, I'm -0 on allowing the extra whitespace. The main issue for me is that it opens up a can of worms about what should and shouldn't be allowed. Which of the following should be allowed: (a) complex("0.1 + 3j") Any patch would presumably allow (a). (b) looks like it *should* be allowed, too, but then by analogy so does (c). But for consistency, (d) and (e) would then have to be allowed, and I *really* don't want to go that far; in particular, there are rules about what's allowed as a floating-point string that are fairly consistently applied throughout Python (e.g., in the float, Decimal and Fraction constructors); these rules also agree with accepted standards (e.g., C99, IEEE 754), which clearly don't allow a space between the optional sign and the body of the float. So unless anyone particularly wants to pursue this, I'd suggest closing as "won't fix". |
If someone does want to produce a patch, here's the grammar that I suggest, in pseudo BNF form. This would be reasonably simple to implement, and is also simple to describe. whitespace = ' ' | '\t' | '\n' | '\v' | '\f' # include other non-ASCII whitespace? where float_string is any string that (a) doesn't contain leading or trailing whitespace, and (b) is accepted by the current float constructor. This would allow (a) and (f) in the previous message, but not (b) or (c). |
I can write a documentation patch for this: http://docs.python.org/library/functions.html?highlight=complex#complex to highlight the expected format of the string argument. As others have pointed out here, there are a number of other options available to correctly parse the complex string argument:
I think that the current options are sufficient that a patch to apply new behaviour isn't required. |
Here is a patch to document string argument requirements. |
Here's a patch (targeting 3.3) for allowing whitespace around the central binary operator; it implements the grammar suggested in msg115163. |
Reclassifying as a doc issue; I don't think my proposed change is worth it. I'll submit some form of Jervis's docfix shortly. |
New changeset 5a3c89337b50 by Mark Dickinson in branch '2.7': New changeset a5b073b1cfea by Mark Dickinson in branch '3.2': New changeset 2f48415e917c by Mark Dickinson in branch 'default': |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: