classification
Title: allow whitespace around central '+' in complex constructor
Type: enhancement
Stage: resolved
Components: Documentation
Versions: Python 3.3
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: mark.dickinson
Nosy List: belopolsky, jdwhitley, mark.dickinson, python-dev
Priority: low
Keywords: patch

Created on 2010-08-12 07:12 by jdwhitley, last changed 2012-03-10 16:15 by python-dev. This issue is now closed.

Files
File name | Uploaded | Description
complex_doc.diff | jdwhitley, 2010-08-31 09:27 | Patch of library/function.rst docs for complex function
issue9574.patch | mark.dickinson, 2010-11-21 10:10 | Allow whitespace around central '+' or '-' in complex constructor.
Messages (15)
msg113661 - Author: Jervis Whitley (jdwhitley) Date: 2010-08-12 07:12
complex() raises ValueError when parsing a string argument that contains both a real and an imaginary part, where either part is a decimal.

To reproduce:

>>> complex("1.1 + 2.1j")
ValueError: complex() arg is a malformed string

>>> complex("2.1j")
2.1j

>>> complex("1.1 + 2j")
ValueError: complex() arg is a malformed string

>>> complex("1 + 2.1j")
ValueError: complex() arg is a malformed string 

Expected results:

>>> complex("1.1 + 2.1j")
(1.1 + 2.1j)

>>> complex("2.1j")
2.1j

>>> complex("1.1 + 2j")
(1.1 + 2j)

>>> complex("1 + 2.1j")
(1 + 2.1j)

This affects all versions up to Python 3.1.2. I haven't tested any of the development builds.

Tests were conducted on a Windows XP 32 bit machine.
msg113664 - Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-08-12 07:52
The problem here is the spaces in the input string:

newton:~ dickinsm$ python2.7
Python 2.7 (r27:82500, Jul 13 2010, 14:10:05) 
[GCC 4.2.1 (Apple Inc. build 5659)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> complex("1.1+2.1j")
(1.1+2.1j)

The current behaviour is by design, so I'm changing to feature request.  It may make sense to consider allowing whitespace around the central '+' or '-', though this would mildly complicate the parsing.

I'd be +0 on this change.
msg113665 - Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-08-12 07:54
Note also that spaces are already allowed immediately inside the parentheses in a string argument to the complex constructor (in Python 2.6 and later):


>>> complex("( 1.1+2.1j )")
(1.1000000000000001+2.1000000000000001j)
msg113676 - Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-08-12 12:47
I'm wondering whether the moratorium (PEP 3003) applies to this; from a close reading I'd say it does.  At any rate, it seems like an inessential enhancement, so I'd be happy to delay this until 3.3.
msg113689 - Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2010-08-12 17:18
Current behavior is also consistent with that of fractions:

>>> Fraction("1/2")
Fraction(1, 2)
>>> Fraction("1 / 2")
Traceback (most recent call last):
  ..
ValueError: Invalid literal for Fraction: '1 / 2'

I am -1 on this RFE.  At most, this can be clarified in the docs.

Allowing whitespace involves too much uncertainty.  Would you allow any whitespace or just chr(0x20)?  An end-of-line or tab in a complex number is most likely a typo and should be flagged.  What about more exotic Unicode whitespace such as chr(0x00A0) or chr(0x2009)?  Would you allow one whitespace character or any number of them?

Users who need a more powerful parser can use eval() or simply remove spaces from their strings before converting them to numbers.
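
A minimal sketch of the second workaround (removing whitespace before conversion). The helper name parse_complex_lenient is hypothetical, and the approach is deliberately permissive: it drops whitespace anywhere in the string, not only around the central operator, so it would also accept input such as "1 .1+2j".

```python
def parse_complex_lenient(text):
    """Hypothetical helper: strip all whitespace, then defer to complex()."""
    # str.split() with no arguments splits on runs of whitespace,
    # so joining the pieces removes every whitespace character.
    compact = "".join(text.split())
    return complex(compact)


print(parse_complex_lenient("1.1 + 2.1j"))  # (1.1+2.1j)
```

Anything that is still malformed after the whitespace is removed raises ValueError from complex() as usual.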
msg113692 - Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2010-08-12 17:36
I did some experimentation and found some inconsistency between int and complex:

>>> int('\xA11')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'utf8' codec can't decode byte 0xa1 in position 0: invalid start byte

but
>>> complex('\xA11')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: complex() arg is a malformed string

The int behavior is probably a bug that should be reported separately.
msg113693 - Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-08-12 17:50
I don't think determining *which* whitespace is allowed is a problem; just use whatever's already being used for the whitespace that's already allowed (around the whole complex input, for example, or between the optional parentheses and the number).

Please open a separate bug report for the UnicodeDecodeError.  Though I have a suspicion/vague recollection that this has already come up somewhere in the tracker...
msg113721 - Author: Jervis Whitley (jdwhitley) Date: 2010-08-12 23:08
It hadn't occurred to me to try this without spaces. Thank you for pointing this out. Agreed that the enhancement is not essential.
msg115161 - Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-08-28 17:38
Unassigning myself from this one, though I'll review a patch if anyone wants to write one.

After thinking about it a bit, I'm -0 on allowing the extra whitespace.  The main issue for me is that it opens up a can of worms about what should and shouldn't be allowed.  Which of the following should be allowed:

(a) complex("0.1 + 3j")
(b) complex("+ 3j")
(c) complex("+ 3")
(d) float("- 3")
(e) int("+ 3")
(f) complex("+4.0 + -5.0j")

Any patch would presumably allow (a).  (b) looks like it *should* be allowed, too, but then by analogy so does (c).  But for consistency, (d) and (e) would then have to be allowed, and I *really* don't want to go that far;  in particular, there are rules about what's allowed as a floating-point string that are fairly consistently applied throughout Python (e.g., in the float, Decimal and Fraction constructors);  these rules also agree with accepted standards (e.g., C99, IEEE 754), which clearly don't allow a space between the optional sign and the body of the float.

So unless anyone particularly wants to pursue this, I'd suggest closing as "won't fix".
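
For reference, a small sketch that checks how the existing constructors treat the six inputs listed above. The helper name accepts and the cases list are illustrative only; with the behaviour described in this issue (no whitespace allowed after a sign or around the central operator), every case is expected to be rejected.

```python
def accepts(constructor, text):
    """Return True if constructor(text) succeeds, False on ValueError."""
    try:
        constructor(text)
    except ValueError:
        return False
    return True


# The inputs (a)-(f) from the list above, paired with their constructors.
cases = [
    (complex, "0.1 + 3j"),
    (complex, "+ 3j"),
    (complex, "+ 3"),
    (float, "- 3"),
    (int, "+ 3"),
    (complex, "+4.0 + -5.0j"),
]

results = {(ctor.__name__, text): accepts(ctor, text) for ctor, text in cases}
for (name, text), ok in results.items():
    print("{}({!r}): {}".format(name, text, "accepted" if ok else "rejected"))
```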
msg115163 - Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-08-28 17:54
If someone does want to produce a patch, here's the grammar that I suggest, in pseudo BNF form.  This would be reasonably simple to implement, and is also simple to describe.

whitespace = ' ' | '\t' | '\n' | '\v' | '\f'    # include other non-ASCII whitespace?
binop = [whitespace] ('+' | '-') [whitespace]
imag_marker = 'j' | 'J'
complex_string = float_string binop float_string imag_marker
               | float_string imag_marker
               | float_string
padded_complex_string = [whitespace] complex_string [whitespace]
complex_constructor_input = padded_complex_string
                          | [whitespace] '(' padded_complex_string ')' [whitespace]

where float_string is any string that (a) doesn't contain leading or trailing whitespace, and (b) is accepted by the current float constructor.

This would allow (a) and (f) in the previous message, but not (b) or (c).
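
The grammar above could be sketched in Python roughly as follows. This is an illustration under stated assumptions, not the attached patch: the function name parse_complex_grammar is hypothetical, [whitespace] is read as a run of zero or more of the five listed characters, and float_string is approximated by a regular expression (decimal digits, optional exponent, inf/infinity/nan) rather than by everything the float constructor accepts.

```python
import re

WS_CHARS = " \t\n\v\f"                      # the five characters listed above
WS = r"[ \t\n\v\f]*"                        # assumption: zero or more of them
# Approximation of float_string: optional sign, then digits or inf/nan.
FLOAT = r"[+-]?(?:(?:\d+\.?\d*|\.\d+)(?:e[+-]?\d+)?|inf(?:inity)?|nan)"

COMPLEX_RE = re.compile(
    r"(?P<real>{f}){ws}(?P<op>[+-]){ws}(?P<imag>{f})j"   # float binop float j
    r"|(?P<imag_only>{f})j"                              # float j
    r"|(?P<real_only>{f})".format(f=FLOAT, ws=WS),       # float
    re.IGNORECASE | re.ASCII,
)


def parse_complex_grammar(text):
    """Hypothetical parser following the pseudo-BNF grammar above."""
    body = text.strip(WS_CHARS)
    # Optional parentheses, with padding allowed just inside them.
    if body.startswith("(") and body.endswith(")"):
        body = body[1:-1].strip(WS_CHARS)
    match = COMPLEX_RE.fullmatch(body)
    if match is None:
        raise ValueError("malformed complex string: {!r}".format(text))
    if match.group("real_only") is not None:
        return complex(float(match.group("real_only")), 0.0)
    if match.group("imag_only") is not None:
        return complex(0.0, float(match.group("imag_only")))
    imag = float(match.group("imag"))
    if match.group("op") == "-":
        imag = -imag
    return complex(float(match.group("real")), imag)


print(parse_complex_grammar("0.1 + 3j"))       # (0.1+3j)
print(parse_complex_grammar("+4.0 + -5.0j"))   # (4-5j)
```

Because the sign belongs to float_string and no whitespace is permitted inside it, "+ 3j" and "+ 3" fail to match and raise ValueError.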
msg115174 - Author: Jervis Whitley (jdwhitley) Date: 2010-08-29 01:41
I can write a documentation patch for this:

http://docs.python.org/library/functions.html?highlight=complex#complex

to highlight the expected format of the string argument.

As others have pointed out here, there are a number of other options available to correctly parse the complex string argument:

 * using eval where appropriate; and
 * preprocessing to remove whitespace.

I think that the current options are sufficient that a patch to apply new behaviour isn't required.
msg115255 - Author: Jervis Whitley (jdwhitley) Date: 2010-08-31 09:27
Here is a patch to document string argument requirements.
msg121895 - Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-11-21 10:10
Here's a patch (targeting 3.3) for allowing whitespace around the central binary operator; it implements the grammar suggested in msg115163.
msg155319 - Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2012-03-10 16:08
Reclassifying as a doc issue;  I don't think my proposed change is worth it.  I'll submit some form of Jervis's docfix shortly.
msg155320 - Author: Roundup Robot (python-dev) (Python triager) Date: 2012-03-10 16:15
New changeset 5a3c89337b50 by Mark Dickinson in branch '2.7':
Closes #9574: Note that complex constructor doesn't allow whitespace around central operator.
http://hg.python.org/cpython/rev/5a3c89337b50

New changeset a5b073b1cfea by Mark Dickinson in branch '3.2':
Closes #9574: Note that complex constructor doesn't allow whitespace around central operator.
http://hg.python.org/cpython/rev/a5b073b1cfea

New changeset 2f48415e917c by Mark Dickinson in branch 'default':
merge 3.2 (#9574)
http://hg.python.org/cpython/rev/2f48415e917c
History
Date User Action Args
2012-03-10 16:15:39 python-dev set status: open -> closed; nosy: + python-dev; messages: + msg155320; resolution: fixed; stage: patch review -> resolved
2012-03-10 16:08:24 mark.dickinson set priority: normal -> low; messages: + msg155319; components: + Documentation, - Interpreter Core
2011-10-20 13:44:12 mark.dickinson set stage: needs patch -> patch review
2010-11-21 10:10:23 mark.dickinson set files: + issue9574.patch; assignee: mark.dickinson; messages: + msg121895
2010-08-31 09:27:33 jdwhitley set files: + complex_doc.diff; keywords: + patch; messages: + msg115255
2010-08-29 01:41:21 jdwhitley set messages: + msg115174
2010-08-28 20:37:58 mark.dickinson set title: complex does not parse strings containing decimals -> allow whitespace around central '+' in complex constructor
2010-08-28 17:54:49 mark.dickinson set messages: + msg115163
2010-08-28 17:38:09 mark.dickinson set assignee: mark.dickinson -> (no value); messages: + msg115161
2010-08-12 23:08:20 jdwhitley set messages: + msg113721
2010-08-12 17:50:16 mark.dickinson set messages: + msg113693
2010-08-12 17:36:14 belopolsky set messages: + msg113692
2010-08-12 17:18:49 belopolsky set nosy: + belopolsky; messages: + msg113689
2010-08-12 12:47:12 mark.dickinson set messages: + msg113676; versions: + Python 3.3, - Python 3.2
2010-08-12 08:00:32 mark.dickinson set stage: needs patch; versions: + Python 3.2, - Python 2.6, Python 2.5, Python 3.1, Python 2.7
2010-08-12 07:54:53 mark.dickinson set messages: + msg113665
2010-08-12 07:52:24 mark.dickinson set nosy: + mark.dickinson; messages: + msg113664; assignee: mark.dickinson; type: behavior -> enhancement
2010-08-12 07:12:41 jdwhitley create