Message 79266 - Python tracker

Author	mark.dickinson
Recipients	cdavid, christian.heimes, mark.dickinson, rhettinger
Date	2009-01-06.15:44:28
Message-id	<1231256671.19.0.950298096215.issue2121@psf.upfronthosting.co.za>

Content

As Christian says, it's about not increasing code complexity without a 
good reason.  For me, it's also about mental complexity:  the set of valid 
inputs to the complex constructor should be small, and should be easy to 
understand and to describe.

The addition of the '*1j' part of the expression bothers me particularly,
as does allowing complex('j') to be valid.


I have a compromise proposal:

1. For nans and infs, get rid of the * on output:  change repr so that 
e.g., repr(complex(0, nan)) is simply 'nanj', and repr(complex(inf,-inf)) 
is '(inf-infj)'.  (Yes, I know it looks ugly, but bear with me.)

2. On input, allow strings of the form "complex-string" or "(complex-
string)" where (in pseudo BNF):

sign ::= '+' | '-'
real-part ::= float-string
imag-part ::= float-string 'j'
complex-string ::= real-part [sign imag-part] | imag-part

and float-string is any string currently accepted by float
(excluding leading or trailing whitespace).  Nothing else would be 
permitted.  (Well, okay, we still have to allow whitespace around the 
complex string, both inside and outside the parentheses.)
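To make the grammar concrete, here is a rough sketch of a parser for it in 
Python.  This is only an illustration, not the code in complexobject.c: the 
names parse_complex and _float_string are made up for this example, and it 
simply leans on float() for every float-string.

```python
def _float_string(text):
    """Return float(text), or None if text is not a bare float-string."""
    # float() itself tolerates surrounding whitespace; the grammar does not.
    if not text or text != text.strip():
        return None
    try:
        return float(text)
    except ValueError:
        return None


def parse_complex(text):
    """Hypothetical parser for the proposed grammar (not CPython's code)."""
    s = text.strip()                      # whitespace outside the parentheses
    if s.startswith("(") and s.endswith(")"):
        s = s[1:-1].strip()               # whitespace inside the parentheses
    if s.endswith("j"):
        body = s[:-1]
        # complex-string ::= imag-part
        imag = _float_string(body)
        if imag is not None:
            return complex(0.0, imag)
        # complex-string ::= real-part sign imag-part
        # Try each sign as the separator; exponent signs such as the one
        # in '1e+5' are skipped because '1e' is not a float-string.
        for i in range(1, len(body)):
            if body[i] in "+-":
                real = _float_string(body[:i])
                imag = _float_string(body[i + 1:])
                if real is not None and imag is not None:
                    return complex(real, -imag if body[i] == "-" else imag)
    else:
        # complex-string ::= real-part
        real = _float_string(s)
        if real is not None:
            return complex(real, 0.0)
    raise ValueError("malformed complex string: %r" % (text,))


print(parse_complex("nanj"))
print(parse_complex(" ( inf-infj ) "))
print(parse_complex("2+-1j"))
```

With this sketch, 'j', '1+j' and 'nan*j' are all rejected, since an empty 
string and 'nan*' are not float-strings.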

I think this would simplify the parsing, and remove the need to special-case 
nans and infs.  It might even allow the code to become simpler, by sharing 
some of it with the code in floatobject.c.

The above would allow double signs:  e.g. '2+-1j', with the first sign 
being the one explicitly described above and the second being part of the 
float-string.  I don't think this is a terrible thing, if it simplifies 
parsing.  It might even be helpful.  For example: "%r+%rj" % (z.real, 
z.imag) would always be valid input.
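As a small illustration of that last point, the snippet below formats a few 
awkward values with "%r+%rj" and reads them back.  The reader read_back is a 
hypothetical helper written only for this fixed template (it is not the 
complex constructor, and it is not a full parser for the grammar above); it 
splits at the first '+' for which both halves are accepted by float.

```python
import math


def format_proposed(z):
    """Format z using the template from the message."""
    return "%r+%rj" % (z.real, z.imag)


def read_back(text):
    """Hypothetical reader for strings produced by format_proposed."""
    body = text[:-1]                      # drop the trailing 'j'
    for i, ch in enumerate(body):
        if ch == "+" and i > 0:
            try:
                return complex(float(body[:i]), float(body[i + 1:]))
            except ValueError:
                continue                  # e.g. the '+' inside '1e+16'
    raise ValueError(text)


def same(a, b):
    """Float equality that also treats nan as equal to nan."""
    return (math.isnan(a) and math.isnan(b)) or a == b


samples = [
    complex(2, -1),
    complex(0, float("nan")),
    complex(float("inf"), float("-inf")),
    complex(1e16, 1e-5),
]
for z in samples:
    text = format_proposed(z)
    w = read_back(text)
    print(text, same(z.real, w.real) and same(z.imag, w.imag))
```

The negative imaginary parts come out with a double sign, as in 
'2.0+-1.0j', which is exactly the case the grammar would have to accept.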

The current 'nan*j' is certainly more attractive than 'nanj', but it's 
just too suggestive of the actual expression float('nan')*1j, which it 
turns out doesn't produce the same thing.
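A quick check of that difference, as a sketch using only built-in 
arithmetic (the variable names are just for this example):

```python
import math

nan = float("nan")

direct = complex(0.0, nan)   # the value the repr is meant to describe
product = nan * 1j           # what the 'nan*j' spelling suggests evaluating

# The real part is an exact zero in one case and a nan in the other.
print(direct.real, math.isnan(product.real))
```

So evaluating the suggested expression turns the real part into a nan, 
while complex(0, nan) keeps it at zero.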

If you could come up with a patch that does something like this I'd take a 
look.


By the way, "0 + inf*1j" giving nan + inf*j isn't really a bug;  it's 
pretty much unavoidable:  1j is really complex(0, 1), and when multiplying 
by inf we get complex(inf*0, inf*1).  With the usual IEEE 754 semantics 
inf*0 is nan.
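The same arithmetic spelled out in Python, as a minimal sketch (real and 
imag are just names for the two components described above):

```python
import math

inf = float("inf")

# The two components of inf * complex(0, 1), computed separately.
real, imag = inf * 0.0, inf * 1.0
print(math.isnan(real), imag)

# The full expression from the report.
z = 0 + inf * 1j
print(math.isnan(z.real), z.imag)
```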

I think it's exactly to fix this sort of behaviour that C99 introduced its 
imaginary type (see Annex G of the standard).  It doesn't seem to have 
caught on, though:  I don't think gcc implements this, and I'd be very 
surprised if Visual Studio does.

History
Date	User	Action	Args
2009-01-06 15:44:31	mark.dickinson	set	recipients: + mark.dickinson, rhettinger, christian.heimes, cdavid
2009-01-06 15:44:31	mark.dickinson	set	messageid: <1231256671.19.0.950298096215.issue2121@psf.upfronthosting.co.za>
2009-01-06 15:44:30	mark.dickinson	link	issue2121 messages
2009-01-06 15:44:29	mark.dickinson	create