msg59642 - (view) |
Author: Mark Dickinson (mark.dickinson) * |
Date: 2008-01-10 01:01 |
After seeing issue #1761, I realized that there's a bug in the Decimal
constructor: it accepts newline-terminated strings:
>>> from decimal import *
>>> s = "2.3\n"
>>> Decimal(s)
Decimal("2.3")
I think this is, strictly speaking, a bug because:
(1) The IBM decimal specification explicitly disallows additional whitespace
in a numeric string (see http://www2.hursley.ibm.com/decimal/daconvs.html),
(2) the operation to-number is supposed only to accept numeric strings, and
(3) Decimal.__new__ is currently the method that implements to-number.
Is this worth fixing? This buggy behaviour might well be useful (e.g. to
someone parsing a file with one Decimal per line).
I'll fix it if anyone thinks it's worth it. Even if it should be fixed, I
don't think this is worth backporting to Python 2.5, especially since it
might break things.
|
msg59643 - (view) |
Author: Guido van Rossum (gvanrossum) * |
Date: 2008-01-10 01:04 |
I'd propose doing the same thing as int() and float(), which accept and
strip leading and trailing whitespace.
|
msg59644 - (view) |
Author: Raymond Hettinger (rhettinger) * |
Date: 2008-01-10 01:20 |
I concur with Guido. The to-number operation is part of the spec. The
constructor belongs to us and should match the rest of the language.
|
msg59646 - (view) |
Author: Mark Dickinson (mark.dickinson) * |
Date: 2008-01-10 01:28 |
Okay: in that case, I propose:
(1) adding a to_number method to the Decimal and Context class, so that
we're still in full compliance with the specification, and
(2) altering Decimal.__new__ to allow trailing and leading whitespace.
|
msg59647 - (view) |
Author: Mark Dickinson (mark.dickinson) * |
Date: 2008-01-10 01:50 |
Aargh! It's only the Context class that needs a to_number method---it
makes no sense as a Decimal method. And to_number is almost already
there, in the form of Context.create_decimal.
I'll work out exactly what the minimum is that needs to be done to
comply with the specification, and post a patch
|
msg59654 - (view) |
Author: Mark Dickinson (mark.dickinson) * |
Date: 2008-01-10 03:10 |
In the spirit of making the Decimal constructor match the rest of the language,
can I propose another change: make Decimal("not a decimal") raise a
ValueError.
Currently, Decimal("not a decimal") either raises InvalidOperation or returns a
Decimal NaN, depending on whether the InvalidOperation trap is set in the
current context or not. This behaviour presumably again comes directly from
the specification, but as pointed out already Decimal.__new__ doesn't need to
rigidly follow the specification---we just need to make sure that the
functionality of to-number is present somewhere (and create_decimal seems like
a good candidate for that). It seems to me that:
Decimal.__new__ shouldn't care about the current context.
The fact that __new__ takes a context parameter at the moment is potentially
confusing: one might reasonably expect that passing a context to __new__ would
result in the Decimal being rounded to fit that context (as happens with
create_decimal). In fact, the *only* reason the context argument is there
because it's needed to raise InvalidOperation; if the first argument to
__new__ is valid then the context is entirely unused.
|
msg59655 - (view) |
Author: Raymond Hettinger (rhettinger) * |
Date: 2008-01-10 03:29 |
I think the decimal exceptions should continue to be raised as-is.
There is a well-defined and thought-out structure for the Decimal
exceptions. Also, the constructor's exceptions tend to be raised in an
environment where there are other decimal operations taking place -- it
would be a shame to have to catch both types of exception.
I spent about a month working on this module and put a decent amount of
thought into integrating the API. I would prefer to not revisit all of
those decisions one at a time.
Instead, let's graft on any new magic methods and fix straight-out bugs.
Having lots of little microscopic API changes will harm more than it
will help. Why introduce incompatabilities for no gain.
I'm inclined to dismiss this whole bug report. It doesn't help anything
to muck with the existing choices about whitespace handling. Recommend
closing this as yagni / don't-care.
|
msg59660 - (view) |
Author: Facundo Batista (facundobatista) * |
Date: 2008-01-10 12:09 |
I think that the Spec disallows additional whitespace to not allow, for
example, "2. 34" as "2.34", or "10 e-12".
I don't see any harm in having " 2.34" or "5.73\n" as good input
values, as long we remove those characters at both ends.
I propose to just make a .strip() at the input string, change the
documentation to reflect this, and that's all.
The doc says: "If value is a string, it should conform to the decimal
numeric string syntax:"
We can put: "If value is a string, it should conform to the decimal
numeric string syntax after being stripped on both ends:"
|
msg59687 - (view) |
Author: Mark Dickinson (mark.dickinson) * |
Date: 2008-01-11 01:13 |
Allowing leading and trailing whitespace causes failures in Cowlishaw's
test suite. I guess Raymond's right: this bug report should be
dismissed. It still bothers me a little that there isn't a strictly
conforming implementation of to-number in decimal, but I can live with it.
|
msg59688 - (view) |
Author: Guido van Rossum (gvanrossum) * |
Date: 2008-01-11 01:16 |
> Allowing leading and trailing whitespace causes failures in Cowlishaw's
> test suite. I guess Raymond's right: this bug report should be
> dismissed. It still bothers me a little that there isn't a strictly
> conforming implementation of to-number in decimal, but I can live with it.
Not sure what you mean. Can't you fix the code to reject the trailing
\n? Or are we talking about something else now?
|
msg59689 - (view) |
Author: Mark Dickinson (mark.dickinson) * |
Date: 2008-01-11 01:49 |
I can certainly fix the Decimal constructor to reject trailing newlines, if
that's what people want, but that doesn't fit with the proposal to accept
and strip leading and trailing whitespace.
The other option that maintains full compliance with the specification is
to do what we like with Decimal.__new__ (e.g. allowing leading and trailing
whitespace), but make sure that there's a fully conforming to-number
elsewhere in the Decimal module.
I'm happy to make either of the above fixes, or to do nothing (which leaves
us with a slightly buggy implementation of the specification---but probably
not so that anyone would notice). I'll leave the decision of which of
these three options is most desirable to the more seasoned developers.
|
msg59690 - (view) |
Author: Guido van Rossum (gvanrossum) * |
Date: 2008-01-11 01:55 |
I'd say that accepting a trailing \n is a bug that should be fixed.
I think that ideally we'd be allowed to specify whitespace around the
value. I am always annoyed at programs that require me to type e.g. an
integer and don't let me put spaces before and/or after (which often
happen due to copy/paste).
See also bug #1779 BTW.
|
msg59703 - (view) |
Author: Facundo Batista (facundobatista) * |
Date: 2008-01-11 11:26 |
Mark Dickinson:
> The other option that maintains full compliance with the
> specification is to do what we like with Decimal.__new__
> (e.g. allowing leading and trailing whitespace), but
> make sure that there's a fully conforming to-number
> elsewhere in the Decimal module.
I'm +1 to be more permissive in the __new__ regarding spaces, tabs, and
even newlines.
I'm -0 to add an special module that does not allow this. I don't see
the value of it more than be compliant to the standard in that
particular sentence.
Guido van Rossum:
> I'd say that accepting a trailing \n is a bug that should
> be fixed.
>
> I think that ideally we'd be allowed to specify whitespace
> around the value. I am always annoyed at programs that require
Do you want to be able to do Decimal("3 ") but not Decimal("3\n")?
That confused me, I always read "whitespace" here as all these
characters, as in the string module:
>>> string.whitespace
'\t\n\x0b\x0c\r '
To be more practical about my point, I'm +1 to do a .strip() in __new__
before parsing the number.
|
msg59723 - (view) |
Author: Guido van Rossum (gvanrossum) * |
Date: 2008-01-11 17:30 |
> Do you want to be able to do Decimal("3 ") but not Decimal("3\n")?
I want either both or none, with a slight preference for both but only
if it can be done without breaking the spec.
The status quo is that "3 " is refused but "3\n" is accepted; that seems
wrong.
|
msg59770 - (view) |
Author: Mark Dickinson (mark.dickinson) * |
Date: 2008-01-12 01:15 |
Here's a patch that alters the Decimal constructor to allow leading and
trailing whitespace.
The Context.create_decimal method should now be a fully conforming
implementation of to-number: it doesn't accept any leading or trailing
whitespace.
|
msg59771 - (view) |
Author: Raymond Hettinger (rhettinger) * |
Date: 2008-01-12 01:30 |
Please apply the patch and close the bug.
Thx
|
msg59776 - (view) |
Author: Mark Dickinson (mark.dickinson) * |
Date: 2008-01-12 01:56 |
Committed, revision 59929.
|
|
Date |
User |
Action |
Args |
2022-04-11 14:56:29 | admin | set | github: 46113 |
2008-01-12 03:31:57 | mark.dickinson | link | issue1516613 superseder |
2008-01-12 01:56:40 | mark.dickinson | set | status: open -> closed messages:
+ msg59776 |
2008-01-12 01:30:39 | rhettinger | set | resolution: accepted messages:
+ msg59771 |
2008-01-12 01:15:21 | mark.dickinson | set | keywords:
+ patch files:
+ issue1780.patch messages:
+ msg59770 |
2008-01-11 17:30:31 | gvanrossum | set | messages:
+ msg59723 |
2008-01-11 11:26:02 | facundobatista | set | messages:
+ msg59703 |
2008-01-11 01:55:16 | gvanrossum | set | messages:
+ msg59690 |
2008-01-11 01:49:26 | mark.dickinson | set | messages:
+ msg59689 |
2008-01-11 01:16:28 | gvanrossum | set | messages:
+ msg59688 |
2008-01-11 01:13:32 | mark.dickinson | set | messages:
+ msg59687 |
2008-01-10 12:09:46 | facundobatista | set | messages:
+ msg59660 |
2008-01-10 03:29:38 | rhettinger | set | messages:
+ msg59655 |
2008-01-10 03:10:35 | mark.dickinson | set | messages:
+ msg59654 |
2008-01-10 01:50:10 | mark.dickinson | set | messages:
+ msg59647 |
2008-01-10 01:28:53 | mark.dickinson | set | messages:
+ msg59646 |
2008-01-10 01:20:48 | rhettinger | set | nosy:
+ rhettinger messages:
+ msg59644 |
2008-01-10 01:04:40 | gvanrossum | set | nosy:
+ gvanrossum messages:
+ msg59643 |
2008-01-10 01:01:45 | mark.dickinson | create | |