classification
Title: Make Decimal constructor accept all unicode decimal digits in input.
Type: enhancement
Stage:
Components: Library (Lib)
Versions: Python 3.1, Python 3.2, Python 2.7, Python 2.6
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: mark.dickinson
Nosy List: eric.smith, ezio.melotti, mark.dickinson, rhettinger
Priority: normal
Keywords: patch

Created on 2009-07-29 08:39 by mark.dickinson, last changed 2009-08-02 11:03 by mark.dickinson. This issue is now closed.

Files
File name Uploaded Description
issue6595.patch mark.dickinson, 2009-07-29 09:06
Messages (9)
msg91030 - Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-07-29 08:39
Ezio Melotti asked (on #python-dev) why the Decimal constructor doesn't 
accept decimal digits other than 0-9.  As far as I can tell there's no 
good reason for it not to.  Moreover, the standard on which the decimal 
module is based says[1]:

"""It is recommended that implementations also provide additional number 
formatting routines (including some which are locale-dependent), and if 
available should accept non-European decimal digits in strings."""

All other builtin or standard library numeric types already accept such 
digits:

Python 3.2a0 (py3k:74247, Jul 29 2009, 09:28:12) 
[GCC 4.0.1 (Apple Inc. build 5493)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from fractions import Fraction
>>> from decimal import Decimal
>>> x = '\uff11\uff10\uff15\uff18\uff15'
>>> x
'１０５８５'
>>> int(x)
10585
>>> float(x)
10585.0
>>> complex(x)
(10585+0j)
>>> Fraction(x)
Fraction(10585, 1)
>>> Decimal(x)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/dickinsm/python/svn/py3k/Lib/decimal.py", line 548, in __new__
    "Invalid literal for Decimal: %r" % value)
  File "/Users/dickinsm/python/svn/py3k/Lib/decimal.py", line 3816, in _raise_error
    raise error(explanation)
decimal.InvalidOperation: Invalid literal for Decimal: '１０５８５'

I propose adding support for this in Python 3.2 and (possibly) 2.7.  The 
change would be for input only:  no record of the original form of the 
digits would be kept by the Decimal object itself, so that e.g.,
str(Decimal('１０５８５')) would still be '10585'.

[1] See http://speleotrove.com/decimal/daconvs.html
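
A minimal sketch of the proposed input-only behaviour, assuming an interpreter that already includes this change (the name FULLWIDTH is illustrative only): the constructor accepts the fullwidth digits, and the resulting Decimal is displayed with ASCII digits.

```python
from decimal import Decimal

# Fullwidth digits U+FF11 U+FF10 U+FF15 U+FF18 U+FF15 (illustrative name).
FULLWIDTH = '\uff11\uff10\uff15\uff18\uff15'

d = Decimal(FULLWIDTH)

# The original form of the digits is not retained by the Decimal object:
# str() produces ASCII digits only.
print(str(d))                 # 10585
print(d == Decimal('10585'))  # True
```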
msg91031 - Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-07-29 09:06
Here's a patch.
msg91032 - Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2009-07-29 10:07
+1

The standard recommends it, and the other numeric types support it, so
Decimal should as well.
msg91033 - Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2009-07-29 10:35
Since you're calling int() on the result, can't this code:
self._int = str(int((intpart+fracpart).lstrip('0') or '0'))
just be:
self._int = str(int(intpart+fracpart))
?

And here, you already know diag is not None, so do you need the "or '0'"
part?
self._int = str(int(diag or '0')).lstrip('0')

And, in both calls to .lstrip('0'), what happens if you have a
non-European leading '0', like '\uff10'?

Otherwise, the patch looks good to me.
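
To illustrate the leading-zero question, here is a small sketch (the variable names are illustrative, not taken from the patch): str.lstrip('0') removes only the ASCII character '0', whereas int() accepts any Unicode decimal digit and discards leading zeros itself.

```python
# U+FF10 is FULLWIDTH DIGIT ZERO, U+FF11 is FULLWIDTH DIGIT ONE.
s = '\uff10\uff10\uff11'

# lstrip('0') only strips the ASCII '0', so the string is unchanged.
stripped = s.lstrip('0')
print(stripped == s)   # True

# int() understands the fullwidth digits and drops the leading zeros.
print(str(int(s)))     # 1
```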
msg91051 - Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2009-07-29 17:22
+1

Also, I would like to see this backported.  We've long promised that any
variance with the spec will be treated as a bugfix.  The change won't
break any existing code.
msg91092 - Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-07-30 10:10
Thanks for the feedback;  I've added 2.6, 2.7, 3.1 to the
versions and will backport.

[Eric]
> Since you're calling int() on the result, can't this code:
> self._int = str(int((intpart+fracpart).lstrip('0') or '0'))
> just be:
> self._int = str(int(intpart+fracpart))
> ?

Yes!  Thank you.  I'm not sure what (if anything) I was thinking here.

The str(int(...)) hack is quite an ugly way to normalize a string;  it 
also has the drawback of taking time quadratic in the length of the 
input.  In its defence: (a) I can't think of anything better; (b) this 
change seems to have no noticeable effect on the time to run the test
suite; and (c) since all the arithmetic operations use string <-> int 
conversions anyway the quadratic time behaviour doesn't really make 
things any worse than they already are.
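
For comparison, a rough sketch of the str(int(...)) normalization next to one possible per-character alternative. The alternative is an assumption for illustration and is not what the patch does; both function names are hypothetical. Note that the per-character version keeps leading zeros, so the two are not drop-in replacements for each other.

```python
import unicodedata

def normalize_via_int(digits):
    """Normalize Unicode decimal digits to ASCII using str(int(...))."""
    # int() also drops leading zeros and raises ValueError on ''.
    return str(int(digits))

def normalize_per_char(digits):
    """Hypothetical alternative: translate each digit individually."""
    # unicodedata.decimal() raises ValueError for non-decimal characters.
    return ''.join(str(unicodedata.decimal(ch)) for ch in digits)

# Fullwidth 0 and 1 followed by Arabic-Indic 2 and 3.
s = '\uff10\uff11\u0662\u0663'
print(normalize_via_int(s))   # 123
print(normalize_per_char(s))  # 0123
```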

> And here, you already know diag is not None, so do you need the "or '0'"
> part?
> self._int = str(int(diag or '0')).lstrip('0')

I think *something* like this is needed, since diag can legitimately be 
the empty string.  It might be clearer as a proper if block, though:  
'if diag: ...  else: ...'.
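
A sketch of what the explicit if block might look like, assuming diag holds the (possibly empty) string of diagnostic digits that follows a NaN; normalize_diag is a hypothetical helper name, not something in the patch.

```python
def normalize_diag(diag):
    """Hypothetical helper: normalize a NaN diagnostic digit string."""
    if diag:
        # Non-empty: int() converts any Unicode decimal digits to ASCII.
        # An all-zero payload becomes '0', which lstrip reduces to ''.
        return str(int(diag)).lstrip('0')
    else:
        # Empty: int('') would raise ValueError, so return '' directly.
        return ''

print(repr(normalize_diag('')))                    # ''
print(repr(normalize_diag('\uff10\uff11\uff12')))  # '12'
```

For an empty diag this gives the same result as the one-liner str(int(diag or '0')).lstrip('0'), which is also ''.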
msg91133 - Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2009-07-31 16:30
I like the str(int(...))  approach because it guarantees handling that
is consistent with other types.
msg91180 - Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-08-02 10:16
Committed to py3k in r74279, release31-maint in r74280.  Leaving open for 
backport to 2.x.
msg91181 - Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-08-02 11:03
Backported to trunk and release26-maint in r74281 and r74282.
History
Date User Action Args
2009-08-02 11:03:41 mark.dickinson set status: open -> closed; resolution: fixed; messages: + msg91181
2009-08-02 10:16:42 mark.dickinson set messages: + msg91180
2009-07-31 16:30:15 rhettinger set messages: + msg91133
2009-07-30 10:10:34 mark.dickinson set messages: + msg91092; versions: + Python 2.6, Python 3.1, Python 2.7
2009-07-29 17:22:10 rhettinger set nosy: + rhettinger; messages: + msg91051
2009-07-29 10:35:40 eric.smith set messages: + msg91033
2009-07-29 10:07:44 eric.smith set nosy: + eric.smith; messages: + msg91032
2009-07-29 09:06:42 mark.dickinson set files: + issue6595.patch; keywords: + patch; messages: + msg91031
2009-07-29 08:39:53 mark.dickinson create