Issue2486
Created on 2008-03-25 21:36 by marketdickinson, last changed 2008-05-04 14:37 by facundobatista.
| Files | ||||
|---|---|---|---|---|
| File name | Uploaded | Description | Edit | Remove |
| decimal_bytes.patch | marketdickinson, 2008-03-26 00:55 | |||
| deccoeff.c | marketdickinson, 2008-05-03 15:35 | Decimal coefficient module (alpha) | ||
| Messages | |||
|---|---|---|---|
| msg64521 (view) | Author: Mark Dickinson (marketdickinson) | Date: 2008-03-25 21:36 | |
The Python 3.0 version of decimal.py currently stores the coefficient of a Decimal number (or the payload of a NaN) as a string, all of whose characters are in the range '0' through '9'. It may be more time-efficient to store the coefficient as a bytes instance instead, since bytes -> int conversion is likely to be faster than str -> int conversion. On the other hand, int -> bytes conversion has to go through str, so may be significantly slower than int -> str conversion... Bottom line: this needs testing. One other option: it may even be worth considering storing the coefficient directly as a Python integer. I've argued against this before, on the basis that it makes direct access to the decimal digits of a number harder, and this direct access is useful for rounding as well as for some of the more esoteric Decimal operations (logical operations, shifts and rotations). But it may be worth a look. (I think it's clear what the *right* option is, given unlimited developer time and energy: decimal.py should represent the coefficient using a long integer implementation written in C, with wordsize a power of 10---probably either 10**4 or 10**9.) See related discussion at issue 2482 and at http://mail.python.org/pipermail/python-dev/2008-March/078189.html |
|||
| msg64531 (view) | Author: Mark Dickinson (marketdickinson) | Date: 2008-03-26 00:55 | |
Here's a first try at a patch that converts the Decimal coefficient from type str to type bytes. It needs some work: here are some timings for a complete run of the test suite (best of 3 in each case) on my MacBook. Python 3.0, str: 12.880s Python 3.0, bytes: 13.294s Python 2.6: 9.279s Python 2.5: 9.212s There's still some room for speedup with the bytes patch, but at the moment it doesn't look as though performance alone is a compelling reason to make this change. |
|||
| msg64538 (view) | Author: Nick Coghlan (ncoghlan) | Date: 2008-03-26 02:30 | |
Hmm, I forgot there wasn't currently a direct path back from int -> bytes. I think your numbers *do* show that we need to do something about the performance of the Python 3.0 version of decimal. A 25% increase in runtime is a pretty big performance hit. |
|||
| msg66155 (view) | Author: Mark Dickinson (marketdickinson) | Date: 2008-05-03 15:35 | |
I've made some progress here: I have a version of decimal.py for
Python 3.0 that uses a Deccoeff instead of a string for the Decimal
coefficient. It still needs a little work to make things efficient,
but it already passes all the tests in test_decimal. There are no
API changes.
What's a Deccoeff? Read on, and/or see attached file.
Deccoeff (for *Dec*imal *coeff*icient) is an extension type for Python
3.0 that combines some of the most useful aspects of both the integer
and string types.
First, you can do arithmetic directly with Deccoeff instances:
>>> from deccoeff import Deccoeff
>>> x = Deccoeff('7')
>>> y = Deccoeff('11')
>>> x + y
Deccoeff('18')
>>> x * y
Deccoeff('77')
>>> y - x
Deccoeff('4')
>>> x < y
True
Only nonnegative integers can be represented as Deccoeff instances,
since that's all that's needed for Decimal: a subtraction that would
produce a negative result instead raises OverflowError:
>>> x-y
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
OverflowError: difference is negative and cannot be represented as a Deccoeff
Powering also works:
>>> x ** y
Deccoeff('1977326743')
>>> y ** x
Deccoeff('19487171')
So much for arithmetic: a Deccoeff instance can also be treated as a sequence:
it can be sliced (though the 'step' argument is not supported):
>>> x = Deccoeff('123456789')
>>> x[2:]
Deccoeff('1234567')
>>> x[:5]
Deccoeff('56789')
and indexed:
>>> x[3]
Deccoeff('6')
The length gives the total number of digits:
>>> len(x)
9
Note that the indices work in the opposite direction to the one you
might expect by looking at the string representation of a Deccoeff:
indexing is set up so that index 0 refers to the units digit, 1 to the
tens digit, 2 to the hundreds digit, etc.
Negative indices also make sense (but not in the usual Python sense of
counting from the end of the object): you should imagine the Deccoeff
as a finite string of digits padded with zeros infinitely on both the
left and right. Then it makes sense to do, for example:
>>> x[-2:]
Deccoeff('12345678900')
>>> x[-3:4]
Deccoeff('6789000')
This provides a convenient way to do a 'decimal shift left'. There
are many places in Decimal where it's appropriate to use x[n:] without
knowing in advance whether n is positive or negative.
This is a work in progress... Comments very welcome.
|
|||
| msg66161 (view) | Author: Nick Coghlan (ncoghlan) | Date: 2008-05-03 17:56 | |
(I changed the issue title to better reflect where the discussion has moved to) I really like the approach of a custom internal type for the mantissa storage (the module containing would probably best be called _decimal). Then it should be possible to focus on letting C do what it does best (high speed arithmetic) and Python do what it does best (provide the nice API and context management features) |
|||
| msg66211 (view) | Author: Facundo Batista (facundobatista) | Date: 2008-05-04 14:37 | |
I like this path, also. Beyond Deccoeff, this module could hold in the future other functions that are speed critical. Great work, Mark! |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2008-05-04 14:37:50 | facundobatista | set | messages: + msg66211 |
| 2008-05-03 17:56:23 | ncoghlan | set | messages:
+ msg66161 title: Consider using bytes type instead of str to store Decimal coefficients -> Decimal slowdown in 3.0 due to str/unicode changes |
| 2008-05-03 15:35:34 | marketdickinson | set | files:
+ deccoeff.c messages: + msg66155 |
| 2008-03-26 02:30:01 | ncoghlan | set | messages: + msg64538 |
| 2008-03-26 00:55:18 | marketdickinson | set | files:
+ decimal_bytes.patch keywords: + patch messages: + msg64531 |
| 2008-03-25 21:36:14 | marketdickinson | create | |