Issue7652
This issue tracker has been migrated to GitHub,
and is currently read-only.
For more information,
see the GitHub FAQs in the Python's Developer Guide.
Created on 2010-01-07 10:42 by mark.dickinson, last changed 2022-04-11 14:56 by admin. This issue is now closed.
Files | ||||
---|---|---|---|---|
File name | Uploaded | Description | Edit | |
49433f35a5f8.diff | skrah, 2011-06-18 13:34 | review | ||
bba956250186.diff | skrah, 2011-11-30 21:25 | review | ||
be8a59fcba49.diff | skrah, 2011-12-14 21:11 | fixed all comments except the ones in ISSUES.txt | review | |
ppro-mulmod.txt | skrah, 2011-12-15 14:00 | Proof for x87 FPU modular multiplication | ||
40917e4b51aa.diff | skrah, 2012-02-23 17:30 | review | ||
9b3b1f5c4072.diff | skrah, 2012-03-10 17:58 | review | ||
api-demo.c | skrah, 2012-03-27 18:57 |
Repositories containing patches | |||
---|---|---|---|
http://hg.python.org/features/cdecimal#py3k-cdecimal |
Messages (83) | |||
---|---|---|---|
msg97347 - (view) | Author: Mark Dickinson (mark.dickinson) * | Date: 2010-01-07 10:42 | |
I've created this issue to keep track of progress in merging Stefan Krah's work on decimal in c into py3k. We've created a branch py3k-cdecimal (with merge tracking from py3k) for this work. When the branch is fully working and tested we'll consult python-dev about next steps. |
|||
msg97348 - (view) | Author: Eric V. Smith (eric.smith) * | Date: 2010-01-07 11:40 | |
Is the intention to write Decimal.__format__ in C, too? That would be quite a bit of work, and I'm not sure I could recommend it. But I'm not sure if your plan is to get rid of all Python code or not. If your plan is to rewrite absolutely everything in C, I could help out by exposing the methods that parse format specifiers and do some of the low level formatting. They're used internally by the int, float, and str formatting code. Let me know. |
|||
msg97355 - (view) | Author: Mark Dickinson (mark.dickinson) * | Date: 2010-01-07 13:41 | |
Just to clarify, no decision has yet been made on *whether* the cdecimal work should be integrated into py3k; we'll consult python-dev on this once we've got a working branch and performance information. |
|||
msg97372 - (view) | Author: Mark Dickinson (mark.dickinson) * | Date: 2010-01-07 20:54 | |
To answer Eric's question: Decimal.__format__ is already implemented in Stefan's work---it looks like most of the code is in http://svn.python.org/projects/python/branches/py3k-cdecimal/Modules/cdecimal/io.c (Stefan, is this right?) |
|||
msg97373 - (view) | Author: Mark Dickinson (mark.dickinson) * | Date: 2010-01-07 20:58 | |
So the new branch looks great---thanks, Stefan! I'm only just beginning to look at the code properly, though. A couple of things: (1) Could we unify test_decimal and test_cdecimal somehow? This would avoid them getting out of sync when new tests are added, and would make it clear what the differences between them are. It looks like there's currently a lot of duplicate code. (2) At some point we'll need some documentation. Even if all it says is: the cdecimal module operates identically to the decimal module, with the following exceptions... (notes on threading differences, exponent limits, correct rounding of pow, etc.) |
|||
msg97377 - (view) | Author: Brian Curtin (brian.curtin) * | Date: 2010-01-07 21:35 | |
>mark> (1) Could we unify test_decimal and test_cdecimal somehow? >mark> This would avoid them getting out of sync when new tests are >mark> added, and would make it clear what the differences between >mark> them are. It looks like there's currently a lot of duplicate code. An approach similar to the one taken in test_warnings.py might work: write common test code as a base class, then subclass it to be run against both the C and Py versions of the module. |
|||
msg97382 - (view) | Author: Stefan Krah (skrah) * | Date: 2010-01-07 22:06 | |
Yes, formatting is completely implemented in io.c, together with quite a comprehensive test suite. I like the new Python format strings, so I wanted them in the C library, too. |
|||
msg97411 - (view) | Author: Stefan Krah (skrah) * | Date: 2010-01-08 11:46 | |
Unify test_decimal and test_cdecimal: Yes, quite possible. The diff is currently 400 lines, but it should be easy to get that down to below 100 without any loss of functionality. I'll look into that when I'm done with the 64 bit ANSI path. Documentation: Anything is welcome, even a patch that just creates a stub so I don't have to figure out where to put it. The differences are listed at the bottom of: http://www.bytereef.org/cdecimal.html |
|||
msg97412 - (view) | Author: Stefan Krah (skrah) * | Date: 2010-01-08 11:49 | |
Just an update. Rev.77358 should compile and run stable on the buildbot platforms except Alpha and ia64. I'm working on a default ANSI path for 64-bit. |
|||
msg97719 - (view) | Author: Stefan Krah (skrah) * | Date: 2010-01-13 14:17 | |
As a first step in unifying test_decimal.py and test_cdecimal.py I would like to patch test_decimal.py in trunk and py3k. This is to minimize differences between py3k and py3k-cdecimal. (1) Remove test that Decimal(x) generates a copy. (2) Add test case to formatting test. (3) Extend threading test. (4) Use Emax of 425000000 instead of 999999999 where possible. (The 32-bit version of cdecimal has the official limit of 425000000, even though 999999999 works in almost all cases.) If I get an OK for the two patches, I can commit them to py3k and trunk. If you don't want to apply (1), I'll make new patches. |
|||
msg98161 - (view) | Author: Stefan Krah (skrah) * | Date: 2010-01-22 19:57 | |
All outstanding issues mentioned here have been solved in Rev. 77696: (1) New ANSI path for unknown 64bit platforms (ia64 and Alpha build without problems now). (2) Unified tests for decimal and cdecimal. (3) Documentation for cdecimal. Other improvements: (4) Added comprehensive test suite for testing the library directly. (5) Fixed warnings in Visual Studio. (6) Code formatting. |
|||
msg120305 - (view) | Author: Case Van Horsen (casevh) | Date: 2010-11-03 03:28 | |
Has the cdecimal branch kept up with the hash value changes in 3.2? Is there a still a chance that cdecimal could be merged into 3.2? |
|||
msg120674 - (view) | Author: Mark Dickinson (mark.dickinson) * | Date: 2010-11-07 12:30 | |
On Wed, Nov 3, 2010 at 3:28 AM, Case Van Horsen <report@bugs.python.org> wrote: > Has the cdecimal branch kept up with the hash value changes in 3.2? Not sure; that's a question for Stefan. > Is there a still a chance that cdecimal could be merged into 3.2? A chance, yes; the major need is for someone with time to do a full review. (I'm afraid that's not me, right now.) |
|||
msg121452 - (view) | Author: Stefan Krah (skrah) * | Date: 2010-11-18 12:08 | |
An update on the progress: All development currently happens in my private mpdecimal repository. The next version of mpdecimal (2.0) is finished, stable and will be released once all tests have completed successfully. Running the whole test suite can take several weeks, since Valgrind is involved and all Python versions from 2.5 to 3.2 are tested. In py3k-cdecimal, r86497 is an exact copy of mpdecimal-2.0. All buildbots pass the short tests. Major improvements: o Full compatibility with decimal.py 3.2, including the new hash functions and float operations. o With the new FloatOperation signal, accidental float operations can be detected. o The underlying library - libmpdec - now has 100% code coverage together with Makefile targets for creating coverage reports. In particular, every possible allocation failure during an operation can be tested in brute force style. o The module has 85% code coverage. All lines except failures of Python C-API functions are tested. o Several minor bug fixes, most of them deal with allocation failures under extreme bignum conditions. Potential reviewers: I'll be happy to answer questions here or privately. IMO the best way to get acquainted with the module is to do the regular build and tests first, then explore Lib/test/mpdecimal, reading LIBTEST.txt and PYTEST.txt. |
|||
msg138029 - (view) | Author: Stefan Krah (skrah) * | Date: 2011-06-09 22:34 | |
Just a couple of remarks about the diff I created: The changes to decimal.py are exploratory (i.e. done quite hastily) and serve the purpose to fulfill PEP-399. library/cdecimal.rst is completely out of date. The rest should be very stable. |
|||
msg138583 - (view) | Author: Stefan Krah (skrah) * | Date: 2011-06-18 14:20 | |
The latest patch is based on a relatively stable revision of 3.3. To my knowledge, _decimal.c and decimal.py are now fully compatible in the sense of PEP-399. libmpdec ======== o New test suite with comprehensive tests against decNumber. o Full support for 32-bit compilers (tested with CompCert). PEP-399 ======= o cdecimal.c is now called _decimal.c. Instead of importing cdecimal, _decimal is transparently imported as decimal (if the C version is available). o Unified unit tests with 100% code coverage for both decimal.py and _decimal.c. For _decimal.c, the tests include all failures of Python API functions (requires special patch for testing). o deccheck.py now also tests arbitrary input and makes sure that both modules raise the same exceptions. o Both modules produce the same pickle output for Decimal and Context. _decimal.c ========== o Speed up int/Decimal conversion for integers that fit into a single word of a PyLongObject (performance gain is around 15%). o real(), imag(), conjugate(), __complex__() support. o Fraction and complex comparison support. o Decimal constructor now accepts lists as well as tuples. o DecimalTuple support. o General cleanup and refactoring. The functions for conversions between Decimal and other numeric types are much cleaner now and could be used for a PyDec_* API. |
|||
msg148650 - (view) | Author: STINNER Victor (vstinner) * | Date: 2011-11-30 12:40 | |
> Just to clarify, no decision has yet been made on *whether* > the cdecimal work should be integrated into py3k; > we'll consult python-dev on this once we've got a working branch > and performance information. So, what is the status today? _decimal looks to be huge. Does Python really need yet another multiprecision library? There is already gmpy and bigfloat, based on the heavily optimized GMP library, for example. Is it a license issue? Can't we reuse GMP/MPFR to offer a Decimal API? _decimal should maybe first be distributed as a third party library until it is really well tested and its API is really stable, until we can decide to integrate it. The patch adds __setattr__ to the Decimal class. |
|||
msg148652 - (view) | Author: Mark Dickinson (mark.dickinson) * | Date: 2011-11-30 13:03 | |
> Does Python really need yet another multiprecision library? It's not really another library: it's a reimplementation of the existing decimal library in C. The decimal library is *hugely* valuable to the financial world, but its slowness is a major concern. _decimal would help to address that concern. > Can't we reuse GMP/MPFR to offer a Decimal API? Nope: those are for binary floating-point. Shoehorning decimal semantics on top of a binary floating-point library is a really bad idea. (Actually, that's a part of why decimal.py is slow---it's using Python's *binary* integers to store *decimal* coefficients, so that even simple addition is now a quadratic operation, thanks to the binary <-> decimal conversions involved.) > _decimal should maybe first be distributed as a third party library until it is really well tested and its API is > really stable. My take is that this has already happened. The only problem from my perspective is getting someone to find time to review such a massive patch. I've been wondering whether we could get away with some kind of 'statistical' review: do a large-scale review, and then instead of having someone go through every line of C code, pick a few representative sections at random and review those. If those code portions make it through the review unscathed, declare the code good and merge it in. |
|||
msg148669 - (view) | Author: Stefan Krah (skrah) * | Date: 2011-11-30 16:05 | |
Binary versus decimal --------------------- > There is already gmpy and bigfloat, based on the heavily optimized GMP library, > for example. Is it a license issue? Can't we reuse GMP/MPFR to offer a Decimal API? _decimal is a PEP-399 compliant C implementation of decimal.py. The underlying standard is Cowlishaw/IBM's "General Decimal Arithmetic Specification". decimal.py is used for standard-conforming financial calculations. There is no way to implement this in a reasonable manner using a binary floating point library. Additionally, _decimal is also heavily optimized. In fact, for small precisions the module has the same speed as gmpy! Soundness and code size ----------------------- > _decimal should maybe first be distributed as a third party library until > it is really well tested and its API is really stable, until we can decide > to integrate it. Except for a different directory structure, the cdecimal module is identical to this patch. cdecimal has been distributed for almost two years now and has been on pypi for a year. There have been many downloads from financial institutions, stock exchanges and also research institutes. I know for a fact from a private email correspondence that libmpdec is used in a billing application of a large national NIC. cdecimal *appears* to be huge because it has a test suite that actually provides 100% code coverage. Indeed this means that even every possible malloc failure is simulated together with an assertion that the result of the function is (NaN, Malloc_error). The test suite now tests against both decimal.py and decNumber. It has found several small issues in decimal.py, a bug in netlib's dtoa.c, a bug in gmp and a bug in CompCert. The latest tests against decNumber have found 18 issues in decNumber (that I haven't reported yet). In the past 8 months, regression tests for cdecimal-2.3 have been running trillions of test cases both with and without Valgrind. Review ------ The patch could be audited by focusing on basearith.c, cdecimal.c and mpdecimal.c. cdecimal.c is a long but simple wrapper around libmpdec. mpdecimal.c contains all functions of the specification. I contend that for a C programmer mpdecimal.c is not significantly harder to read than decimal.py. The tricky algorithms (newtondiv, invroot, sqrt-via-invroot and ln) have mechanical proofs in ACL2. An initial audit could certainly disregard convolute.c, crt.c, difradix2.c, fnt.c, numbertheory.c, transpose.c and umodarith.h. These are only needed for the number theoretic transform that kicks in at around 22000 digits. Context type safety ------------------- > The patch adds __setattr__ to the Decimal class. Making the context more strictly typed has instantly found a bug in one of decimal.py's docstring tests: # This doctest has always passed: >>> c = Context(ExtendedContext) # But the operation is meaningless: >>> c Traceback (most recent call last): File "<stdin>", line 1, in <module> File "/usr/lib/python2.7/decimal.py", line 3708, in __repr__ % vars(self)) TypeError: %d format: a number is required, not Context >>> What is the concern about __setattr__? For *setting* contexts, speed is not so important (for reading contexts it is). |
|||
msg148677 - (view) | Author: Stefan Krah (skrah) * | Date: 2011-11-30 17:17 | |
Mark Dickinson <report@bugs.python.org> wrote: > The only problem from my perspective is getting someone to find time to review such a massive patch. I've been wondering whether we could get away with some kind of 'statistical' review: do a large-scale review, and then instead of having someone go through every line of C code, pick a few representative sections at random and review those. If those code portions make it through the review unscathed, declare the code good and merge it in. The regex module is in a somewhat similar situation. If I'm interpreting this http://mail.python.org/pipermail/python-dev/2011-August/113240.html dialogue correctly, a complete audit down to the last line isn't always necessary. |
|||
msg148678 - (view) | Author: Antoine Pitrou (pitrou) * | Date: 2011-11-30 17:27 | |
> If I'm interpreting this > http://mail.python.org/pipermail/python-dev/2011-August/113240.html > dialogue correctly, a complete audit down to the last line isn't > always necessary. It is also helped by the fact you are a core developer and we trust you to be here to do maintenance :) I'd add that decimal is a fairly specialized module and the implications of a new engine are not as wide-reaching as a new regex engine. Especially if the pure Python version is still there. I think it's still probably a good idea to probe python-dev, if that hasn't already happened. |
|||
msg148680 - (view) | Author: Raymond Hettinger (rhettinger) * | Date: 2011-11-30 19:10 | |
We've been wanting this for a long time. Strong +1 from me. |
|||
msg148682 - (view) | Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * | Date: 2011-11-30 20:18 | |
I can help with the review. Is http://bugs.python.org/review/7652/show a good starting point? I already have some comments. |
|||
msg148687 - (view) | Author: Stefan Krah (skrah) * | Date: 2011-11-30 21:10 | |
Antoine Pitrou <report@bugs.python.org> wrote: > It is also helped by the fact you are a core developer and we trust you to > be here to do maintenance :) Sure. The specification doesn't really change, so the work will hopefully be limited. :) > I think it's still probably a good idea to probe python-dev, if that > hasn't already happened. Yes, I'm planning to do that. |
|||
msg148688 - (view) | Author: Stefan Krah (skrah) * | Date: 2011-11-30 21:12 | |
Raymond Hettinger <report@bugs.python.org> wrote: > We've been wanting this for a long time. > > Strong +1 from me. Thank you, Raymond! |
|||
msg148689 - (view) | Author: STINNER Victor (vstinner) * | Date: 2011-11-30 21:13 | |
> (Actually, that's a part of why decimal.py is slow---it's > using Python's *binary* integers to store *decimal* coefficients, > so that even simple addition is now a quadratic operation, > thanks to the binary <-> decimal conversions involved.) Oh, I forgot this minor detail (base 2 vs base 10). |
|||
msg148690 - (view) | Author: Stefan Krah (skrah) * | Date: 2011-11-30 21:15 | |
Amaury Forgeot d'Arc <report@bugs.python.org> wrote: > I can help with the review. Is http://bugs.python.org/review/7652/show a > good starting point? I already have some comments. Yes, that would be great. Apart from two or three changes that I still need to push patch set 4 is the latest version. |
|||
msg148691 - (view) | Author: Stefan Krah (skrah) * | Date: 2011-11-30 21:28 | |
Stefan Krah <report@bugs.python.org> wrote: > Yes, that would be great. Apart from two or three changes that I still > need to push patch set 4 is the latest version. Hmm, no. I'll create a slightly newer patch from Oct. 1st. |
|||
msg148721 - (view) | Author: Stefan Krah (skrah) * | Date: 2011-12-01 13:38 | |
[Amaury] > Overall, I think that the "mpd" C library should be better separated from the > _decimal module (a bit like _ctypes, with the libffi library): its own configure > & makefile, its own test suite... which are not necessarily related to Python. Except for its own directory libmpdec has all that (See LIBTEST.txt). Library tests are in the tests/ directory, python tests in the python/ directory. Are you suggesting to build a static library and then use that to build the module? I remember this didn't work on Windows (not to mention AIX and such). |
|||
msg149556 - (view) | Author: Stefan Krah (skrah) * | Date: 2011-12-15 14:00 | |
Amaury has asked for more comments (and I agree). However, I'm not sure what level of detail would be appropriate. As an example, I've posted the full proof of the x87 modular multiplication in umodarith.h. Even with the Coq parts stripped, this would still be a massive comment. Would you prefer that level of detail or should I just post the core of the algorithm? |
|||
msg149557 - (view) | Author: Stefan Krah (skrah) * | Date: 2011-12-15 14:03 | |
Stefan Krah <report@bugs.python.org> wrote: > Would you prefer that level of detail or should I just post the core > of the algorithm? Argh. s/post/add to comments in umodarith.h/ |
|||
msg149559 - (view) | Author: Antoine Pitrou (pitrou) * | Date: 2011-12-15 14:12 | |
> Amaury has asked for more comments (and I agree). However, I'm not sure what > level of detail would be appropriate. As an example, I've posted the full > proof of the x87 modular multiplication in umodarith.h. > > > Even with the Coq parts stripped, this would still be a massive comment. You could ship it as a separate .txt file (like we have e.g. Objects/dict_notes.txt). |
|||
msg149567 - (view) | Author: Amaury Forgeot d'Arc (Amaury.Forgeot.d'Arc) * | Date: 2011-12-15 17:33 | |
2011/12/15 Stefan Krah <report@bugs.python.org> > > Stefan Krah <stefan-usenet@bytereef.org> added the comment: > > Amaury has asked for more comments (and I agree). However, I'm not sure > what > level of detail would be appropriate. As an example, I've posted the full > proof of the x87 modular multiplication in umodarith.h. > > > Even with the Coq parts stripped, this would still be a massive comment. > > > Would you prefer that level of detail or should I just post the core > of the algorithm? > For my part, a two-lines description of the purpose of file is enough. Something like "Routines for the reverse transmogrification of randomized digits, used in multiplication of numbers above 2**32 bits" Or something else that makes sense :-) At least something that makes it clear that I don't have to read further if I'm only interested in the definition of Python classes for example. |
|||
msg149600 - (view) | Author: Stefan Krah (skrah) * | Date: 2011-12-16 08:51 | |
> For my part, a two-lines description of the purpose of file is enough. OK, I'll go for small comments in the files themselves and put big ones in separate files. |
|||
msg153447 - (view) | Author: STINNER Victor (vstinner) * | Date: 2012-02-15 23:14 | |
I tried my timestamp patch with _decimal: it fails because decimal and _decimal API is not exactly the same. decimal.Decimal.__truediv__() has an optional context argument, whereas _decimal defines PyNumberMethods. decimal.Decimal.quantize() second argument is optional and its default value is None, but if I pass None to _decimal.Decimal.quantize(), I get a TypeError because _decimal expects an integer. |
|||
msg153506 - (view) | Author: Stefan Krah (skrah) * | Date: 2012-02-16 22:25 | |
STINNER Victor <report@bugs.python.org> wrote: > decimal.Decimal.__truediv__() has an optional context argument, whereas _decimal defines PyNumberMethods. Regarding the special methods: decimal.py uses the optional context arguments for convenience so that these methods can be re-used in other places. I wouldn't consider this context argument as part of the API. > decimal.Decimal.quantize() second argument is optional and its default value > is None, but if I pass None to _decimal.Decimal.quantize(), I get a TypeError > because _decimal expects an integer. About this I'm not sure. I think type errors are a courtesy to the user. Look what is possible now in decimal.py: Decimal('9') But here the argument might well be made for accepting None (and only None apart from rounding modes). - I hope Mark and Raymond will give their opinions, too. |
|||
msg153507 - (view) | Author: Stefan Krah (skrah) * | Date: 2012-02-16 22:28 | |
I walked into the Roundup trap again: >>> Decimal(9).quantize(1, "?!?!?") Decimal('9') |
|||
msg154074 - (view) | Author: Stefan Krah (skrah) * | Date: 2012-02-23 17:33 | |
Over the last two months I've done a full review of all files except _decimal.c and mpdecimal.c. I now have additional ACL2 proofs for a couple of functions in basearith.c (some are partial proofs), a full proof for the special case (d = 10**19) of Granlund/Montgomery's "Divide double word by constant" algorithm and a full proof for the Chinese Remainder Theorem in crt.c. I didn't find anything important. I found a couple of useless variable initializations, missing comments etc. There was one (deliberate) incompatibility in the format function: If rounding lead to an Overflow, _decimal printed 'Infinity'. _decimal can't read back numbers that are out of bounds, so this was done to ensure that a roundtrip is always possible. But I changed the function so _decimal now uses exactly the same algorithm for formatting as decimal.py. |
|||
msg154988 - (view) | Author: STINNER Victor (vstinner) * | Date: 2012-03-05 23:15 | |
How can I help to integrate this module into CPython? The test suite pass in debug and release mode without any failure on my Linux box (64 bits, running Ubuntu 11.10). |
|||
msg154992 - (view) | Author: Raymond Hettinger (rhettinger) * | Date: 2012-03-05 23:47 | |
Victor, yes, the decimal module needs a C implementation. Without it, the pure Python code is abysmally slow. Other MP implementations don't fill the need or come close to implementing the decimal arithmetic spec. |
|||
msg155031 - (view) | Author: Jim Jewett (Jim.Jewett) * | Date: 2012-03-06 18:23 | |
(1) I think this module would benefit greatly from a map explaining what each file does, and perhaps from some reorganization. As best I can yet tell, there are about ~130 files, over a dozen directories, but the only ones that directly affect the implementation are a subset (~33) of the *.c and *h files in Modules/_decimal/ (and not subdirectories). Even files that do affect the implementation, such as mpdecimal.c, also seem to have functions thrown in just for testing small pieces of functionality, such as Newton Division. There may also be some code that really isn't needed, except possibly for backwards compatibility, and could be #ifdef'ed or at least commented. For example, the comments above io.c function _mpd_strneq(const char *s, const char *l, const char *u, size_t n) mention working around the Turkish un/dotted-i problem when lowercasing -- but why is a decimal library even worried about casing? (2) Is assembly allowed? If not, please make it clear that vcdiv64.asm is just an optional speedup, and that the code doesn't rely upon it. (3) Are there parts of this library that provide functionality NOT in the current decimal library? If so, this should be at least documented, and perhaps either removed or exposed. |
|||
msg155034 - (view) | Author: Stefan Krah (skrah) * | Date: 2012-03-06 20:07 | |
Jim, thanks for taking a look at this. Jim Jewett <report@bugs.python.org> wrote: > (1) I think this module would benefit greatly from a map explaining > what each file does, and perhaps from some reorganization. Just MAP.txt in the top level directory? Amaury suggested moving the library into a subdirectory. I'm not sure about that. The library would be out of sight, but is that a good thing? > As best I can yet tell, there are about ~130 files, over a dozen directories, > but the only ones that directly affect the implementation are a subset (~33) > of the *.c and *h files in Modules/_decimal/ (and not subdirectories). Indeed the top level directory contains _decimal.c and all files from libmpdec. Almost all files are required. The three subdirectories contain: tests/ -> library tests python/ -> extended module tests literature/ -> pointers to papers and explanations for trickier algorithms. > Even files that do affect the implementation, such as mpdecimal.c, > also seem to have functions thrown in just for testing small pieces > of functionality, such as Newton Division. That is correct. They were deliberately added in that place because they rely on a couple of inline functions and I have a policy of testing the exact code that the original function relies on. The alternative is to extract all functions needed, move them to the test directory and hope that the code doesn't get stale. But if you have a better idea, I'd be glad to hear it: I don't like the test functions in that place either. The reason that Newton Division is tested for ridiculously small precisions like prec=1 is that it should pass IBM's test suite just like the regular division function. (Also, small precisions are most likely to expose problems). > There may also be some code that really isn't needed, except possibly for > backwards compatibility, and could be #ifdef'ed or at least commented. I'm not aware of any except for whole files: mpsignal.c -> signaling wrappers for the mpdecimal.c functions, not needed for _decimal.c but part of libmpdec. mptest.h -> header for running the tests. bench.c -> library benchmark. > Turkish un/dotted-i problem when lowercasing -- but why is a decimal > library even worried about casing? "Infinity", "InFinItY", "iNF" are all allowed by the specification. > (2) Is assembly allowed? I was under the assumption that it is allowed: Python/pymath.c:23: __asm__ __volatile__ ("fnstcw %0" : "=m" (cw)); Python/pymath.c:28: __asm__ __volatile__ ("fldcw %0" : : "m" (cw)); Python/ceval.c:43: asm volatile ("mftbu %0" : "=r" (tbu) ); Python/ceval.c:44: asm volatile ("mftb %0" : "=r" (tb) ); Python/ceval.c:45: asm volatile ("mftbu %0" : "=r" (tbu2)); Python/ceval.c:59: __asm__ __volatile__("rdtsc" : "=A" (val)) Python/ceval.c:69: __asm__ __volatile__("rdtsc" : \ > If not, please make it clear that vcdiv64.asm is just an optional speedup, > and that the code doesn't rely upon it. No code relies on asm. Assembly is only used for the double word mul/divmod primitives in typearith.h and the Pentium PRO modular multiplication in umodarith.h, and there are ANSI versions for everything. The library really compiles with any compiler I have tested, including compilers without uint64_t like CompCert (CompCert does not compile Python for example, but for other reasons). > (3) Are there parts of this library that provide functionality NOT > in the current decimal library? If so, this should be at least > documented, and perhaps either removed or exposed. Apart from mpsignal.c (see above), there are probably a couple of things in the header files like mpd_invroot(). _mpd_qinvroot() from mpdecimal.c *is* needed because the square root is calculated in terms of the inverse square root. Are these (probably) minor instances of additional functionality a big problem for you? Because for me it would be a hassle to maintain diverging versions of libmpdec and the Python version of libmpdec. This is also related to testing: The complete test suite (all tests against decNumber and decimal.py) under Valgrind takes 8 months to run. My question is: Where should I document these things and in what detail? |
|||
msg155036 - (view) | Author: Stefan Krah (skrah) * | Date: 2012-03-06 20:14 | |
STINNER Victor <report@bugs.python.org> wrote: > How can I help to integrate this module into CPython? It would be fantastic if you could take a look at _decimal.c, for example to find some incompatibilities between _decimal.c and decimal.py. mpdecimal.c could also always profit from another audit. That's the only file that still needs to go through my second self-audit round. |
|||
msg155045 - (view) | Author: Benjamin Peterson (benjamin.peterson) * | Date: 2012-03-07 03:49 | |
The scripts for generating code would preferably go in a Tools/decimal directory. |
|||
msg155046 - (view) | Author: Jim Jewett (Jim.Jewett) * | Date: 2012-03-07 04:03 | |
On Tue, Mar 6, 2012 at 3:07 PM, Stefan Krah > Jim Jewett <report@bugs.python.org> wrote: >> (1) I think this module would benefit greatly from a map explaining >> what each file does, and perhaps from some reorganization. > Just MAP.txt in the top level directory? That should work. There may be better names, but I can't think of any just now. > Amaury suggested moving > the library into a subdirectory. I'm not sure about that. The > library would be out of sight, but is that a good thing? cdecimal certainly needs a subdirectory similar to those for _io, _ctypes, _multiprocessing, and _sqlite. It *may* make sense to move some of the subdirectories around. (On the other hand, it may not; if the tests in Lib/test/ end up delegating back, that is probably OK.) I believe it would be helpful to move non-code (project files, etc) to separate directories. Whether you need *additional* subdirectories within _cdecimal to subcategorize the .c and .h files, I'm not sure -- because I didn't get in deep enough to know what they should be. If the categorization let people focus on the core, that would be helpful, but it wasn't clear to me which files were part of the exported API and which were implementation details. Are there are clear distinctions (type info/python bindings/basic arithmetic/advanced algorithms/internal-use-only/???) >> As best I can yet tell, there are about ~130 files, over a dozen directories, >> but the only ones that directly affect the implementation are a subset (~33) >> of the *.c and *h files in Modules/_decimal/ (and not subdirectories). > Indeed the top level directory contains _decimal.c and all files > from libmpdec. Almost all files are required. Would it make sense to integrate only cdecimal, and to treat libmpdec as an external dependency that (usually) gets updated with each Python feature release, the way that sqlite is? > The three subdirectories contain: > tests/ -> library tests > python/ -> extended module tests I would really expect that to still be under tests, and I would expect a directory called python to contain code written in python, or at least python bindings. > literature/ -> pointers to papers and explanations for trickier algorithms. >> Even files that do affect the implementation, such as mpdecimal.c, >> also seem to have functions thrown in just for testing small pieces >> of functionality, such as Newton Division. > That is correct. They were deliberately added in that place because they rely > on a couple of inline functions and I have a policy of testing the exact code > that the original function relies on. How important is it that these functions be inline? Would it at least be OK to wrap them in stubs for exporting, so that the test logic could be places with the others tests? (I worry that some tests may stop getting run if someone else modifies the build process and doesn't notice the unusual location.) > The alternative is to extract all functions needed, move them to the test > directory and hope that the code doesn't get stale. I agree that copying is bad. I'll trust your judgement on the need for inline. But given: ALWAYS_INLINE int mpd_word_digits(mpd_uint_t word) I don't see anything wrong with exporting: int _testhelp_mpd_word_digits(mpd_uint_t word) { return mpd_word_digits(word); } >> Turkish un/dotted-i problem when lowercasing -- but why is a decimal >> library even worried about casing? > "Infinity", "InFinItY", "iNF" are all allowed by the specification. OK; so is io.c part of the library, or part of the python binding? Given that this is targeted at 3.3 or later, would it make sense to either use casefolding, or check the kind? (If it isn't ASCII, it can't be the word "INF".) Are there only a certain number of strings that will ever matter, such as INF, NAN, and INFINITY, so that a case statement would work? tolower() with an extra check for the turkish undotted lower case i? What you have may well be the best compromise, but it bothers me to see text processing tools redone in a numeric type -- particularly without knowing why they are needed. >> (2) Is assembly allowed? > I was under the assumption that it is allowed: I'm honestly not sure, but I think that was one of the reasons stackless was never integrated. >> If not, please make it clear that vcdiv64.asm is just an optional speedup, >> and that the code doesn't rely upon it. > No code relies on asm. Assembly is only used for the double word mul/divmod > primitives in typearith.h and the Pentium PRO modular multiplication in > umodarith.h, and there are ANSI versions for everything. Good enough, though I would rather see that as a comment near the assembly. >> (3) Are there parts of this library that provide functionality NOT >> in the current decimal library? If so, this should be at least >> documented, and perhaps either removed or exposed. > Apart from mpsignal.c (see above), there are probably a couple of things > in the header files like mpd_invroot(). _mpd_qinvroot() from mpdecimal.c > *is* needed because the square root is calculated in terms of the > inverse square root. > Are these (probably) minor instances of additional functionality a > big problem for you? Because for me it would be a hassle to maintain > diverging versions of libmpdec and the Python version of libmpdec. I'm not worried about the header files. I am worried about what is exposed to python, but just documenting it (docstrings and the module .rst) may be OK. But I'm also worried that there may be fair amounts of code that are effectively dead after the "remove any names not in decimal.py" importing trick. If so, I would at least like that in some sort of #ifdef, so that people don't spend too much time trying to make sense of it. That said, if you plan to maintain an external libmpdec regardless of what happens, then it makes even more sense to integrate (at most) the bindings, and to treat libmpdec as an external dependency. |
|||
msg155047 - (view) | Author: Benjamin Peterson (benjamin.peterson) * | Date: 2012-03-07 04:09 | |
Speaking of inline, the "inline" keyword will have to go because it's not C89. |
|||
msg155050 - (view) | Author: Ramchandra Apte (Ramchandra Apte) * | Date: 2012-03-07 06:40 | |
But we could check if the compiler supports the inline keyword and use it if available. |
|||
msg155059 - (view) | Author: Case Van Horsen (casevh) | Date: 2012-03-07 08:30 | |
I've found some differences between decimal and cdecimal. cdecimal 2.3 does not support the __ceil__ and __floor__ methods that exist in decimal. math.ceil converts a cdecimal.Decimal instance into a float before finding the ceiling. This can generate incorrect results. >>> import decimal >>> import math >>> math.ceil(decimal.Decimal("12345678901234567890.1")) 12345678901234567168 The decimal module in previous versions returns the correct answer 12345678901234567891 cdecimal.Decimal instances do not emulate the various single-underscore methods of a decimal.Decimal instance. In gmpy2, I use _int, _exp, _sign, and _is_special to convert a decimal.Decimal into an exact fraction. I realize the issue is with gmpy2 and I will fix gmpy2, but there may be other code that uses those methods. |
|||
msg155070 - (view) | Author: Stefan Krah (skrah) * | Date: 2012-03-07 10:28 | |
Jim Jewett <report@bugs.python.org> wrote: > Whether you need *additional* subdirectories within _cdecimal to > subcategorize the .c and .h files, I'm not sure -- because I didn't > get in deep enough to know what they should be. If the categorization > let people focus on the core, that would be helpful, but it wasn't > clear to me which files were part of the exported API and which were > implementation details. Are there are clear distinctions (type > info/python bindings/basic arithmetic/advanced > algorithms/internal-use-only/???) OK, as a basis for discussion I've added: http://hg.python.org/features/cdecimal/file/8b75c2825508/Modules/_decimal/FILEMAP.txt I didn't mention the main reason why _decimal.c and libmpdec are in a flat directory: Building the library first and then the module from the library led to problems on at least Windows and AIX. That's why I started to treat all libmpdec files as part of the module, list them as dependencies in setup.py and let distutils figure everything out. Distutils also can figure out automatically if a Mac OS build happens to be a "universal" build and things like that. The build process is very well tested by now and it took quite a while to figure everything out, so I'd be reluctant to change the flat hierarchy. > > ??python/ ?? ?? ??-> ??extended module tests > > I would really expect that to still be under tests, and I would expect > a directory called python to contain code written in python, or at > least python bindings. Could you explain? The python/ directory contains deccheck.py, formathelper.py etc. > Would it at least be OK to wrap them in stubs for exporting, so that > the test logic could be places with the others tests? (I worry that > some tests may stop getting run if someone else modifies the build > process and doesn't notice the unusual location.) tests/runtest.c won't compile then. I'll look into the stub and also the _testhelp suggestions. > > "Infinity", "InFinItY", "iNF" are all allowed by the specification. > > OK; so is io.c part of the library, or part of the python binding? I see a potential source of confusion: io.c is firmly part of the library. All PEP 3101 formatting is part of libmpdec, because I like the mini language. io.c only understands ASCII and UTF-8 fill characters. It is the *library* tests that would fail under the Turkish locale (if not for _mpd_strneq). > Good enough, though I would rather see that as a comment near the assembly. Comments how to enforce an ANSI build (much slower!) are in LIBTEST.txt and now also in FILEMAP.txt. > I'm not worried about the header files. I am worried about what is > exposed to python, but just documenting it (docstrings and the module > .rst) may be OK. > > But I'm also worried that there may be fair amounts of code that are > effectively dead after the "remove any names not in decimal.py" > importing trick. If so, I would at least like that in some sort of > #ifdef, so that people don't spend too much time trying to make sense > of it. It's the opposite: names from decimal.py starting with an underscore that are not in _decimal are removed. If I don't use that trick, I end up with about 50 additional symbols from decimal.py: import decimal # the C version dir(decimal) ... '_ContextManager', '_Infinity', '_Log10Memoize', ... |
|||
msg155071 - (view) | Author: Marc-Andre Lemburg (lemburg) * | Date: 2012-03-07 10:37 | |
Does the C version have a C API importable as capsule ? If not, could you add one and a decimal.h to go with it ? This makes integration in 3rd party modules a lot easier. Thanks, -- Marc-Andre Lemburg eGenix.com ________________________________________________________________________ 2012-02-13: Released eGenix pyOpenSSL 0.13 http://egenix.com/go26 2012-02-09: Released mxODBC.Zope.DA 2.0.2 http://egenix.com/go25 ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ |
|||
msg155079 - (view) | Author: Stefan Krah (skrah) * | Date: 2012-03-07 12:18 | |
Benjamin Peterson <report@bugs.python.org> wrote: > The scripts for generating code would preferably go in a Tools/decimal directory. Hmm, do you mean the gen*.py scripts? The output is generated by decimal.py and used for testing libmpdec. While the syntax resembles that of the *.decTest files, only tests/runtest can handle the format. |
|||
msg155082 - (view) | Author: Stefan Krah (skrah) * | Date: 2012-03-07 12:29 | |
Benjamin Peterson <report@bugs.python.org> wrote: > Speaking of inline, the "inline" keyword will have to go because it's not C89. Do you happen to know a free compiler that builds Python but does not understand "inline"? I'm asking because without testing you can never really be sure: For example I added support for compilers without uint64_t, but all major compilers (gcc, suncc, icc, VS) of course have uint64_t. Then I finally found CompCert, and discovered that a couple of things were missing in the headers. |
|||
msg155083 - (view) | Author: Antoine Pitrou (pitrou) * | Date: 2012-03-07 12:43 | |
> Do you happen to know a free compiler that builds Python but does not > understand "inline"? I'm asking because without testing you can never > really be sure: You could use Py_LOCAL_INLINE, but most compilers should inline small functions automatically, AFAIK. |
|||
msg155085 - (view) | Author: Stefan Krah (skrah) * | Date: 2012-03-07 13:22 | |
Case Van Horsen <report@bugs.python.org> wrote: > cdecimal 2.3 does not support the __ceil__ and __floor__ Thanks. I'll look into that. > cdecimal.Decimal instances do not emulate the various single-underscore methods of a decimal.Decimal instance. In gmpy2, I use _int, _exp, _sign, and _is_special to convert a decimal.Decimal into an exact fraction. I realize the issue is with gmpy2 and I will fix gmpy2, but there may be other code that uses those methods. There seems to be a real need for getting (sign, coeff, exp). I think psycopg2 uses a painful way to get an integer coefficient via as_tuple(). What should really be added is either as_triple(), which would return (sign, coeff, exp) with an integer coefficient or make these attributes official: Decimal.sign Decimal.coeff Decimal.exp I have to think about implementing Decimal._int etc. Somehow I feel that doing so would send a wrong signal: People shouldn't assume that they'll get away with using private methods and attributes. Also, of course they'll keep using these private methods until they are finally deprecated, and *then* they'll have to change things. |
|||
msg155086 - (view) | Author: Stefan Krah (skrah) * | Date: 2012-03-07 13:49 | |
Antoine Pitrou <report@bugs.python.org> wrote: > You could use Py_LOCAL_INLINE, but most compilers should inline small > functions automatically, AFAIK. At the time I wrote it I benchmarked everything. I'm pretty sure that gcc did not inline larger functions like mpd_qresize_zero() and mpd_word_digits() and even some smaller ones that are declared ALWAYS_INLINE. Also, the static inline functions in the header files are absolutely crucial for speed. I recall that Mark initially said that in the Modules hierarchy not every module would need to compile. Now, _decimal is already tested with gcc, clang, icc, suncc, Visual Studio, and success has been reported with xlc[1]. CompCert compiles libmpdec but not Python. _ctypes does not compile with icc and suncc. Unlike _ctypes, _decimal has a fallback in the form of decimal.py. So, perhaps as an alternative we could leave the inlines and wait for build failure reports? [1] compilation success, one of the numerous AIX issues on bugs.python.org with loading the module was encountered IIRC. |
|||
msg155087 - (view) | Author: Antoine Pitrou (pitrou) * | Date: 2012-03-07 13:52 | |
> _ctypes does not compile with icc and suncc. Unlike _ctypes, _decimal > has a fallback in the form of decimal.py. So, perhaps as an alternative > we could leave the inlines and wait for build failure reports? Sounds good to me. |
|||
msg155088 - (view) | Author: Stefan Krah (skrah) * | Date: 2012-03-07 13:54 | |
Jim Jewett <report@bugs.python.org> wrote: > implementation details. Are there are clear distinctions (type > info/python bindings/basic arithmetic/advanced > algorithms/internal-use-only/???) I failed to mention that libmpdec also has complete documentation. Perhaps that can answer some questions: http://www.bytereef.org/mpdecimal/doc/libmpdec/index.html http://www.bytereef.org/mpdecimal/doc/libmpdec/functions.html#quiet-functions |
|||
msg155093 - (view) | Author: Stefan Krah (skrah) * | Date: 2012-03-07 15:32 | |
Marc-Andre Lemburg <report@bugs.python.org> wrote: > Does the C version have a C API importable as capsule ? Not yet. I'll try to make a list with proposed function names and post it here. > If not, could you add one and a decimal.h to go with it ? Yes, sure. |
|||
msg155095 - (view) | Author: Stefan Krah (skrah) * | Date: 2012-03-07 16:17 | |
This issue was raised by Jim on Rietveld: Currently, the order of arguments in Context.__init__() differs from repr(Context): >>> Context() Context(prec=28, rounding=ROUND_HALF_EVEN, Emin=-999999999, Emax=999999999, capitals=1, flags=[], traps=[DivisionByZero, Overflow, InvalidOperation]) >>> >>> Context(28, ROUND_HALF_EVEN, -999999999, 999999999, 1, [], [DivisionByZero, Overflow, InvalidOperation]) Traceback (most recent call last): File "<stdin>", line 1, in <module> File "/usr/lib/python3.2/decimal.py", line 3774, in __init__ flags = dict([(s, int(s in flags)) for s in _signals]) File "/usr/lib/python3.2/decimal.py", line 3774, in <listcomp> flags = dict([(s, int(s in flags)) for s in _signals]) TypeError: argument of type 'int' is not iterable >>> I find this quite ugly. The repr() order is the good one, since it moves the flag dictionaries to the end. I wanted to change the order in Context.__init__() to match repr(), but I'm not sure if such a change is possible. I don't think Python code would initialize a context without keywords, but C extensions might. |
|||
msg155107 - (view) | Author: Jim Jewett (Jim.Jewett) * | Date: 2012-03-07 19:42 | |
On Wed, Mar 7, 2012 at 5:28 AM, Stefan Krah <stefan-usenet@bytereef.org> added the comment: > OK, as a basis for discussion I've added: > http://hg.python.org/features/cdecimal/file/8b75c2825508/Modules/_decimal/FILEMAP.txt That is indeed very helpful. So helpful that now I understand well enough to have additional gripes. :D Starting from that URL, I don't actually find setup.py. I am assuming (but would prefer to have the file explicitly state) that _decimal.c and docstrings.h are the only source files, and that setup.py would be the only build infrastructure needed if you already had libmpdec.a. I'm not sure what sort of failures building in the normal order led to, but that is certainly something worth documenting, and (ideally) fixing. I didn't see any mention of the literature subdirectory, which makes me wonder if there were other files not listed in the map. (Not yet curious enough to verify for myself, though.) (*.txt files?) I suppose a subdirectory name like "python" makes sense when you look at the library as a C/C++ project that happens to provide python bindings; as part of the python core, it is misleading. It sounds like it should be named "extended_tests" or some such. (Note that this assumes it is strictly for tests; if you are also using it to provide extra functionality, or to generate some of the source code, then I agree with Benjamin that it should move to tools, and the generated code should have clear comments right at the top warning that it is generated.) Within the library, does io.[ch] really limit itself to ASCII? If so, then I don't know why you're worried about the Turkish i. If you mean generic text, then please don't specify ASCII. Within memory.[ch], how much of this configurability is useful (or even available) to a python user, as opposed to an extension writer? Or is it really just for testing? As in, would it be reasonable to just have a single header with a half-dozen #defines, but to replace that header when doing a memory-test build? Or at least to hide the alternative function definitions inside an obvious #ifdef, so that they don't take up memory and attention when they aren't applicable? Under the Bignum section, it mentions that functions from these files are ultimately used in _mpd_fntmul, but doesn't mention where that is (in the main _cdecimal.c) Also, when it talks about "large arrays", could you clarify that it isn't talking about arrays of values or even matrixes, it is just talking about numbers large enough that even representing them takes at least N bytes? >> Would it at least be OK to wrap them in stubs for exporting, so that >> the test logic could be places with the others tests? (I worry that >> some tests may stop getting run if someone else modifies the build >> process and doesn't notice the unusual location.) > tests/runtest.c won't compile then. I'll look into the stub and also > the _testhelp suggestions. OK, let me rephrase. Is newton division exported to users, or used internally, or is it just for testing purposes? _mpd_qtest_newtondiv is documented as a testcase; I would rather see it move to a test file. Why can't it? If it is because of _mpd_qdiv_inf, _settriple, _mpd_qbarrett_divmod, _mpd_real_size (the only apparently internal things it uses), then why are these internal? Could they be exposed at least to test functions? If not, could they at least be wrapped in something that could exposed, such as _testhelp_mpd_qdiv_inf? >> > "Infinity", "InFinItY", "iNF" are all allowed by the specification. >> OK; so is io.c part of the library, or part of the python binding? > I see a potential source of confusion: io.c is firmly part of the > library. [And should therefore be available when used without python?] > All PEP 3101 formatting is part of libmpdec, because I like > the mini language. io.c only understands ASCII and UTF-8 fill characters. > It is the *library* tests that would fail under the Turkish locale > (if not for _mpd_strneq). Are those tests relevant to a _cdecimal built in to python itself? If not, then the comment is at least misleading. Would it make sense to have a directory for files that are used only for the standalone C library, but not when built as part of python? >> Good enough, though I would rather see that as a comment near the assembly. > Comments how to enforce an ANSI build (much slower!) are in LIBTEST.txt > and now also in FILEMAP.txt. I would not have made the leap from that to "What to do if the assembly code needs to be replaced or even just changed." |
|||
msg155135 - (view) | Author: Stefan Krah (skrah) * | Date: 2012-03-08 00:06 | |
Jim Jewett <report@bugs.python.org> wrote: > > OK, as a basis for discussion I've added: > > http://hg.python.org/features/cdecimal/file/8b75c2825508/Modules/_decimal/FILEMAP.txt > > Starting from that URL, I don't actually find setup.py. It's the setup.py from the Python top level directory. > I'm not sure what sort of failures building in the normal order led > to, but that is certainly something worth documenting, and (ideally) > fixing. I do not have access to AIX. On Windows the failures were locale related when mixing the dynamic and static runtimes. > I didn't see any mention of the literature subdirectory, which makes > me wonder if there were other files not listed in the map. (Not yet > curious enough to verify for myself, though.) (*.txt files?) FILEMAP.txt was intended to get people started, not to be a polished work. > I suppose a subdirectory name like "python" makes sense when you look > at the library as a C/C++ project that happens to provide python > bindings; as part of the python core, it is misleading. OK. > provide extra functionality, or to generate some of the source code, There is no source code generation. I'm not sure where this idea comes from. genlocale.py e.g. has this comment: Usage: ../../../python genlocale.py | ../tests/runtest - > Within the library, does io.[ch] really limit itself to ASCII? If so, > then I don't know why you're worried about the Turkish i. If you mean > generic text, then please don't specify ASCII. I do mean ASCII. Please run this gdb session: diff --git a/Modules/_decimal/io.c b/Modules/_decimal/io.c --- a/Modules/_decimal/io.c +++ b/Modules/_decimal/io.c @@ -245,7 +245,7 @@ if (digits > (size_t)(ctx->prec-ctx->clamp)) goto conversion_error; } - else if (_mpd_strneq(s, "inf", "INF", 3)) { + else if (strncasecmp(s, "inf", 3) == 0) { s += 3; if (*s == '\0' || _mpd_strneq(s, "inity", "INITY", 6)) { /* numeric-value: infinity */ gdb ../../python b mpd_qset_string r >>> locale.setlocale(locale.LC_ALL, 'tr_TR.utf8') 'tr_TR.utf8' >>> from decimal import * >>> Decimal("Infinity") (gdb) 248 else if (strncasecmp(s, "inf", 3) == 0) { (gdb) p s $1 = 0x7ffff7191280 "Infinity" (gdb) p s[0] $2 = 73 'I' (gdb) n 259 if ((coeff = scan_dpoint_exp(s, &dpoint, &exp, &end)) == NULL) Clearly 'I' is ASCII and strncasecmp fails, exactly as written in the comment above _mpd_strneq(). > Within memory.[ch], how much of this configurability is useful (or > even available) to a python user, as opposed to an extension writer? Since it is in the libmpdec section, "User" refers to the extension writer. I can simply drop the "User". > Under the Bignum section, it mentions that functions from these files > are ultimately used in _mpd_fntmul, but doesn't mention where that is > (in the main _cdecimal.c) OK. > Also, when it talks about "large arrays", could you clarify that it > isn't talking about arrays of values or even matrixes, it is just > talking about numbers large enough that even representing them takes > at least N bytes? But it is referring to abstract sequences or arrays of values less than a certain prime. These values happen to be the coefficient of an mpd_t, but you could perform the transform on any sequence. I thought 'Bignum' would already imply an array of machine words representing a number. > >> Would it at least be OK to wrap them in stubs for exporting, so that > >> the test logic could be places with the others tests? ??(I worry that > >> some tests may stop getting run if someone else modifies the build > >> process and doesn't notice the unusual location.) > > > tests/runtest.c won't compile then. I'll look into the stub and also > > the _testhelp suggestions. > > OK, let me rephrase. Is newton division exported to users, or used > internally, or is it just for testing purposes? It's used internally as _mpd_qbarrett_divmod(). When the coefficient has more than MPD_NEWTONDIV_CUTOFF words, the division functions dispatch to _mpd_qbarrett_divmod(). > _mpd_qtest_newtondiv is documented as a testcase; I would rather see > it move to a test file. Why can't it? I said in my last mail that I'll look into it. > >> > "Infinity", "InFinItY", "iNF" are all allowed by the specification. > > >> OK; so is io.c part of the library, or part of the python binding? > > > I see a potential source of confusion: io.c is firmly part of the > > library. > > [And should therefore be available when used without python?] I meant: Despite the fact that io.c might /appear/ to be specifically written for the module because of the presence of PEP 3101 references, it is part of the standalone (moduleless) library. However, _decimal.c uses all parts of io.c except for a couple of functions at the bottom of the file that are useful for debugging. |
|||
msg155331 - (view) | Author: Stefan Krah (skrah) * | Date: 2012-03-10 18:01 | |
Here's a new patch that addresses several remarks: o _decimal now has __floor__ and __ceil__methods. o libmpdec is now in a separate directory. o I removed the big libmpdec test suite. This is why: - It reduces the number of files that has been criticized multiple times. - Probably I'll be the only one who will be running the test suite. - The test suite is still on PyPI. - The test suite is so large that it's a magnet for toolchain bugs. For example, it found a bug in suncc (acknowledged and fixed by Sun) and a bug in CompCert (acknowledged and fixed by Xavier Leroy). I'm really worried that *if* someone runs the tests and finds an obscure bug, I'll have to spend hours debugging something that might ultimately be an external bug. o Renamed "python" directory into "tests". o Test functions in mpdecimal.c are removed. o All files in libmpdec that aren't required for _decimal are removed. o Extra functionality in _decimal.c is commented out. This does not yet cover the FloatOperation signal, which I would like to add to decimal.py as well (I'll ask later on python-dev). |
|||
msg155359 - (view) | Author: Arfrever Frehtes Taifersar Arahesis (Arfrever) * | Date: 2012-03-10 22:57 | |
Please add --with-system-libmpdec (or --with-system-mpdecimal) option in `configure`, similarly to --with-system-expat and --with-system-ffi options. |
|||
msg155381 - (view) | Author: Stefan Krah (skrah) * | Date: 2012-03-11 11:05 | |
Benjamin Peterson <report@bugs.python.org> wrote: > Speaking of inline, the "inline" keyword will have to go because it's not C89. Actually the trickier instances of "inline" in the .c files are already suppressed when LEGACY_COMPILER (i.e. C89) is defined. I've now listed the machine options here: http://hg.python.org/features/cdecimal/file/0f032cda94aa/Modules/_decimal/README.txt As I now remember, that was in fact necessary for CompCert. The "static inline" instances in header files might not be a problem even for embedded compilers, see e.g.: http://embeddedgurus.com/barr-code/2011/03/do-inline-function-bodies-belong-in-c-header-files/ IIRC also the Linux kernel uses "static inline" in header files. |
|||
msg155419 - (view) | Author: Raymond Hettinger (rhettinger) * | Date: 2012-03-11 23:18 | |
FWIW, I think we would be better off if this patch were merged in soon. Waiting until later in the release cycle risks introducing bugs that we won't have time to notice or fix. An early merge lets more people exercise the code. |
|||
msg155644 - (view) | Author: Benjamin Peterson (benjamin.peterson) * | Date: 2012-03-13 18:44 | |
Stefan, please feel free to commit the latest patch. |
|||
msg155649 - (view) | Author: Mark Dickinson (mark.dickinson) * | Date: 2012-03-13 19:34 | |
> FWIW, I think we would be better off if this patch were merged in soon. +1 |
|||
msg155743 - (view) | Author: Stefan Krah (skrah) * | Date: 2012-03-14 11:19 | |
Mark Dickinson <report@bugs.python.org> wrote: > > FWIW, I think we would be better off if this patch were merged in soon. > > +1 OK, I'm counting three +1 for merging soon. Thanks everyone for the encouragement! I'll then proceed with the merge this weekend or shortly after, with one caveat: My *second* self-review of mpdecimal.c is currently at 17%, but I can as well continue with that if the whole thing is merged. I'll probably replace the caching of ln(10) (not thread-safe, protected by the GIL) with a real thread-safe version. |
|||
msg155744 - (view) | Author: Stefan Krah (skrah) * | Date: 2012-03-14 11:38 | |
Arfrever Frehtes Taifersar Arahesis <report@bugs.python.org> wrote: > Please add --with-system-libmpdec (or --with-system-mpdecimal) option in > `configure`, similarly to --with-system-expat and --with-system-ffi options. I've added that to the list of issues. However, if there is any change in the behavior of decimal.py that also applies to the library, the libmpdec shipped with Python will be silently updated. I can probably release a matching external libmpdec with every major Python release, but I can't promise to track all minor releases. This would mean that for Python-3.3 the yet unreleased mpdecimal-2.4, hopefully with thread-safe mpd_pow(), mpd_ln() and mpd_log10() should be used. |
|||
msg156402 - (view) | Author: Stefan Krah (skrah) * | Date: 2012-03-20 12:49 | |
We need to decide what to do with the different limits of the 64-bit and 32-bit versions: MAX_EMAX default context 10**9-1 64-bit 10**18-1 32-bit 425000000 I think it would be annoying to have the values in DefaultContext, ExtendedContext and BasicContext depend on the machine. The best thing might be to use Emax=10**8-1 and Emin=-(10**8-1) throughout. I don't think many applications depend on having Emax=10**9-1. If they do, they'll have to use the 64-bit version. |
|||
msg156480 - (view) | Author: Stefan Krah (skrah) * | Date: 2012-03-21 08:03 | |
> The best thing might be to use Emax=10**8-1 and Emin=-(10**8-1) throughout. > I don't think many applications depend on having Emax=10**9-1. If they do, > they'll have to use the 64-bit version. 10**6-1 would be another option. The advantage is that it's visually obvious that the default exponents have changed. |
|||
msg156500 - (view) | Author: Roundup Robot (python-dev) | Date: 2012-03-21 17:27 | |
New changeset 7355550d5357 by Stefan Krah in branch 'default': Issue #7652: Integrate the decimal floating point libmpdec library to speed http://hg.python.org/cpython/rev/7355550d5357 |
|||
msg156671 - (view) | Author: Roundup Robot (python-dev) | Date: 2012-03-23 18:07 | |
New changeset f6d646e30028 by Stefan Krah in branch 'default': Issue #7652: Enable linking of _decimal.so against an installed libmpdec. http://hg.python.org/cpython/rev/f6d646e30028 |
|||
msg156890 - (view) | Author: STINNER Victor (vstinner) * | Date: 2012-03-27 01:09 | |
You may now close the issue. |
|||
msg156952 - (view) | Author: Stefan Krah (skrah) * | Date: 2012-03-27 18:57 | |
I have a couple of questions about the proposed capsule C API. In order to be useful, it should be possible to set temporary contexts and pass them to functions. For example, PyDec_Add could look like: PyDec_Add(PyObject *a, PyObject *b, PyObject *context); To stay sane, either access macros or access functions to the context would need to be provided: CTX(workcontext)->prec = 100; PyDec_SetPrec(workcontext, 100); PyDecContext_SetPrec(workcontext, 100); Probably flags would need to be checked at some point: if (CTX(workcontext)->status & MPD_Underflow) if (PyDec_GetStatus(workcontext) & MPD_Underflow) if (PyDecContext_GetStatus(workcontext) & MPD_Underflow) I wonder if users would not be better off if libmpdec functions and access macros were exposed instead. The big advantage is that errors (even malloc errors) propagate in the form of NaNs, so it is not necessary to do repeated error checking. Also, I find things like (status & MPD_Underflow) more readable than the alternatives above. I've attached a short (fake) example to demonstrate what I mean. |
|||
msg164470 - (view) | Author: Stefan Krah (skrah) * | Date: 2012-07-01 10:06 | |
I've opened #15237 for the capsule API. I didn't add everyone to the nosy list, since I suspect it's not of general interest. |
|||
msg165329 - (view) | Author: Roundup Robot (python-dev) | Date: 2012-07-12 19:18 | |
New changeset afdb0e1a9dac by Stefan Krah in branch 'default': Issue #7652: Clean up _mpd_qinvroot() and mark it LIBMPDEC_ONLY. Use the http://hg.python.org/cpython/rev/afdb0e1a9dac |
|||
msg165330 - (view) | Author: Stefan Krah (skrah) * | Date: 2012-07-12 19:40 | |
I switched the algorithm in mpd_qsqrt() to the one from decimal.py. Previously the square root was calculated in terms of 1/invsqrt(x). Curiously enough this scheme _always_ seems to produce exact results when expected, but I don't have a proof. I remember I left this in because the specification gives some leeway with respect to exact results: "Square-root can also be calculated by using the power operation (with a second operand of 0.5). The result in that case will not be exact in most cases, and may not be correctly rounded." Anyway, the algorithm from decimal.py gives the desired guarantees and is also faster. Since we're almost in beta-2 stage, would someone be able to do a post commit review of mpd_qsqrt()? It should be a direct translation from the function in decimal.py. |
|||
msg165339 - (view) | Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * | Date: 2012-07-12 22:17 | |
I compared both implementations, and they are the same. I noticed that on line 7537, the call to mpd_qshiftl() may "goto malloc_error;". I think there is a memory leak in this case, "mpd_del(&c)" and 2 others lines are skipped. |
|||
msg165341 - (view) | Author: Stefan Krah (skrah) * | Date: 2012-07-12 22:40 | |
Thanks, Amaury! -- "goto malloc_error" should not leak, because there's always a jump to the "out" label in line 7563. I use this idiom a lot ever since I saw it in several places in the Linux kernel. Of course it's a matter of taste. |
|||
msg168034 - (view) | Author: STINNER Victor (vstinner) * | Date: 2012-08-12 12:21 | |
Is there some remaining work on this issue? Or can it be closed? |
|||
msg168035 - (view) | Author: Stefan Krah (skrah) * | Date: 2012-08-12 12:52 | |
I'm almost done with my (second) self-review of mpdecimal.c. The only functions missing are all Karatsuba functions and mpd_qpowmod(). If there are any takers, I would be very happy. For the Karatsuba functions you'll probably need Roman Maeder's paper, which I could send to anyone interested. All I can promise is that it's a beautiful formulation of the algorithm. :) |
|||
msg169692 - (view) | Author: Stefan Krah (skrah) * | Date: 2012-09-02 12:17 | |
My review is done. The Karatsuba function is basically a small stack machine and very hard to prove formally as far as I can see. The algorithm is cited in TAOCP and the subdivision is brute force tested for all combinations of coefficient lengths of the two input operands that are used in libmpdec (currently 256 < nwords <= 1024). |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:56:56 | admin | set | github: 51901 |
2012-09-02 12:18:25 | skrah | set | status: open -> closed |
2012-09-02 12:17:23 | skrah | set | assignee: skrah resolution: fixed messages: + msg169692 stage: patch review -> resolved |
2012-08-12 12:52:28 | skrah | set | messages: + msg168035 |
2012-08-12 12:21:45 | vstinner | set | messages: + msg168034 |
2012-07-12 22:40:17 | skrah | set | messages: + msg165341 |
2012-07-12 22:17:11 | amaury.forgeotdarc | set | messages: + msg165339 |
2012-07-12 19:40:11 | skrah | set | messages: + msg165330 |
2012-07-12 19:18:54 | python-dev | set | messages: + msg165329 |
2012-07-01 10:06:19 | skrah | set | messages: + msg164470 |
2012-03-27 18:57:32 | skrah | set | files:
+ api-demo.c messages: + msg156952 |
2012-03-27 01:09:53 | vstinner | set | messages: + msg156890 |
2012-03-23 18:07:28 | python-dev | set | messages: + msg156671 |
2012-03-21 17:27:56 | python-dev | set | nosy:
+ python-dev messages: + msg156500 |
2012-03-21 11:48:28 | asvetlov | set | nosy:
+ asvetlov |
2012-03-21 08:03:30 | skrah | set | messages: + msg156480 |
2012-03-20 12:49:29 | skrah | set | messages: + msg156402 |
2012-03-14 11:38:23 | skrah | set | messages: + msg155744 |
2012-03-14 11:19:15 | skrah | set | messages: + msg155743 |
2012-03-13 19:34:15 | mark.dickinson | set | messages: + msg155649 |
2012-03-13 18:44:42 | benjamin.peterson | set | messages: + msg155644 |
2012-03-11 23:18:52 | rhettinger | set | messages: + msg155419 |
2012-03-11 11:05:20 | skrah | set | messages: + msg155381 |
2012-03-10 22:57:25 | Arfrever | set | nosy:
+ Arfrever messages: + msg155359 |
2012-03-10 18:01:11 | skrah | set | messages: + msg155331 |
2012-03-10 17:59:29 | skrah | set | files: + 9b3b1f5c4072.diff |
2012-03-08 00:06:14 | skrah | set | messages: + msg155135 |
2012-03-07 19:42:16 | Jim.Jewett | set | messages: + msg155107 |
2012-03-07 16:17:44 | skrah | set | messages:
+ msg155095 versions: + Python 3.3, - Python 3.2 |
2012-03-07 15:32:26 | skrah | set | messages: + msg155093 |
2012-03-07 13:54:22 | skrah | set | messages: + msg155088 |
2012-03-07 13:52:06 | pitrou | set | messages: + msg155087 |
2012-03-07 13:49:44 | skrah | set | messages: + msg155086 |
2012-03-07 13:22:59 | skrah | set | messages: + msg155085 |
2012-03-07 12:43:14 | pitrou | set | messages: + msg155083 |
2012-03-07 12:29:10 | skrah | set | messages: + msg155082 |
2012-03-07 12:18:46 | skrah | set | messages: + msg155079 |
2012-03-07 10:37:54 | lemburg | set | nosy:
+ lemburg messages: + msg155071 |
2012-03-07 10:28:57 | skrah | set | messages: + msg155070 |
2012-03-07 08:30:01 | casevh | set | messages: + msg155059 |
2012-03-07 06:40:52 | Ramchandra Apte | set | nosy:
+ Ramchandra Apte messages: + msg155050 |
2012-03-07 04:09:42 | benjamin.peterson | set | messages: + msg155047 |
2012-03-07 04:04:03 | Jim.Jewett | set | messages: + msg155046 |
2012-03-07 03:49:17 | benjamin.peterson | set | nosy:
+ benjamin.peterson messages: + msg155045 |
2012-03-06 20:14:52 | skrah | set | messages: + msg155036 |
2012-03-06 20:07:34 | skrah | set | messages: + msg155034 |
2012-03-06 18:36:04 | brian.curtin | set | nosy:
- brian.curtin |
2012-03-06 18:23:11 | Jim.Jewett | set | nosy:
+ Jim.Jewett messages: + msg155031 |
2012-03-05 23:47:19 | rhettinger | set | messages: + msg154992 |
2012-03-05 23:15:54 | vstinner | set | messages: + msg154988 |
2012-02-23 17:33:52 | skrah | set | messages: + msg154074 |
2012-02-23 17:32:29 | skrah | set | files: + 40917e4b51aa.diff |
2012-02-16 22:28:10 | skrah | set | messages: + msg153507 |
2012-02-16 22:25:53 | skrah | set | messages: + msg153506 |
2012-02-15 23:14:23 | vstinner | set | messages: + msg153447 |
2011-12-16 08:51:37 | skrah | set | messages: + msg149600 |
2011-12-15 17:33:09 | Amaury.Forgeot.d'Arc | set | nosy:
+ Amaury.Forgeot.d'Arc messages: + msg149567 |
2011-12-15 14:12:24 | pitrou | set | messages: + msg149559 |
2011-12-15 14:03:28 | skrah | set | messages: + msg149557 |
2011-12-15 14:00:51 | skrah | set | files:
+ ppro-mulmod.txt messages: + msg149556 |
2011-12-14 21:13:56 | skrah | set | files: + be8a59fcba49.diff |
2011-12-01 13:38:13 | skrah | set | messages: + msg148721 |
2011-11-30 21:28:32 | skrah | set | messages: + msg148691 |
2011-11-30 21:26:39 | skrah | set | files: + bba956250186.diff |
2011-11-30 21:15:44 | skrah | set | messages: + msg148690 |
2011-11-30 21:13:43 | vstinner | set | messages: + msg148689 |
2011-11-30 21:12:36 | skrah | set | messages: + msg148688 |
2011-11-30 21:10:27 | skrah | set | messages: + msg148687 |
2011-11-30 20:18:16 | amaury.forgeotdarc | set | nosy:
+ amaury.forgeotdarc messages: + msg148682 |
2011-11-30 19:10:01 | rhettinger | set | messages: + msg148680 |
2011-11-30 17:27:11 | pitrou | set | nosy:
+ pitrou messages: + msg148678 |
2011-11-30 17:17:36 | skrah | set | messages: + msg148677 |
2011-11-30 16:05:23 | skrah | set | messages: + msg148669 |
2011-11-30 13:03:29 | mark.dickinson | set | messages: + msg148652 |
2011-11-30 12:40:43 | vstinner | set | messages: + msg148650 |
2011-11-24 23:56:08 | ced | set | nosy:
+ ced |
2011-09-29 19:30:06 | eric.snow | set | nosy:
+ eric.snow |
2011-06-19 22:40:24 | vstinner | set | nosy:
+ vstinner |
2011-06-18 14:20:40 | skrah | set | messages: + msg138583 |
2011-06-18 13:36:17 | skrah | set | files: + 49433f35a5f8.diff |
2011-06-18 13:33:58 | skrah | set | files: - 9a10e3232445.diff |
2011-06-09 22:56:55 | skrah | set | stage: patch review |
2011-06-09 22:34:06 | skrah | set | messages: + msg138029 |
2011-06-09 22:23:30 | skrah | set | files: + 9a10e3232445.diff |
2011-06-09 21:51:39 | skrah | set | files: - ad05c2fdb3b2.diff |
2011-06-09 21:48:24 | skrah | set | files: + ad05c2fdb3b2.diff |
2011-06-09 21:45:27 | skrah | set | hgrepos: + hgrepo25 |
2011-06-09 21:22:19 | skrah | set | files: - trunk_help_unify.diff |
2011-06-09 21:22:15 | skrah | set | files: - py3k_help_unify.diff |
2010-11-18 12:08:26 | skrah | set | messages: + msg121452 |
2010-11-07 12:30:58 | mark.dickinson | set | messages: + msg120674 |
2010-11-03 03:28:06 | casevh | set | nosy:
+ casevh messages: + msg120305 |
2010-01-22 19:57:40 | skrah | set | messages: + msg98161 |
2010-01-13 14:17:29 | skrah | set | files: + trunk_help_unify.diff |
2010-01-13 14:17:02 | skrah | set | files:
+ py3k_help_unify.diff keywords: + patch messages: + msg97719 |
2010-01-08 11:49:58 | skrah | set | messages: + msg97412 |
2010-01-08 11:46:14 | skrah | set | messages: + msg97411 |
2010-01-07 22:06:06 | skrah | set | messages: + msg97382 |
2010-01-07 21:35:45 | brian.curtin | set | nosy:
+ brian.curtin messages: + msg97377 |
2010-01-07 20:58:48 | mark.dickinson | set | messages: + msg97373 |
2010-01-07 20:54:14 | mark.dickinson | set | messages: + msg97372 |
2010-01-07 15:58:21 | jjconti | set | nosy:
+ jjconti |
2010-01-07 13:41:29 | mark.dickinson | set | messages: + msg97355 |
2010-01-07 11:40:47 | eric.smith | set | nosy:
+ eric.smith messages: + msg97348 |
2010-01-07 10:42:50 | mark.dickinson | create |