Message 75522 - Python tracker

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author	vstinner
Recipients	christian.heimes, gregory.p.smith, mark.dickinson, vstinner
Date	2008-11-05.11:24:26
SpamBayes Score	1.110223e-16
Marked as misclassified	No
Message-id	<200811051223.55454.victor.stinner@haypocalc.com>
In-reply-to	<1225882491.92.0.261829844236.issue4258@psf.upfronthosting.co.za>

Content
> > And why 30 bits and not 31 bits, or 63 bits, or 120 bits? > > Mostly laziness (...) It was an argument for changing the base used by the mashal :-) > 31 bits would involve rewriting the powering algorithm, which assumes that > PyLong_SHIFT is divisible by 5 Powering is an simple algorithm. If it was the division, it would be much harder :-) Python stores the sign of the number in the first digit. Because of that, we are limited to 15/30 bits. Storing the sign in the size (which size? no idea yet) would allows to use a bigger base (31 bits? 63 bits?). > One problem is again the mismatch between C and assembler: detecting > overflow when adding two 32-bit unsigned integers is trivial in x86 > assembler, but it's not so obvious how to write portable C code that has a > decent chance of being compiled optimally on a wide variety of compilers. I wrote an example to detect overflows in C on the mailing list. Copy of my email: ------------------------------- 8< ---------------------- About 31, 32, 63 or 64 bits: I guess that you want to avoid integer overflow. Intel has an "overflow" flag, changed for all instructions. For other CPUs, you can use emulate this flag. Example with the type int that I used in my GMP patch: Add: int a, b, sum; sum = a + b; exact = ((a < 0) ^ (b < 0)) \|\| ((a < 0) == (sum < 0)); Substract: int a, b, diff; diff = a + b; exact = ((a < 0) == (b < 0)) \|\| ((a < 0) == (diff < 0)); Multiply: int a, b, product; if (a == 0 \|\| b == 0) { product = 0; /* exact / } else if (a != INT_MIN \|\| (b == 1)) { product = a b; exact = (product / b) == a); } else { /* INT_MIN * -1 = -INT_MIN: overflow / } Divide: int a, b, q; if (a != INT_MIN \|\| b != -1) { q = a / b; / exact / } else { / INT_MIN / -1 = -INT_MIN: overflow */ } Checking overflow may costs more than using a smaller base. Only a benchmark can answer ;-) ------------------------------- 8< ---------------------- > I guess my feeling is simply that the 15-bit to 30-bit change seems > incredibly easy and cheap: very little code change, and hence low risk of > accidental breakage. Python has an amazing regression test suite! I used it to fix my GMP patch. We can experiment new bases using this suite. Anyway, i love the idea of using 30 bits instead of 15! Most computer are now 32 or 64 bits! But it's safe to keep the 15 bits version to support older computers or buggy compilers. I started to work with GIT. You may be interrested to work together on GIT. It's much easier to exchanges changeset and play with branches. I will try to publish my GIT tree somewhere.

> > And why 30 bits and not 31 bits, or 63 bits, or 120 bits?
>
> Mostly laziness (...)

It was an argument for changing the base used by the mashal :-)

> 31 bits would involve rewriting the powering algorithm, which assumes that
> PyLong_SHIFT is divisible by 5

Powering is an simple algorithm. If it was the division, it would be much 
harder :-) Python stores the sign of the number in the first digit. Because 
of that, we are limited to 15/30 bits. Storing the sign in the size (which 
size? no idea yet) would allows to use a bigger base (31 bits? 63 bits?).

> One problem is again the mismatch between C and assembler:  detecting
> overflow when adding two 32-bit unsigned integers is trivial in x86
> assembler, but it's not so obvious how to write portable C code that has a
> decent chance of being compiled optimally on a wide variety of compilers.

I wrote an example to detect overflows in C on the mailing list. Copy of my 
email:
------------------------------- 8< ----------------------
About 31, 32, 63 or 64 bits: I guess that you want to avoid integer overflow. 
Intel has an "overflow" flag, changed for all instructions. For other CPUs, 
you can use emulate this flag. Example with the type int that I used in my 
GMP patch:

Add:
  int a, b, sum;
  sum = a + b;
  exact = ((a < 0) ^ (b < 0)) || ((a < 0) == (sum < 0));

Substract:
  int a, b, diff;
  diff = a + b;
  exact = ((a < 0) == (b < 0)) || ((a < 0) == (diff < 0));

Multiply:
  int a, b, product;
  if (a == 0 || b == 0) {
     product = 0;  /* exact */
  } else if (a != INT_MIN || (b == 1)) {
     product = a * b;
     exact = (product / b) == a);
  } else {
     /* INT_MIN * -1 = -INT_MIN: overflow */
  }

Divide:
  int a, b, q;
  if (a != INT_MIN || b != -1) {
     q = a / b;   /* exact */
  } else {
     /* INT_MIN / -1 = -INT_MIN: overflow */
  }

Checking overflow may costs more than using a smaller base. Only a benchmark 
can answer ;-)
------------------------------- 8< ----------------------

> I guess my feeling is simply that the 15-bit to 30-bit change seems
> incredibly easy and cheap: very little code change, and hence low risk of
> accidental breakage.

Python has an amazing regression test suite! I used it to fix my GMP patch. We 
can experiment new bases using this suite.

Anyway, i love the idea of using 30 bits instead of 15! Most computer are now 
32 or 64 bits! But it's safe to keep the 15 bits version to support older 
computers or buggy compilers.

I started to work with GIT. You may be interrested to work together on GIT. 
It's much easier to exchanges changeset and play with branches. I will try to 
publish my GIT tree somewhere.

History
Date	User	Action	Args
2008-11-05 11:27:38	vstinner	set	recipients: + vstinner, gregory.p.smith, mark.dickinson, christian.heimes
2008-11-05 11:24:27	vstinner	link	issue4258 messages
2008-11-05 11:24:26	vstinner	create