classification
Title: Macros for PyLong: sign, number of digits, fits in an int
Type: enhancement Stage:
Components: Interpreter Core Versions: Python 3.1, Python 2.7
process
Status: closed Resolution: rejected
Dependencies: Superseder:
Assigned To: mark.dickinson Nosy List: christian.heimes, gregory.p.smith, haypo, mark.dickinson, pernici
Priority: normal Keywords: patch

Created on 2008-11-10 13:12 by haypo, last changed 2010-01-09 18:57 by mark.dickinson. This issue is now closed.

Files
File name Uploaded Description Edit
pylong_macros-3.patch haypo, 2009-03-26 23:57
Messages (23)
msg75694 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2008-11-10 13:12
It's hard to read Objects/longobject.c because the code depends to 
much on the implementation. I wrote macros to simplify the code:
 - PyLong_SIGN(x)
 - PyLong_EQUALS_ZERO(x)
 - PyLong_FITS_INT(x)
 - PyLong_GET_INT(x)
 - PyLong_NDIGITS(x)

Macros are declared in Include/longintrepr.h to be able to use them 
outside longobject.c. Eg. I used it in Modules/mathmodule.c.

The patch also contains extra changes:
 - create sdigit type (used by PyLong_GET_INT(x))
 - use memcpy() in _PyLong_Copy()
 - use stwodigits in long_mul()
 - replace many Py_SIZE() hacks by my macros (eg. see long_hash)

Using my patch, it will be easier to change long integer 
implementation, like:
 - Use 30-bit digits instead of 15-bit digits (issue #4258)
 - Use GMP for PyLong (issue #1814)

Example of changes:
	i = Py_SIZE(v);
	x = 0;
	if (i < 0) {
		PyErr_SetString(PyExc_OverflowError,
			   "can't convert negative value to unsigned 
int");
		return (unsigned long) -1;
	}
	switch (i) {
	case 0: return 0;
	case 1: return v->ob_digit[0];
	}
	while (--i >= 0) {
        ...
is replaced by:
	if (PyLong_SIGN(v) < 0) {
		PyErr_SetString(PyExc_OverflowError,
			   "can't convert negative value to unsigned 
int");
		return (unsigned long) -1;
	}
	if (PyLong_FITS_INT(v))
		return (unsigned long) PyLong_GET_INT(v);
	x = 0;
	i = PyLong_NDIGITS(v);
	while (--i >= 0) {
        ...
msg75696 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2008-11-10 13:37
Interesting.

Incidentally, I'm already using sdigit for the signed version of digit---
this seems to fit with the current digit/twodigits/stwodigits typedefs.  

What's the purpose of your sdigit?  Do you really want it to be type 
'int'?
msg75697 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2008-11-10 13:48
Another thought:  macros that are going to be used elsewhere in Python 
(like the way you're using PyLong_SIGN in mathmodule.c) would probably
be better off in longobject.h.   The fewer places there are that have to 
include longintrepr.h, the easier it is to mess with the internal 
representation.

It's quite tempting to 'fix' _PyLong_AsScaledDouble to return e as the 
number of bits, rather than the number of digits;  then mathmodule.c 
wouldn't have to include longintrepr.h at all.

(And if marshal.c were also changed, to read and write integers as byte 
strings, we wouldn't need longintrepr.h anywhere any more!)
msg75698 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2008-11-10 13:50
Second patch to optimize some PyLong operations:
 - Write special code for small (a, b) of long_true_div(), 
long_bitwise(), and l_divmod() (used by long_div(), long_mod() and 
long_divmod(), and long_pow())
 - PyLong_FromLong(): don't go to the while() for small *negative* 
integers

TODO: Write special code for small integers of long_rshift() and 
long_lshift()
msg75699 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2008-11-10 14:00
> What's the purpose of your sdigit?

The type sdigit should be able to store a signed digit, so a number in 
[-(2^15-1); 2^15-1]. short may be enough, but I think that the CPU 
prefers (CPU is faster with) an int because it doesn't have to 
truncate the MSB (eg. 32 bits (u)int => 16 bits (u)short). I also 
used "int" because Python was already using "int" (right?)

> would probably be better off in longobject.h

Right.

> It's quite tempting to 'fix' _PyLong_AsScaledDouble to 
> return e as the number of bits, rather than the number of digits

Good idea. But instead of writing a patch of 100.000 lines, I prefer 
to start with a small patch and then add new patches to improve it. It 
allows to apply only some patches but not all. Is 
_PyLong_AsScaledDouble() used by another module than mathmodule.c?

> marshal.c

Same: it's better to write a separated patch.

--

I opened a new issue because the GMP and 30-bits patchs are too big 
and it's not easy to review/commit them. The idea is to do small 
changes to allow bigger changes later.
msg75701 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2008-11-10 14:34
> Same: it's better to write a separated patch.

Agreed.

Unfortunately, I think all of this is going to have to wait for Python 
3.0 to be released before we can consider committing anything.  The 
first step is to commit the pure bugfixes (missing casts, etc.).  I 
suppose we *could* do this for 2.6.1/2.7 now, but it seems safer to make 
all the changes in the 2.x and 3.x branches simultaneously.

By the way, I think we can also get rid of wdigit---it can be safely 
replaced by digit everywhere, as far as I can tell.
msg75702 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2008-11-10 15:47
@marketdickinson: As you proposed, here is a patch for 
_PyLong_AsScaledDouble(): stores directly the number of bits in an 
unsigned int instead of the number of digits.
msg75703 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2008-11-10 16:31
I like the idea, Victor
msg75704 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2008-11-10 18:55
Other optimization for long_compare(), long_lshift() and 
long_rshift().

Note: with all my patches, Python is a little bit faster (but not 
slower, which is already a good thing ;-)).
msg75710 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2008-11-10 22:35
More benchmark with py3k trunk on a Quad Core@2.5 GHz (64 bits):

./python Lib/test/pystone.py 250000 (maximum value on 5 runs):
 - original: 58685.4 pystones/second
 - patched: 61274.5 pystones/second

Don't trust pystones, results are variables :-/
msg75713 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2008-11-10 23:40
There are so many things going on here it's getting difficult to keep 
track of them all!  Not only Victor's stuff, but also Mario's patches 
for faster multiplication, and for subquadratic division.  And it might 
be interesting to experiment with subquadratic int <-> str conversion as 
well.

I'm wondering whether it would be worth starting a new 
"py3k_long_optimization" branch, so that we can get some of the obvious 
stuff in and start experimenting properly with the rest.

Christian, you're the expert on this:  any thoughts?  Is it possible to 
set up a new svn branch so that it's easy to merge from the py3k branch, 
or is svn merging only really possible from the trunk?  I'm happy to set 
things up and take care of doing regular merges, if you can give me some 
pointers.

(If not, maybe it's time for me to learn how to use git/hg/bzr.  I've 
used darcs quite a bit before, so the concepts aren't totally alien to 
me, but I haven't found time to learn other systems yet.)
msg75714 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2008-11-11 00:41
It's probably easier to wait a couple of days until 3.0.0 is out. I
don't know how the new branches will look like. 

By the way your work fits nicely in my plans to work on optimizations
for 3.1. Go, Victor, go!
msg75715 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2008-11-11 01:38
I'm unable to use pystone nor pybench to compare all integers patches. 
So I wrote my own tool: bench_int.py. Don't use to compare different 
computers or Python versions, it's just useful to test if a patch is 
faster or slower.

Example (still the 64 bits CPU@2.5 GHz):
-------------------------------
original     : 896.5 ms
 + macros    : 891.0 ms (+0.6%)
 ++ optimize : 884.3 ms (+1.4%)
 +++ shift   : 880.8 ms (+1.7%)
GMP          : 700.9 ms (+22%)
30 bits      : 659.9 ms (+26%)
-------------------------------

Result: my optimizations are useless, whereas mark's patch (#4258) is 
26% faster! My GMP patch is only 22% faster (and so slower than the 30 
bits patch). The GMP hack would only be useful for huge value whereas 
my benchmark tool use mostly small values (the biggest is near 
2**200).

I use it with "sync && ./python -OO bench_int.py", run the command 2 
or 3 times, and keep the smallest value.
msg75719 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2008-11-11 02:16
More numbers!

------ 64 bits CPU -------
original:  885.0 ms
fast long: 746.8 ms (+16%) -- issue #3944 
--------------------------

+16% only with an optimized multiplicaton, great job Pernici :-)

------ 32 bits CPU -------
original: 1564.3 ms
30 bits: 1497.3 ms (+4%)
fast long: 1591.7 ms (-2%)
GMP: 1564.4 ms (=)
--------------------------

It looks like the operation 32 bits x 32 bits is slower on a 32 bits 
CPU, the gain is small (only +4%). fast long and a little bit slower, 
and GMP patch doesn't change anything.
msg75752 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2008-11-11 17:37
I agree with Christian that we should wait until 3.0 is out.  Using A
DVCS should make this much easier if anyone needs an excuse to learn bzr.

I think we should aim to commit the 30bit change (a pretty clear win by
the looks of things) after 3.0 is released and py3k branch is unfrozen
and continue work from there to determine what to do with the others.
msg77644 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2008-12-11 23:29
We have now a benchmark tool (attached file "bench_int.py"), many 
patches to try, and 3.0 is released. Anyone interrested to work on it?
msg83787 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2009-03-18 22:44
Updated version of my macros patch for PyLong type:
 - patch for Python trunk
 - define 3 macros: PyLong_SIGN(x), PyLong_EQUALS_ZERO(x), 
PyLong_NDIGITS(x)
 - just replace code by the equivalent macros

The goal is the make the code easier to read. It would also help if we 
change the PyLong implementation (eg. the 2^30 base patch).

My previous patch was for py3k, the new one is for Python trunk and is 
shorter. I only kept the most simple macros.
msg83967 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-03-22 10:42
A few comments:

I think PyLong_SIGN and PyLong_EQUALS_ZERO should go in 
Include/longobject.h, not Include/longintrepr.h:  these 2 macros have an 
unambiguous meaning for *any* representation of integers.  And 
longintrepr.h is really supposed to be private, for the use of 
longobject.c only.  PyLong_NDIGITS should stay in longintrepr.h, though, 
since it's dependent on the representation.

Perhaps rename PyLong_EQUALS_ZERO to PyLong_IS_ZERO?

Almost all your uses of PyLong_SIGN take the form PyLong_SIGN(X) < 0.  
Maybe it would be better to have a PyLong_IS_NEGATIVE macro instead?
msg84234 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2009-03-26 23:57
> I think PyLong_SIGN and PyLong_EQUALS_ZERO should go in
> Include/longobject.h, not Include/longintrepr.h

Yeah, it's better for 3rd party modules.

> PyLong_NDIGITS should stay in longintrepr.h, though,
> since it's dependent on the representation.

I don't understand why. PyLong_SIGN() and PyLong_EQUALS_ZERO() do also 
depend on the PyLong implementation. Eg. if we choose to store zero in 
a PyLong of 1 digit (digits={0}), PyLong_SIGN() have also to be 
changed. I think that PyLong_SIGN() can also be moved to longobject.h 
(not done in my patch v3).

> Perhaps rename PyLong_EQUALS_ZERO to PyLong_IS_ZERO?

Done.

> Almost all your uses of PyLong_SIGN take the form 
> PyLong_SIGN(X) < 0. Maybe it would be better to have a
> PyLong_IS_NEGATIVE macro instead?

Not "almost all", just "all" :-) So I also changed the macro name.
msg84235 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2009-03-27 00:01
I removed my "optimization" patches: there were useless (no or low 
speedup). I also removed bench_int.py: I moved it to my own public 
SVN:
http://haypo.hachoir.org/trac/browser/misc/bench_int.py
msg84237 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2009-03-27 00:24
I created a decicated issue for my last unrelated patch:
#5576: Don't use PyLong_SHIFT with _PyLong_AsScaledDouble()
msg84312 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-03-28 16:36
[Mark]
> PyLong_NDIGITS should stay in longintrepr.h, though,
> since it's dependent on the representation.

[Victor]
>I don't understand why. [...]

I expressed myself badly.  I guess my point was that PyLong_SIGN
and PyLong_EQUALS_ZERO (and PyLong_IS_NEGATIVE) have an obvious
meaning for integers themselves, regardless of the particular
implementation chosen.  But there could be representations of
integers for which PyLong_NDIGITS doesn't even make sense, or
is ambiguous.  Furthermore, if the implementation of long integers
changes then the meanings of PyLong_SIGN and PyLong_EQUALS_ZERO won't
change, but the meaning of PyLong_NDIGITS might well do.

So it seems to me that PyLong_NDIGITS is only really a useful
macro for *this particular* implementation of integers---it's
something that should be internal to Objects/longobject.c and
Include/longintrepr.h, and not exposed to the rest of Python.
msg97159 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-01-03 12:34
Closing this.  I like the PyLong_IS_NEGATIVE and PyLong_IS_ZERO macros, but I can't find anywhere outside longobject.c where they'd actually be useful (with the possible exception of marshal.c, but that currently depends on knowing lots about the long implementation anyway).  And I don't really see much benefit from doing a mass replacement of uses of 'Py_SIZE(a) < 0', etc. in Objects/longobject.c;  I think it's the code is readable enough as it stands.
History
Date User Action Args
2010-01-09 18:57:37mark.dickinsonsetstatus: pending -> closed
2010-01-03 12:34:37mark.dickinsonsetstatus: open -> pending
resolution: rejected
messages: + msg97159
2009-03-28 16:36:21mark.dickinsonsetmessages: + msg84312
2009-03-27 00:24:47hayposetfiles: - pylong_asscaleddouble.patch
2009-03-27 00:24:41hayposetmessages: + msg84237
2009-03-27 00:01:55hayposetmessages: + msg84235
2009-03-27 00:00:58hayposetfiles: - bench_int.py
2009-03-27 00:00:43hayposetfiles: - pylong_shift.patch
2009-03-27 00:00:26hayposetfiles: - pylong_optimize.patch
2009-03-26 23:58:53hayposetfiles: - pylong_macros-2.patch
2009-03-26 23:58:11hayposetfiles: - pylong_macros.patch
2009-03-26 23:57:57hayposetfiles: + pylong_macros-3.patch

messages: + msg84234
2009-03-22 10:44:16mark.dickinsonsetpriority: normal
assignee: mark.dickinson
type: enhancement
components: + Interpreter Core
2009-03-22 10:42:02mark.dickinsonsetmessages: + msg83967
2009-03-18 22:44:39hayposetfiles: + pylong_macros-2.patch

messages: + msg83787
2008-12-11 23:29:13hayposetmessages: + msg77644
2008-11-11 17:37:15gregory.p.smithsetmessages: + msg75752
2008-11-11 02:16:57hayposetnosy: + pernici
messages: + msg75719
2008-11-11 01:38:32hayposetfiles: + bench_int.py
messages: + msg75715
2008-11-11 00:41:04christian.heimessetmessages: + msg75714
2008-11-10 23:40:03mark.dickinsonsetmessages: + msg75713
2008-11-10 22:36:00hayposetmessages: + msg75710
2008-11-10 18:55:26hayposetfiles: + pylong_shift.patch
messages: + msg75704
2008-11-10 16:54:42loewissetversions: + Python 3.1, Python 2.7, - Python 2.6, Python 3.0
2008-11-10 16:31:40christian.heimessetnosy: + christian.heimes
messages: + msg75703
2008-11-10 15:47:17hayposetfiles: + pylong_asscaleddouble.patch
messages: + msg75702
2008-11-10 14:34:04mark.dickinsonsetmessages: + msg75701
2008-11-10 14:00:36hayposetmessages: + msg75699
2008-11-10 13:50:58hayposetfiles: + pylong_optimize.patch
messages: + msg75698
2008-11-10 13:48:47mark.dickinsonsetmessages: + msg75697
2008-11-10 13:37:32mark.dickinsonsetmessages: + msg75696
2008-11-10 13:13:06hayposetnosy: + gregory.p.smith, mark.dickinson
2008-11-10 13:12:32haypocreate