classification
Title: Pymalloc patch for int and float objects
Type: behavior Stage:
Components: Interpreter Core Versions: Python 2.6
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: aimacintyre, christian.heimes (2)
Priority: normal Keywords patch

Created on 2008-02-07 17:19 by christian.heimes, last changed 2008-02-20 12:41 by aimacintyre.

Files
File name Uploaded Description Edit Remove
trunk_intfloat_freelist.patch christian.heimes, 2008-02-07 17:19
no-intfloat-freelist.patch aimacintyre, 2008-02-08 13:07 experimental patch removing freelists from ints and floats
freelist2.patch christian.heimes, 2008-02-11 05:56
pybench_summary.txt aimacintyre, 2008-02-20 12:09 summary of PyBench results for different approaches
Messages (6)
msg62158 - (view) Author: Christian Heimes (christian.heimes) Date: 2008-02-07 17:19
The patch removes the special allocation schema for ints and floats and
replaces it but a standard PyObject_MALLOC schema with a limited free_list.
msg62197 - (view) Author: Andrew I MacIntyre (aimacintyre) Date: 2008-02-08 13:07
As indicated in a python-dev posting, I'm adding my experimental grade
patches removing the freelists from ints and floats.

Subject to testing on other platforms (I've only tested on FreeBSD 6.1
and OS/2), I suggest that the float case should be seriously considered,
as there seems little advantage to the complexity of the freelist, with
better memory utilisation likely to flow from relying on PyMalloc on top
of being faster than the current freelist implementation (for reasons
unknown; the version in tiran's patch performs similar to the
no-freelist patch).

The int freelist is enough ahead in performance (although only 3-5%) to
justify ignoring the better memory utilisation of dropping the freelist.
msg62273 - (view) Author: Christian Heimes (christian.heimes) Date: 2008-02-11 05:56
The new patch adds a small free list with 80 elements each using a LIFO
implemented as an array of fixed size.
msg62589 - (view) Author: Andrew I MacIntyre (aimacintyre) Date: 2008-02-20 12:09
As noted in a posting to python-dev, I've re-evaluated my test methodology.
The results are as follows, with details of the PyBench runs in the 
pybench_summary.txt attachment:

----------------------------------------------------------------------
test        trunk                no-freelists         LIFO(500i,100f)
            case 1    case 2     case 1    case 2     case 1   case 2
----------------------------------------------------------------------
pystone     26500     26100      27000     25600      27000    26600
int 1       7.27us    9.09us     6.69us    20.4us     6.64us   9.25us
int 2       10.4us    9.48us     20.9us    20.9us     10.5us   9.69us
int 3       381us     360us      792us     813us      805us    780us
int 4       393us     373us      829us     834us      844us    799us
float 1     1.14ms    1.1ms      1.2ms     1.2ms      1.2ms    1.27ms
float 2     773us     831us      1.05ms    908us      865us    967us
float 3     733us     759us      970us     825us      804us    906us
float 4     74.6us    76.9us     100us     83.7us     77.6us   86.9us
float 5     7.88ms    8.09ms     10.7ms    8.93ms     8.46ms   9.43ms
pybench     16716ms   16666ms    16674ms   16612ms    16612ms  16611ms
script a    30.7s     30.6s      33.0s     33.0s      32.3s    32.6s
script b    41.7s     40.6s      42.1s     39.4s      40.5s    41.8s
----------------------------------------------------------------------
case: 1=std, 2=no small ints


test details
============

pystone:
 average of 3 runs

int 1:
 ./python -m timeit -s "range(1000)" "range(250)"

int 2:
 ./python -m timeit -s "range(1000)" "range(257,507)"

int 3:
 ./python -m timeit -s "range(10000)" "range(10000)"

int 4:
 ./python -m timeit -s "range(11000)" "range(257,10507)"

float 1:
 ./python -m timeit -s "[float(x) for x in range(1000)]" \
 "[float(x) for x in range(1000)]"

float 2:
 ./python -m timeit -s "map(float, range(1000))" "map(float, range(1000))"

float 3:
 ./python -m timeit -s "t = range(1000)" "map(float, t)"

float 4:
 ./python -m timeit -s "t = range(100)" "map(float, t)"

float 5:
 ./python -m timeit -s "t = range(10000)" "map(float, t)"

pybench:
 average runtime per round of ./python Tools/pybench/pybench.py -f <logfile>

script a:
<code>
import time

def b(time_now=time.clock):
    limit_val = 2000000
    d = [None] * limit_val
    start_time = time_now()
    for j in xrange(25):
        for i in xrange(limit_val):
            d[i] = i
        for i in d:
            d[i] = None
    return time_now() - start_time

if __name__ == '__main__':
    print 'elapsed: %s s' % b()
</code>

script b:
<code>
import time

def b(time_now=time.clock):
    limit_val = 1000000
    f = [None] * limit_val
    d = range(limit_val)
    start_time = time_now()
    for j in xrange(25):
        for i in d:
            f[i] = float(i)
        for i in d:
            f[i] = None
    return time_now() - start_time

if __name__ == '__main__':
    print 'elapsed: %s s' % b()
</code>
msg62590 - (view) Author: Andrew I MacIntyre (aimacintyre) Date: 2008-02-20 12:23
My conclusions from the testing I've just reported:
- there are some contradictory results which make little (obvious)
sense, but the testing has been repeated a number of times and nearly
all tests repeat to with 1%;
- leave the int freelist as is, but move the compaction into
gc.collect() as suggested by tiran in a python-dev posting;
- keep the small int cache (it may profitably be increased to cover 
a wider range of ints, perhaps -256..1024?? - more testing required);
- the float freelist and float LIFO, while being attractive in
micro-benchmarks, are not useful enough to keep in large scale usage. 
This is especially the case when you consider that floats are much less
prevalent than ints in a wide range of Python programs.  Serious float
users gravitate to Numpy and other extensions in most cases, and the
simpler memory profile has its own attractions.
msg62592 - (view) Author: Andrew I MacIntyre (aimacintyre) Date: 2008-02-20 12:41
I've realised I could have included tests for a build with the int
freelist but without the float freelist, to justify my conclusions.  The
short version: the script tests are almost identical to the baseline
result & most of the other results are between the no-freelist results
and the freelist/LIFO results.
History
Date User Action Args
2008-02-20 12:41:20aimacintyresetmessages: + msg62592
2008-02-20 12:23:14aimacintyresetmessages: + msg62590
2008-02-20 12:09:57aimacintyresetfiles: + pybench_summary.txt
messages: + msg62589
2008-02-11 05:56:47christian.heimessetfiles: + freelist2.patch
messages: + msg62273
2008-02-08 13:07:54aimacintyresetfiles: + no-intfloat-freelist.patch
nosy: + aimacintyre
messages: + msg62197
2008-02-07 17:19:19christian.heimescreate