Issue2039
Created on 2008-02-07 17:19 by christian.heimes, last changed 2008-02-20 12:41 by aimacintyre.
|
msg62158 |
Author: Christian Heimes (christian.heimes) |
Date: 2008-02-07 17:19 |
|
The patch removes the special allocation scheme for ints and floats and
replaces it with a standard PyObject_MALLOC scheme with a limited free_list.
|
|
msg62197 |
Author: Andrew I MacIntyre (aimacintyre) |
Date: 2008-02-08 13:07 |
|
As indicated in a python-dev posting, I'm adding my experimental-grade
patches removing the freelists from ints and floats.
Subject to testing on other platforms (I've only tested on FreeBSD 6.1
and OS/2), I suggest that the float case should be seriously considered:
there seems little advantage to the complexity of the freelist, better
memory utilisation is likely to flow from relying on PyMalloc, and the
no-freelist build is faster than the current freelist implementation (for
reasons unknown; the version in tiran's patch performs similarly to the
no-freelist patch).
The int freelist is enough ahead in performance (although only 3-5%) to
justify ignoring the better memory utilisation of dropping the freelist.
|
|
msg62273 |
Author: Christian Heimes (christian.heimes) |
Date: 2008-02-11 05:56 |
|
The new patch adds a small free list with 80 elements each using a LIFO
implemented as an array of fixed size.
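For readers unfamiliar with the idea, a bounded LIFO free list over a fixed-size array can be sketched roughly as below. This is only an illustrative Python model, not the C code in the patch; the class name `FreeList` and its methods `acquire` and `release` are hypothetical, and the default capacity of 80 is taken from the description above.

```python
class FreeList:
    """Bounded LIFO free list backed by a fixed-size array (illustrative model)."""

    def __init__(self, capacity=80):
        self._slots = [None] * capacity  # fixed-size backing array
        self._count = 0                  # number of objects currently cached

    def acquire(self, factory):
        """Reuse the most recently released object, or allocate via factory()."""
        if self._count:
            self._count -= 1
            obj = self._slots[self._count]
            self._slots[self._count] = None  # drop the cached reference
            return obj
        return factory()

    def release(self, obj):
        """Cache obj for reuse; return False when full (caller frees it normally)."""
        if self._count < len(self._slots):
            self._slots[self._count] = obj
            self._count += 1
            return True
        return False
```

Because the list is bounded, releasing more objects than the capacity simply hands the surplus back to the normal allocator, which is what keeps memory use limited compared with an unbounded freelist.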
|
|
msg62589 |
Author: Andrew I MacIntyre (aimacintyre) |
Date: 2008-02-20 12:09 |
|
As noted in a posting to python-dev, I've re-evaluated my test methodology.
The results are as follows, with details of the PyBench runs in the
pybench_summary.txt attachment:
----------------------------------------------------------------------
test        trunk             no-freelists      LIFO(500i,100f)
            case 1   case 2   case 1   case 2   case 1   case 2
----------------------------------------------------------------------
pystone      26500    26100    27000    25600    27000    26600
int 1       7.27us   9.09us   6.69us   20.4us   6.64us   9.25us
int 2       10.4us   9.48us   20.9us   20.9us   10.5us   9.69us
int 3        381us    360us    792us    813us    805us    780us
int 4        393us    373us    829us    834us    844us    799us
float 1     1.14ms    1.1ms    1.2ms    1.2ms    1.2ms   1.27ms
float 2      773us    831us   1.05ms    908us    865us    967us
float 3      733us    759us    970us    825us    804us    906us
float 4     74.6us   76.9us    100us   83.7us   77.6us   86.9us
float 5     7.88ms   8.09ms   10.7ms   8.93ms   8.46ms   9.43ms
pybench    16716ms  16666ms  16674ms  16612ms  16612ms  16611ms
script a     30.7s    30.6s    33.0s    33.0s    32.3s    32.6s
script b     41.7s    40.6s    42.1s    39.4s    40.5s    41.8s
----------------------------------------------------------------------
case: 1=std, 2=no small ints
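To make the table easier to compare, the differences can be expressed as percentages relative to trunk. The snippet below is a small sketch that does this for the two script tests, using the "case 1" values copied from the table; the helper names `percent_change` and `relative_to_trunk` are made up for illustration.

```python
# "case 1" (std) timings in seconds, copied from the table above.
timings = {
    "script a": {"trunk": 30.7, "no-freelists": 33.0, "LIFO": 32.3},
    "script b": {"trunk": 41.7, "no-freelists": 42.1, "LIFO": 40.5},
}


def percent_change(baseline, value):
    """Percentage change of value relative to baseline (positive = slower)."""
    return (value - baseline) / baseline * 100.0


def relative_to_trunk(rows):
    """Map each test to the percentage change of every non-trunk build."""
    result = {}
    for test, cols in rows.items():
        base = cols["trunk"]
        result[test] = {
            name: round(percent_change(base, v), 1)
            for name, v in cols.items()
            if name != "trunk"
        }
    return result


if __name__ == "__main__":
    for test, changes in relative_to_trunk(timings).items():
        print(test, changes)
```

On these figures the no-freelists build is about 7.5% slower than trunk on script a and about 1% slower on script b.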
test details
============
pystone:
average of 3 runs
int 1:
./python -m timeit -s "range(1000)" "range(250)"
int 2:
./python -m timeit -s "range(1000)" "range(257,507)"
int 3:
./python -m timeit -s "range(10000)" "range(10000)"
int 4:
./python -m timeit -s "range(11000)" "range(257,10507)"
float 1:
./python -m timeit -s "[float(x) for x in range(1000)]" \
"[float(x) for x in range(1000)]"
float 2:
./python -m timeit -s "map(float, range(1000))" "map(float, range(1000))"
float 3:
./python -m timeit -s "t = range(1000)" "map(float, t)"
float 4:
./python -m timeit -s "t = range(100)" "map(float, t)"
float 5:
./python -m timeit -s "t = range(10000)" "map(float, t)"
pybench:
average runtime per round of ./python Tools/pybench/pybench.py -f <logfile>
script a:
<code>
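# Python 2 script (xrange, print statement, time.clock)
import time


def b(time_now=time.clock):
    limit_val = 2000000
    d = [None] * limit_val
    start_time = time_now()
    for j in xrange(25):
        # fill the list with int objects
        for i in xrange(limit_val):
            d[i] = i
        # d[i] == i at this point, so this clears every slot
        for i in d:
            d[i] = None
    return time_now() - start_time


if __name__ == '__main__':
    print 'elapsed: %s s' % b()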
</code>
script b:
<code>
# Python 2 script (xrange, print statement, time.clock)
import time


def b(time_now=time.clock):
    limit_val = 1000000
    f = [None] * limit_val
    d = range(limit_val)
    start_time = time_now()
    for j in xrange(25):
        # fill the list with float objects
        for i in d:
            f[i] = float(i)
        # release them again
        for i in d:
            f[i] = None
    return time_now() - start_time


if __name__ == '__main__':
    print 'elapsed: %s s' % b()
</code>
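The `python -m timeit` command lines above can also be driven from a script through the standard `timeit` module. The following is a rough sketch written for Python 3, so `range` and `map` are wrapped in `list()` to force the allocations that Python 2 performed eagerly; the helper name `time_statement` and its default parameters are assumptions, not part of the original test setup.

```python
import timeit


def time_statement(stmt, setup="pass", repeat=3, number=1000):
    """Return the best per-loop time in microseconds over `repeat` runs."""
    timer = timeit.Timer(stmt, setup=setup)
    best = min(timer.repeat(repeat=repeat, number=number))
    return best / number * 1e6


if __name__ == "__main__":
    # Rough analogue of the "int 1" test.
    print("int 1:   %.2f us" % time_statement("list(range(250))"))
    # Rough analogue of the "float 3" test.
    print("float 3: %.2f us" % time_statement(
        "list(map(float, t))", setup="t = list(range(1000))"))
```

Taking the minimum over several repeats mirrors what the `timeit` command line reports ("best of N"), which reduces noise from other processes.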
|
|
msg62590 |
Author: Andrew I MacIntyre (aimacintyre) |
Date: 2008-02-20 12:23 |
|
My conclusions from the testing I've just reported:
- there are some contradictory results which make little (obvious)
sense, but the testing has been repeated a number of times and nearly
all tests repeat to within 1%;
- leave the int freelist as is, but move the compaction into
gc.collect() as suggested by tiran in a python-dev posting;
- keep the small int cache (it may profitably be increased to cover
a wider range of ints, perhaps -256..1024?? - more testing required);
- the float freelist and float LIFO, while being attractive in
micro-benchmarks, are not useful enough to keep in large scale usage.
This is especially the case when you consider that floats are much less
prevalent than ints in a wide range of Python programs. Serious float
users gravitate to Numpy and other extensions in most cases, and the
simpler memory profile has its own attractions.
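As a rough illustration of what the small int cache buys, the sketch below models it in Python: values inside a preallocated range share one object, and everything else is allocated on demand. The names `IntBox` and `SmallIntCache` are hypothetical, and the inclusive bounds -256..1024 are only the tentative range floated above, not the interpreter's actual setting.

```python
class IntBox:
    """Stand-in for a heap-allocated integer object (hypothetical)."""
    __slots__ = ("value",)

    def __init__(self, value):
        self.value = value


class SmallIntCache:
    """Preallocates one shared object per integer in [low, high]."""

    def __init__(self, low=-256, high=1024):
        self.low, self.high = low, high
        self._table = [IntBox(v) for v in range(low, high + 1)]
        self.allocations = 0  # objects created outside the cached range

    def get(self, value):
        if self.low <= value <= self.high:
            return self._table[value - self.low]  # shared, no allocation
        self.allocations += 1
        return IntBox(value)


if __name__ == "__main__":
    cache = SmallIntCache()
    print(cache.get(1000) is cache.get(1000))  # same cached object
    print(cache.get(2000) is cache.get(2000))  # two fresh allocations
```

Widening the range trades a fixed amount of startup memory for fewer allocations of commonly used values, which is why the suggestion above calls for more testing before choosing the bounds.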
|
|
msg62592 |
Author: Andrew I MacIntyre (aimacintyre) |
Date: 2008-02-20 12:41 |
|
I've realised I could have included tests for a build with the int
freelist but without the float freelist, to justify my conclusions. The
short version: the script tests are almost identical to the baseline
result & most of the other results are between the no-freelist results
and the freelist/LIFO results.
|
|
| Date | User | Action | Args |
| 2008-02-20 12:41:20 | aimacintyre | set | messages: + msg62592 |
| 2008-02-20 12:23:14 | aimacintyre | set | messages: + msg62590 |
| 2008-02-20 12:09:57 | aimacintyre | set | files: + pybench_summary.txt; messages: + msg62589 |
| 2008-02-11 05:56:47 | christian.heimes | set | files: + freelist2.patch; messages: + msg62273 |
| 2008-02-08 13:07:54 | aimacintyre | set | files: + no-intfloat-freelist.patch; nosy: + aimacintyre; messages: + msg62197 |
| 2008-02-07 17:19:19 | christian.heimes | create | |