Message48706
See bug 849046 for history. This patch passes both the
regression test and the standard test. Hopefully the
extra information below won't be too difficult to read.
I can attach this info to the bug, if need be.
Fixed:
- Add self.min_readsize to __init__.
Follows the principle that lines in a file are likely to be of
similar length, so readline() no longer starts over at the
minimum read size on every call.
- Rewrite the assignments to readsize and size at the beginning
of the function. Eliminates almost all calls to min().
- Change bufs to a string instead of a list. No point in using a
list when all you do with it is "".join(bufs). Uses string
concatenation instead.
- Remove extra assignments to bufs (at the return statements).
- Change readline() to be much more readable (loop reordering,
more comments).
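The idea behind min_readsize can be sketched roughly as follows. This is not the patch itself: AdaptiveLineReader, pending, max_readsize and the starting size of 100 are hypothetical names and values chosen for illustration, and the sketch wraps any object with a read(n) method rather than a real GzipFile. It is written for Python 3, so it works on bytes.

```python
import io


class AdaptiveLineReader:
    """Hypothetical stand-in for a file object with an adaptive readline()."""

    def __init__(self, raw, min_readsize=100, max_readsize=512):
        self.raw = raw                    # any object with read(n) -> bytes
        self.pending = b""                # bytes read past the last newline
        self.min_readsize = min_readsize  # remembered between readline() calls
        self.max_readsize = max_readsize  # assumed cap so the size cannot grow forever

    def _read(self, n):
        # Serve pushed-back bytes first, then fall through to the raw stream.
        if self.pending:
            chunk, self.pending = self.pending[:n], self.pending[n:]
            return chunk
        return self.raw.read(n)

    def readline(self):
        readsize = self.min_readsize
        line = b""
        while True:
            chunk = self._read(readsize)
            i = chunk.find(b"\n")
            if i >= 0 or chunk == b"":
                # Newline found (or EOF): keep the line, push back the rest.
                line += chunk[:i + 1]
                self.pending = chunk[i + 1:] + self.pending
                break
            line += chunk
            readsize *= 2                 # line is longer than expected
        if readsize > self.min_readsize:
            # Remember the larger size so the next call starts from it.
            self.min_readsize = min(readsize, self.max_readsize)
        return line
```

With a 301-byte first line and a starting size of 100, the first call doubles the read size twice and the next call starts at the remembered size instead of 100.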
Recommendations:
- Delete the _unread() function. It is used _only_ by readline(),
and moving its functionality into readline() itself saves the
function call overhead. _unread() is only 3 lines long. Testing
shows that removing it speeds readline() up by about 3%.
Backwards compatibility concerns?
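One way to measure the call overhead in isolation is a small timeit comparison like the one below. It is only a sketch: the attribute names (extrabuf, extrasize, offset) and the push_back() wrapper are assumptions about what a three-line push-back helper might look like, not code taken from the patch, and the timings it produces will not match the 3% figure above.

```python
import timeit


class WithHelper:
    """Pushes data back through a separate _unread() method."""

    def __init__(self):
        self.extrabuf = b""
        self.extrasize = 0
        self.offset = 0

    def _unread(self, buf):
        self.extrabuf = buf + self.extrabuf
        self.extrasize += len(buf)
        self.offset -= len(buf)

    def push_back(self, buf):
        self._unread(buf)


class Inlined(WithHelper):
    """Same three statements, without the extra method call."""

    def push_back(self, buf):
        self.extrabuf = buf + self.extrabuf
        self.extrasize += len(buf)
        self.offset -= len(buf)


def time_push_back(cls, calls=200000):
    def run():
        obj = cls()
        for _ in range(calls):
            obj.extrabuf = b""   # keep the buffer small so copying does not dominate
            obj.push_back(b"tail")
    return timeit.timeit(run, number=1)


if __name__ == "__main__":
    print("helper :", time_push_back(WithHelper))
    print("inlined:", time_push_back(Inlined))
```

Both variants leave the object in the same state, so any difference in the timings comes from the extra call.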
Testing results:
test_append (__main__.TestGzip) ... ok
test_many_append (__main__.TestGzip) ... ok
test_mode (__main__.TestGzip) ... ok
test_read (__main__.TestGzip) ... ok
test_readline (__main__.TestGzip) ... ok
test_readlines (__main__.TestGzip) ... ok
test_seek_read (__main__.TestGzip) ... ok
test_seek_write (__main__.TestGzip) ... ok
test_write (__main__.TestGzip) ... ok
----------------------------------------------------------------------
Ran 9 tests in 0.331s
Regression tests:
python regrtest.py -g test_gzip.py
test_gzip
1 test OK.
---
Profiling Results (performed on a common compressed log
file - 200748 lines).
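For anyone who wants to reproduce this kind of measurement, a harness along the following lines should work. It is a sketch under assumptions, not the original test_gzip_speed.py: it uses cProfile and Python 3 rather than the profile module the reports came from, it generates its own small sample file instead of the log file mentioned above, and write_sample_log, profile_call and demo are hypothetical helper names. Only gunzip_gzip_open is borrowed from the reports, and its body here is a guess.

```python
import cProfile
import gzip
import io
import os
import pstats
import tempfile


def write_sample_log(path, n_lines):
    # Create a small gzip-compressed "log" to read back line by line.
    with gzip.open(path, "wb") as f:
        for i in range(n_lines):
            f.write(b"line %d of a sample log\n" % i)


def gunzip_gzip_open(path):
    # Read the whole file with readline() and count the lines.
    count = 0
    f = gzip.open(path, "rb")
    try:
        while True:
            line = f.readline()
            if not line:
                break
            count += 1
    finally:
        f.close()
    return count


def profile_call(func, *args):
    # Run func under the profiler and return (result, report text).
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    out = io.StringIO()
    pstats.Stats(profiler, stream=out).sort_stats("stdname").print_stats()
    return result, out.getvalue()


def demo(n_lines=1000):
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample.log.gz")
        write_sample_log(path, n_lines)
        return profile_call(gunzip_gzip_open, path)


if __name__ == "__main__":
    count, report = demo()
    print(count, "lines read")
    print(report)
```

The popen + gunzip -c variant is left out here because it depends on an external gunzip binary being available.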
With patch...
1213961 function calls in 12.188 CPU seconds
Ordered by: standard name
 ncalls tottime percall cumtime percall filename:lineno(function)
      1   0.000   0.000   0.000   0.000 :0(close)
   1159   0.020   0.000   0.020   0.000 :0(crc32)
   1158   0.100   0.000   0.100   0.000 :0(decompress)
      1   0.000   0.000   0.000   0.000 :0(decompressobj)
 200774   0.812   0.000   0.812   0.000 :0(find)
 403865   0.902   0.000   0.902   0.000 :0(len)
   1183   0.000   0.000   0.000   0.000 :0(min)
      2   0.000   0.000   0.000   0.000 :0(ord)
   1173   0.000   0.000   0.000   0.000 :0(read)
     12   0.000   0.000   0.000   0.000 :0(seek)
      1   0.000   0.000   0.000   0.000 :0(setprofile)
     18   0.000   0.000   0.000   0.000 :0(tell)
      2   0.000   0.000   0.000   0.000 :0(unpack)
      1   0.000   0.000  12.188  12.188 <string>:1(?)
      1   0.000   0.000   0.000   0.000 gzip_new.py:156(_init_read)
      1   0.000   0.000   0.000   0.000 gzip_new.py:160(_read_gzip_header)
      3   0.000   0.000   0.000   0.000 gzip_new.py:18(U32)
 200774   2.453   0.000   2.593   0.000 gzip_new.py:207(read)
 200749   2.894   0.000   3.796   0.000 gzip_new.py:239(_unread)
   1166   0.010   0.000   0.140   0.000 gzip_new.py:244(_read)
      1   0.000   0.000   0.000   0.000 gzip_new.py:27(LOWU32)
   1158   0.010   0.000   0.030   0.000 gzip_new.py:294(_add_read_data)
      1   0.000   0.000   0.000   0.000 gzip_new.py:300(_read_eof)
      1   0.000   0.000   0.000   0.000 gzip_new.py:314(close)
      1   0.000   0.000   0.000   0.000 gzip_new.py:327(__del__)
 200749   3.916   0.000  11.117   0.000 gzip_new.py:384(readline)
      2   0.000   0.000   0.000   0.000 gzip_new.py:39(read32)
      1   0.000   0.000   0.000   0.000 gzip_new.py:42(open)
      1   0.000   0.000   0.000   0.000 gzip_new.py:60(__init__)
      1   0.000   0.000  12.188  12.188 profile:0(gunzip_gzip_new_open())
      0   0.000           0.000         profile:0(profiler)
      1   1.071   1.071  12.188  12.188 test_gzip_speed.py:14(gunzip_gzip_new_open)
Without patch...
2073328 function calls in 18.597 CPU seconds
Ordered by: standard name
 ncalls tottime percall cumtime percall filename:lineno(function)
 243820   0.735   0.000   0.735   0.000 :0(append)
      1   0.000   0.000   0.000   0.000 :0(close)
   1159   0.040   0.000   0.040   0.000 :0(crc32)
   1158   0.100   0.000   0.100   0.000 :0(decompress)
      1   0.000   0.000   0.000   0.000 :0(decompressobj)
 243820   0.960   0.000   0.960   0.000 :0(find)
 200749   0.801   0.000   0.801   0.000 :0(join)
 489958   1.330   0.000   1.330   0.000 :0(len)
 243820   0.791   0.000   0.791   0.000 :0(min)
      2   0.000   0.000   0.000   0.000 :0(ord)
   1173   0.030   0.000   0.030   0.000 :0(read)
      6   0.000   0.000   0.000   0.000 :0(seek)
      1   0.000   0.000   0.000   0.000 :0(setprofile)
      6   0.000   0.000   0.000   0.000 :0(tell)
      2   0.000   0.000   0.000   0.000 :0(unpack)
      1   0.000   0.000  18.597  18.597 <string>:1(?)
      1   0.000   0.000   0.000   0.000 gzip.py:154(_init_read)
      1   0.000   0.000   0.000   0.000 gzip.py:158(_read_gzip_header)
      3   0.000   0.000   0.000   0.000 gzip.py:18(U32)
 243820   2.711   0.000   2.921   0.000 gzip.py:205(read)
 200749   3.083   0.000   4.143   0.000 gzip.py:237(_unread)
   1160   0.010   0.000   0.210   0.000 gzip.py:242(_read)
      1   0.000   0.000   0.000   0.000 gzip.py:27(LOWU32)
   1158   0.030   0.000   0.070   0.000 gzip.py:292(_add_read_data)
      1   0.000   0.000   0.000   0.000 gzip.py:298(_read_eof)
      1   0.000   0.000   0.000   0.000 gzip.py:312(close)
      1   0.000   0.000   0.000   0.000 gzip.py:325(__del__)
 200749   6.934   0.000  17.555   0.000 gzip.py:379(readline)
      2   0.000   0.000   0.000   0.000 gzip.py:39(read32)
      1   0.000   0.000   0.000   0.000 gzip.py:42(open)
      1   0.000   0.000   0.000   0.000 gzip.py:59(__init__)
      1   0.000   0.000  18.597  18.597 profile:0(gunzip_gzip_open())
      0   0.000           0.000         profile:0(profiler)
      1   1.042   1.042  18.597  18.597 test_gzip_speed.py:7(gunzip_gzip_open)
Using popen + gunzip -c...
200754 function calls in 4.338 CPU seconds
Ordered by: standard name
 ncalls tottime percall cumtime percall filename:lineno(function)
      1   0.000   0.000   0.000   0.000 :0(popen)
 200749   3.578   0.000   3.578   0.000 :0(readline)
      1   0.000   0.000   0.000   0.000 :0(setprofile)
      1   0.240   0.240   4.338   4.338 <string>:1(?)
      1   0.000   0.000   4.338   4.338 profile:0(gunzip_popen())
      0   0.000           0.000         profile:0(profiler)
      1   0.520   0.520   4.098   4.098 test_gzip_speed.py:21(gunzip_popen)
Date | User | Action | Args
2007-08-23 15:43:49 | admin | link | issue1281707 messages
2007-08-23 15:43:49 | admin | create |