Author victorhooi
Recipients victorhooi
Date 2012-09-25.04:39:31
Message-id <1348547973.67.0.489000428853.issue16034@psf.upfronthosting.co.za>
Content
Hi,

I was writing a script to parse BZ2 blogfiles under Python 2.6, and I noticed that bz2file (http://pypi.python.org/pypi/bz2file) seemed to be much slower than the native bz2 module:

http://stackoverflow.com/questions/12575930/is-python-bz2file-slower-than-bz2

I wrote two dummy scripts that basically just read through the file, one using bz2 and one using bz2file (attached), and timed each under Python 3.3 and Python 2.7:

[vichoo@dev_desktop_vm Desktop]$ time /opt/python3.3/bin/python3.3 testbz2.py > /dev/null

real    0m5.170s
user    0m5.009s
sys     0m0.030s
[vichoo@dev_desktop_vm Desktop]$ time /opt/python3.3/bin/python3.3 testbz2file.py > /dev/null

real    0m5.245s
user    0m4.979s
sys     0m0.060s
[vichoo@dev_desktop_vm Desktop]$ time /opt/python2.7/bin/python2.7 testbz2.py > /dev/null

real    0m0.500s
user    0m0.410s
sys     0m0.030s
[vichoo@dev_desktop_vm Desktop]$ time /opt/python2.7/bin/python2.7 testbz2file.py > /dev/null

real    0m5.801s
user    0m5.529s
sys     0m0.050s

I also executed "echo 3 > /proc/sys/vm/drop_caches" between runs to clear the page cache.
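
For readers without the attachments, here is a minimal sketch of this kind of read-through test. It is not the attached scripts themselves: the names make_sample, count_lines and time_read are hypothetical, the sample data is made up, and it times the loop in-process instead of using the shell's time command. It assumes bz2file.BZ2File can be passed as the opener in place of bz2.BZ2File.

```python
import bz2
import os
import tempfile
import time


def make_sample(path, n_lines):
    """Write n_lines short lines to a bz2 file (hypothetical helper for the sketch)."""
    f = bz2.BZ2File(path, "w")
    try:
        for i in range(n_lines):
            f.write(("sample log line %d\n" % i).encode("ascii"))
    finally:
        f.close()


def count_lines(path, opener=bz2.BZ2File):
    """Iterate over a bz2-compressed file line by line and return the line count.

    Assumption: `opener` may be swapped for bz2file.BZ2File to test that package.
    """
    count = 0
    f = opener(path, "r")
    try:
        for _line in f:
            count += 1
    finally:
        f.close()
    return count


def time_read(path, opener=bz2.BZ2File):
    """Return (line_count, elapsed_seconds) for one pass over the file."""
    start = time.time()
    n = count_lines(path, opener)
    return n, time.time() - start


if __name__ == "__main__":
    fd, sample = tempfile.mkstemp(suffix=".bz2")
    os.close(fd)
    try:
        make_sample(sample, 10000)
        n, elapsed = time_read(sample)
        print("%d lines in %.3f s" % (n, elapsed))
    finally:
        os.remove(sample)
```

Running the same file under each interpreter, and again with the opener swapped, gives the four combinations timed above.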

From this, it appears that bz2 under Python 2.7 is fast (about 0.5 s), but bz2file under 2.7 is slow (about 5.8 s), and that both bz2 and bz2file under Python 3.3 are similarly slow (about 5.2 s).

Obviously, there could be an issue with the methodology above. If not, do you know of any performance regressions in bz2 from Python 2.x to 3.x?

Thanks,
Victor
History
Date                 User        Action  Args
2012-09-25 04:39:33  victorhooi  set     recipients: + victorhooi
2012-09-25 04:39:33  victorhooi  set     messageid: <1348547973.67.0.489000428853.issue16034@psf.upfronthosting.co.za>
2012-09-25 04:39:32  victorhooi  link    issue16034 messages
2012-09-25 04:39:31  victorhooi  create