classification
Title: bz2 module appears slower in Python 3.x versus Python 2.x
Type: performance Stage: resolved
Components: Library (Lib) Versions: Python 3.4, Python 3.3
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: jcea, nadeem.vawda, python-dev, serhiy.storchaka, victorhooi
Priority: normal Keywords: patch

Created on 2012-09-25 04:39 by victorhooi, last changed 2012-10-21 15:01 by python-dev. This issue is now closed.

Files
File name Uploaded Description Edit
testbz2file.py victorhooi, 2012-09-25 04:39
bz2bench.py serhiy.storchaka, 2012-09-26 10:06
bz2_faster_read.patch serhiy.storchaka, 2012-09-26 14:10 review
bz2_bikeshedding.patch serhiy.storchaka, 2012-09-30 22:46 review
Messages (18)
msg171216 - (view) Author: Victor Hooi (victorhooi) Date: 2012-09-25 04:39
Hi,

I was writing a script to parse BZ2 blogfiles under Python 2.6, and I noticed that bz2file (http://pypi.python.org/pypi/bz2file) seemed to perform much slower than with bz2 (native):

http://stackoverflow.com/questions/12575930/is-python-bz2file-slower-than-bz2

I wrote a dummy script that basically just reads through the file, one for bz2 and one for bz2file (attached):

[vichoo@dev_desktop_vm Desktop]$ time /opt/python3.3/bin/python3.3 testbz2.py > /dev/null

real    0m5.170s
user    0m5.009s
sys     0m0.030s
[vichoo@dev_desktop_vm Desktop]$ time /opt/python3.3/bin/python3.3 testbz2file.py > /dev/null

real    0m5.245s
user    0m4.979s
sys     0m0.060s
[vichoo@dev_desktop_vm Desktop]$ time /opt/python2.7/bin/python2.7 testbz2.py > /dev/null

real    0m0.500s
user    0m0.410s
sys     0m0.030s
[vichoo@dev_desktop_vm Desktop]$ time /opt/python2.7/bin/python2.7 testbz2file.py > /dev/null

real    0m5.801s
user    0m5.529s
sys     0m0.050s

I also executed "echo 3 > /proc/sys/vm/drop_cache" between each run.

From this, it appears that Python 2.x's bz2 is fast, but bz2file is slow - and that Python 3.x's bz2 is slow.

Obviously, there could be an issue with the methdology above - however, if not, do you know if there are any performance regressions in bz2 from Python 2.x to 3.x?

Thanks,
Victor
msg171222 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-09-25 05:53
It looks as bz2 in Python 3.3 has bad buffering. Reading by larger chunks shows the same speed in 2.7 and 3.3.
msg171223 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-09-25 06:18
Well, I was able to restore performance (using same code as in zipfile). The patch will be later.
msg171224 - (view) Author: Victor Hooi (victorhooi) Date: 2012-09-25 06:19
Hi,

I didn't have any buffering size set before, so I believe it defaults to 0 (no buffering), right? Wouldn't this be the behaviour on both 2.x and 3.x?

I'm using a 1.5 Mb bzip2 file - I just tried setting buffering to 1000 and 1000000, and it didn't seem to make any noticeable difference to the speed of reading in the file. E.g.:

    f = bz2.BZ2File(filename, 'rb', buffering=1000000)

What sort of values did you use in relation to your compressed file size to get the improvements?

Cheers,
Victor
msg171225 - (view) Author: Victor Hooi (victorhooi) Date: 2012-09-25 06:23
Hi,

Aha, whoops, sorry Serhiy, didn't see your second message - I think you and I posted our last messages at nearly the same time...

Cool, looking forward to your patch =).

Also, is there any chance you could provide a more detailed explanation of what's going on? This is just me being curious about how it all works under the hood...

Cheers,
Victor
msg171231 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-09-25 07:37
> Cool, looking forward to your patch =).

It will take some time to make a completed patch. I don't have much time *right* now. Wait for a few hours.

> Also, is there any chance you could provide a more detailed explanation of
> what's going on? This is just me being curious about how it all works
> under the hood...

When reading from the buffer bz2 does:

  result = buffer[:size]
  buffer = buffer[size:]  # copy a thousands bytes

zipfile does:

  result = buffer[offset:offset+size]
  offset += size  # buffer untouched
msg171332 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-09-26 10:06
Here is a patch and benchmark script. This required more time than I thought.

Benchmark results:

Unpatched:

5.3     read(1)
0.5     read(10)
0.049   read(100)
0.013   read(1000)
0.009   read(10000)
0.0085  read(100000)
0.0082  read()
5       read1(1)
0.47    read1(10)
0.046   read1(100)
0.012   read1(1000)
0.0089  read1(10000)
0.0084  read1(100000)
0.0082  read1()
0.15    readline()

Patched:

0.73    read(1)
0.082   read(10)
0.015   read(100)
0.0089  read(1000)
0.0082  read(10000)
0.0084  read(100000)
0.0083  read()
0.78    read1(1)
0.087   read1(10)
0.016   read1(100)
0.0089  read1(1000)
0.0082  read1(10000)
0.0082  read1(100000)
0.008   read1()
0.14    readline()
msg171341 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-09-26 14:10
Patch updated. Fixed one error. Now readline() optimized too.

Benchmark results (reading python.bz2):

Py2.7   Py3.2   Py3.3   Py3.3
                vanilla patched
4.8     4.8     -       31      read(1)
1       0.94    3.4e+02 3.6     read(10)
0.61    0.6     28      0.87    read(100)
0.58    0.58    3.4     0.61    read(1000)
0.59    0.57    0.88    0.58    read(10000)
0.57    0.56    0.62    0.58    read(100000)
0.57    0.57    0.59    0.58    read()
-       -       -       30      read1(1)
-       -       3.2e+02 3.6     read1(10)
-       -       27      0.88    read1(100)
-       -       3.3     0.61    read1(1000)
-       -       0.88    0.58    read1(10000)
-       -       0.61    0.57    read1(100000)
-       -       0.58    0.57    read1()
1.7     1.7     11      0.67    readline()
msg171601 - (view) Author: Roundup Robot (python-dev) Date: 2012-09-30 02:02
New changeset 1a08f4887cff by Nadeem Vawda in branch '3.3':
Issue #16034: Fix performance regressions in the new BZ2File implementation.
http://hg.python.org/cpython/rev/1a08f4887cff

New changeset cf50a352fe22 by Nadeem Vawda in branch 'default':
Merge #16034: Fix performance regressions in the new BZ2File implementation.
http://hg.python.org/cpython/rev/cf50a352fe22
msg171602 - (view) Author: Nadeem Vawda (nadeem.vawda) * (Python committer) Date: 2012-09-30 02:06
Thanks for the bug report, Victor, and thank you Serhiy for the patch!

Serhiy, would you be OK with me also including this patch in the bz2file package?
msg171618 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-09-30 07:42
> Serhiy, would you be OK with me also including this patch in the bz2file package?

Yes, of course. We can even speed up 1.5 times the reading of small chunks, if we inline _check_can_read() and _read_block().

The same approach is applied for LZMAFile.
msg171628 - (view) Author: Nadeem Vawda (nadeem.vawda) * (Python committer) Date: 2012-09-30 11:16
> Yes, of course.

Awesome. I plan to do a new release for this in the next couple of days.


> We can even speed up 1.5 times the reading of small chunks, if we inline _check_can_read() and _read_block().

Interesting idea, but I don't think it's worthwhile. It looks like this is only a noticeable improvement if size is 10 or 1, and I don't think these are common cases (especially not for users who care about performance). Also, I'm reluctant to have two copies of the code for _read_block(); it makes the code harder to read, and increases the chance of introducing a bug when changing the code.


> The same approach is applied for LZMAFile.

Of course. I'll apply these optimizations to LZMAFile next weekend.
msg171637 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-09-30 13:35
> Also, I'm reluctant to have two copies of the code for _read_block(); it makes the code harder to read, and increases the chance of introducing a bug when changing the code.

Recursive inline _check_can_read() will be enough. Now this check calls 4 Python functions (_check_can_read(), readable(), _check_non_closed(), closed). Recursive inlining only readable() in _check_can_read() is achieved significant but less (about 30%) effect.
msg171677 - (view) Author: Nadeem Vawda (nadeem.vawda) * (Python committer) Date: 2012-09-30 22:17
> Recursive inline _check_can_read() will be enough. Now this check calls 4 Python functions (_check_can_read(), readable(), _check_non_closed(), closed). Recursive inlining only readable() in _check_can_read() is achieved significant but less (about 30%) effect.

I've inlined readable() into _check_can_read() [3.3: 4258248a44c7 | default: abb5c5bde872]. This seems like a good balance between maximizing our performance in edge cases and not turning the code into a mess in the process ;)

Once again, thanks for your contributions!
msg171680 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-09-30 22:46
In fact I have tried other code, a bit faster and more maintainable (see patch).
msg171747 - (view) Author: Nadeem Vawda (nadeem.vawda) * (Python committer) Date: 2012-10-01 21:27
Ah, nice - I didn't think of that optimization. Neater and faster.

I've committed this patch [e6d872b61c57], along with a minor bugfix  [7252f9f95fe6], and another optimization for readline()/readlines() [6d7bf512e0c3]. [merge with default: a19f47d380d2]

If you're wondering why the Roundup Robot didn't update the issue automatically, it's because I made a typo in each of the commit messages. Apparently 16304 isn't the same as 16034. Who would have thought it? :P
msg172421 - (view) Author: Nadeem Vawda (nadeem.vawda) * (Python committer) Date: 2012-10-08 22:18
I've released v0.95 of bz2file, which incorporates all the optimizations discussed here. The performance should be similar to 2.x's bz2 in most cases.

It is still a lot slower when calling read(10) or read(1), but I hope no-one is doing that anywhere where performance is important ;-)

One other note: bz2file's readline() is faster when running on 3.x than on 2.x (and in some cases faster than the 2.x stdlib version). This is probably due to improvements made to io.BufferedIOBase.readline() since 2.7, but I haven't had a chance to investigate this.

Let me know if you have any issues with the new release.
msg173453 - (view) Author: Roundup Robot (python-dev) Date: 2012-10-21 15:01
New changeset cc02eca14526 by Nadeem Vawda in branch 'default':
Issue #16034 follow-up: Apply optimizations to the lzma module.
http://hg.python.org/cpython/rev/cc02eca14526
History
Date User Action Args
2012-10-21 15:01:02python-devsetmessages: + msg173453
2012-10-08 22:18:33nadeem.vawdasetmessages: + msg172421
2012-10-01 21:27:07nadeem.vawdasetmessages: + msg171747
2012-09-30 22:46:34serhiy.storchakasetfiles: + bz2_bikeshedding.patch

messages: + msg171680
2012-09-30 22:17:20nadeem.vawdasetmessages: + msg171677
2012-09-30 13:35:32serhiy.storchakasetmessages: + msg171637
2012-09-30 11:16:09nadeem.vawdasetmessages: + msg171628
2012-09-30 07:42:30serhiy.storchakasetmessages: + msg171618
2012-09-30 02:06:23nadeem.vawdasetstatus: open -> closed
versions: + Python 3.4
messages: + msg171602

resolution: fixed
stage: resolved
2012-09-30 02:02:20python-devsetnosy: + python-dev
messages: + msg171601
2012-09-26 14:10:12serhiy.storchakasetfiles: + bz2_faster_read.patch

messages: + msg171341
2012-09-26 14:09:37serhiy.storchakasetfiles: - bz2_faster_read.patch
2012-09-26 11:14:18serhiy.storchakasetfiles: + bz2_faster_read.patch
2012-09-26 11:12:19serhiy.storchakasetfiles: - bz2_faster_read.patch
2012-09-26 10:06:50serhiy.storchakasetfiles: + bz2_faster_read.patch, bz2bench.py
keywords: + patch
messages: + msg171332
2012-09-26 00:07:04jceasetnosy: + jcea
2012-09-25 07:37:53serhiy.storchakasetmessages: + msg171231
2012-09-25 06:23:31victorhooisetmessages: + msg171225
2012-09-25 06:19:41victorhooisetmessages: + msg171224
2012-09-25 06:18:52serhiy.storchakasetmessages: + msg171223
2012-09-25 05:53:41serhiy.storchakasetnosy: + serhiy.storchaka
messages: + msg171222
components: + Library (Lib), - None
2012-09-25 04:57:30ned.deilysetnosy: + nadeem.vawda
2012-09-25 04:39:33victorhooicreate