Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bz2 module appears slower in Python 3.x versus Python 2.x #60238

Closed
victorhooi mannequin opened this issue Sep 25, 2012 · 18 comments
Closed

bz2 module appears slower in Python 3.x versus Python 2.x #60238

victorhooi mannequin opened this issue Sep 25, 2012 · 18 comments
Labels
performance Performance or resource usage stdlib Python modules in the Lib dir

Comments

@victorhooi
Copy link
Mannequin

victorhooi mannequin commented Sep 25, 2012

BPO 16034
Nosy @jcea, @serhiy-storchaka
Files
  • testbz2file.py
  • bz2bench.py
  • bz2_faster_read.patch
  • bz2_bikeshedding.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2012-09-30.02:06:23.629>
    created_at = <Date 2012-09-25.04:39:33.087>
    labels = ['library', 'performance']
    title = 'bz2 module appears slower in Python 3.x versus Python 2.x'
    updated_at = <Date 2012-10-21.15:01:02.719>
    user = 'https://bugs.python.org/victorhooi'

    bugs.python.org fields:

    activity = <Date 2012-10-21.15:01:02.719>
    actor = 'python-dev'
    assignee = 'none'
    closed = True
    closed_date = <Date 2012-09-30.02:06:23.629>
    closer = 'nadeem.vawda'
    components = ['Library (Lib)']
    creation = <Date 2012-09-25.04:39:33.087>
    creator = 'victorhooi'
    dependencies = []
    files = ['27288', '27311', '27314', '27368']
    hgrepos = []
    issue_num = 16034
    keywords = ['patch']
    message_count = 18.0
    messages = ['171216', '171222', '171223', '171224', '171225', '171231', '171332', '171341', '171601', '171602', '171618', '171628', '171637', '171677', '171680', '171747', '172421', '173453']
    nosy_count = 5.0
    nosy_names = ['jcea', 'nadeem.vawda', 'python-dev', 'serhiy.storchaka', 'victorhooi']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'performance'
    url = 'https://bugs.python.org/issue16034'
    versions = ['Python 3.3', 'Python 3.4']

    @victorhooi
    Copy link
    Mannequin Author

    victorhooi mannequin commented Sep 25, 2012

    Hi,

    I was writing a script to parse BZ2 blogfiles under Python 2.6, and I noticed that bz2file (http://pypi.python.org/pypi/bz2file) seemed to perform much slower than with bz2 (native):

    http://stackoverflow.com/questions/12575930/is-python-bz2file-slower-than-bz2

    I wrote a dummy script that basically just reads through the file, one for bz2 and one for bz2file (attached):

    [vichoo@dev_desktop_vm Desktop]$ time /opt/python3.3/bin/python3.3 testbz2.py > /dev/null

    real 0m5.170s
    user 0m5.009s
    sys 0m0.030s
    [vichoo@dev_desktop_vm Desktop]$ time /opt/python3.3/bin/python3.3 testbz2file.py > /dev/null

    real 0m5.245s
    user 0m4.979s
    sys 0m0.060s
    [vichoo@dev_desktop_vm Desktop]$ time /opt/python2.7/bin/python2.7 testbz2.py > /dev/null

    real 0m0.500s
    user 0m0.410s
    sys 0m0.030s
    [vichoo@dev_desktop_vm Desktop]$ time /opt/python2.7/bin/python2.7 testbz2file.py > /dev/null

    real 0m5.801s
    user 0m5.529s
    sys 0m0.050s

    I also executed "echo 3 > /proc/sys/vm/drop_cache" between each run.

    From this, it appears that Python 2.x's bz2 is fast, but bz2file is slow - and that Python 3.x's bz2 is slow.

    Obviously, there could be an issue with the methdology above - however, if not, do you know if there are any performance regressions in bz2 from Python 2.x to 3.x?

    Thanks,
    Victor

    @victorhooi victorhooi mannequin added the performance Performance or resource usage label Sep 25, 2012
    @serhiy-storchaka
    Copy link
    Member

    It looks as bz2 in Python 3.3 has bad buffering. Reading by larger chunks shows the same speed in 2.7 and 3.3.

    @serhiy-storchaka serhiy-storchaka added the stdlib Python modules in the Lib dir label Sep 25, 2012
    @serhiy-storchaka
    Copy link
    Member

    Well, I was able to restore performance (using same code as in zipfile). The patch will be later.

    @victorhooi
    Copy link
    Mannequin Author

    victorhooi mannequin commented Sep 25, 2012

    Hi,

    I didn't have any buffering size set before, so I believe it defaults to 0 (no buffering), right? Wouldn't this be the behaviour on both 2.x and 3.x?

    I'm using a 1.5 Mb bzip2 file - I just tried setting buffering to 1000 and 1000000, and it didn't seem to make any noticeable difference to the speed of reading in the file. E.g.:

        f = bz2.BZ2File(filename, 'rb', buffering=1000000)

    What sort of values did you use in relation to your compressed file size to get the improvements?

    Cheers,
    Victor

    @victorhooi
    Copy link
    Mannequin Author

    victorhooi mannequin commented Sep 25, 2012

    Hi,

    Aha, whoops, sorry Serhiy, didn't see your second message - I think you and I posted our last messages at nearly the same time...

    Cool, looking forward to your patch =).

    Also, is there any chance you could provide a more detailed explanation of what's going on? This is just me being curious about how it all works under the hood...

    Cheers,
    Victor

    @serhiy-storchaka
    Copy link
    Member

    Cool, looking forward to your patch =).

    It will take some time to make a completed patch. I don't have much time *right* now. Wait for a few hours.

    Also, is there any chance you could provide a more detailed explanation of
    what's going on? This is just me being curious about how it all works
    under the hood...

    When reading from the buffer bz2 does:

      result = buffer[:size]
      buffer = buffer[size:]  # copy a thousands bytes

    zipfile does:

      result = buffer[offset:offset+size]
      offset += size  # buffer untouched

    @serhiy-storchaka
    Copy link
    Member

    Here is a patch and benchmark script. This required more time than I thought.

    Benchmark results:

    Unpatched:

    5.3 read(1)
    0.5 read(10)
    0.049 read(100)
    0.013 read(1000)
    0.009 read(10000)
    0.0085 read(100000)
    0.0082 read()
    5 read1(1)
    0.47 read1(10)
    0.046 read1(100)
    0.012 read1(1000)
    0.0089 read1(10000)
    0.0084 read1(100000)
    0.0082 read1()
    0.15 readline()

    Patched:

    0.73 read(1)
    0.082 read(10)
    0.015 read(100)
    0.0089 read(1000)
    0.0082 read(10000)
    0.0084 read(100000)
    0.0083 read()
    0.78 read1(1)
    0.087 read1(10)
    0.016 read1(100)
    0.0089 read1(1000)
    0.0082 read1(10000)
    0.0082 read1(100000)
    0.008 read1()
    0.14 readline()

    @serhiy-storchaka
    Copy link
    Member

    Patch updated. Fixed one error. Now readline() optimized too.

    Benchmark results (reading python.bz2):

    Py2.7 Py3.2 Py3.3 Py3.3
    vanilla patched
    4.8 4.8 - 31 read(1)
    1 0.94 3.4e+02 3.6 read(10)
    0.61 0.6 28 0.87 read(100)
    0.58 0.58 3.4 0.61 read(1000)
    0.59 0.57 0.88 0.58 read(10000)
    0.57 0.56 0.62 0.58 read(100000)
    0.57 0.57 0.59 0.58 read()

    •   -       -       30      read1(1)
      
    •   -       3.2e+02 3.6     read1(10)
      
    •   -       27      0.88    read1(100)
      
    •   -       3.3     0.61    read1(1000)
      
    •   -       0.88    0.58    read1(10000)
      
    •   -       0.61    0.57    read1(100000)
      
    •   -       0.58    0.57    read1()
      

    1.7 1.7 11 0.67 readline()

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Sep 30, 2012

    New changeset 1a08f4887cff by Nadeem Vawda in branch '3.3':
    Issue bpo-16034: Fix performance regressions in the new BZ2File implementation.
    http://hg.python.org/cpython/rev/1a08f4887cff

    New changeset cf50a352fe22 by Nadeem Vawda in branch 'default':
    Merge bpo-16034: Fix performance regressions in the new BZ2File implementation.
    http://hg.python.org/cpython/rev/cf50a352fe22

    @nadeemvawda
    Copy link
    Mannequin

    nadeemvawda mannequin commented Sep 30, 2012

    Thanks for the bug report, Victor, and thank you Serhiy for the patch!

    Serhiy, would you be OK with me also including this patch in the bz2file package?

    @nadeemvawda nadeemvawda mannequin closed this as completed Sep 30, 2012
    @serhiy-storchaka
    Copy link
    Member

    Serhiy, would you be OK with me also including this patch in the bz2file package?

    Yes, of course. We can even speed up 1.5 times the reading of small chunks, if we inline _check_can_read() and _read_block().

    The same approach is applied for LZMAFile.

    @nadeemvawda
    Copy link
    Mannequin

    nadeemvawda mannequin commented Sep 30, 2012

    Yes, of course.

    Awesome. I plan to do a new release for this in the next couple of days.

    We can even speed up 1.5 times the reading of small chunks, if we inline _check_can_read() and _read_block().

    Interesting idea, but I don't think it's worthwhile. It looks like this is only a noticeable improvement if size is 10 or 1, and I don't think these are common cases (especially not for users who care about performance). Also, I'm reluctant to have two copies of the code for _read_block(); it makes the code harder to read, and increases the chance of introducing a bug when changing the code.

    The same approach is applied for LZMAFile.

    Of course. I'll apply these optimizations to LZMAFile next weekend.

    @serhiy-storchaka
    Copy link
    Member

    Also, I'm reluctant to have two copies of the code for _read_block(); it makes the code harder to read, and increases the chance of introducing a bug when changing the code.

    Recursive inline _check_can_read() will be enough. Now this check calls 4 Python functions (_check_can_read(), readable(), _check_non_closed(), closed). Recursive inlining only readable() in _check_can_read() is achieved significant but less (about 30%) effect.

    @nadeemvawda
    Copy link
    Mannequin

    nadeemvawda mannequin commented Sep 30, 2012

    Recursive inline _check_can_read() will be enough. Now this check calls 4 Python functions (_check_can_read(), readable(), _check_non_closed(), closed). Recursive inlining only readable() in _check_can_read() is achieved significant but less (about 30%) effect.

    I've inlined readable() into _check_can_read() [3.3: 4258248a44c7 | default: abb5c5bde872]. This seems like a good balance between maximizing our performance in edge cases and not turning the code into a mess in the process ;)

    Once again, thanks for your contributions!

    @serhiy-storchaka
    Copy link
    Member

    In fact I have tried other code, a bit faster and more maintainable (see patch).

    @nadeemvawda
    Copy link
    Mannequin

    nadeemvawda mannequin commented Oct 1, 2012

    Ah, nice - I didn't think of that optimization. Neater and faster.

    I've committed this patch [e6d872b61c57], along with a minor bugfix [7252f9f95fe6], and another optimization for readline()/readlines() [6d7bf512e0c3]. [merge with default: a19f47d380d2]

    If you're wondering why the Roundup Robot didn't update the issue automatically, it's because I made a typo in each of the commit messages. Apparently 16304 isn't the same as 16034. Who would have thought it? :P

    @nadeemvawda
    Copy link
    Mannequin

    nadeemvawda mannequin commented Oct 8, 2012

    I've released v0.95 of bz2file, which incorporates all the optimizations discussed here. The performance should be similar to 2.x's bz2 in most cases.

    It is still a lot slower when calling read(10) or read(1), but I hope no-one is doing that anywhere where performance is important ;-)

    One other note: bz2file's readline() is faster when running on 3.x than on 2.x (and in some cases faster than the 2.x stdlib version). This is probably due to improvements made to io.BufferedIOBase.readline() since 2.7, but I haven't had a chance to investigate this.

    Let me know if you have any issues with the new release.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Oct 21, 2012

    New changeset cc02eca14526 by Nadeem Vawda in branch 'default':
    Issue bpo-16034 follow-up: Apply optimizations to the lzma module.
    http://hg.python.org/cpython/rev/cc02eca14526

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    performance Performance or resource usage stdlib Python modules in the Lib dir
    Projects
    None yet
    Development

    No branches or pull requests

    1 participant