Bytes performance regression in python3.3 vs python3.2 #57832

Closed
Lothiraldan mannequin opened this issue Dec 17, 2011 · 7 comments
Labels
performance Performance or resource usage

Comments

@Lothiraldan
Mannequin

Lothiraldan mannequin commented Dec 17, 2011

BPO 13623
Nosy @vstinner, @ezio-melotti, @florentx, @Lothiraldan
Files
  • stringbench_log_cpython3.2: Stringbenchmark log for cpython3.2
  • compare.py: Script used to compute diff between two runs
  • stringbench_log_cpython3.3: String benchmark log for cpython3.3
  • bytes_find.patch
  • bytes_find-2.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = None
    closed_at = <Date 2011-12-18.00:29:18.114>
    created_at = <Date 2011-12-17.18:22:48.932>
    labels = ['performance']
    title = 'Bytes performance regression in python3.3 vs python3.2'
    updated_at = <Date 2011-12-18.00:29:18.113>
    user = 'https://github.com/Lothiraldan'

    bugs.python.org fields:

    activity = <Date 2011-12-18.00:29:18.113>
    actor = 'vstinner'
    assignee = 'collinwinter'
    closed = True
    closed_date = <Date 2011-12-18.00:29:18.114>
    closer = 'vstinner'
    components = ['Benchmarks']
    creation = <Date 2011-12-17.18:22:48.932>
    creator = 'Boris.FELD'
    dependencies = []
    files = ['23999', '24001', '24002', '24010', '24013']
    hgrepos = []
    issue_num = 13623
    keywords = ['patch']
    message_count = 7.0
    messages = ['149689', '149691', '149693', '149718', '149720', '149721', '149725']
    nosy_count = 6.0
    nosy_names = ['collinwinter', 'vstinner', 'ezio.melotti', 'flox', 'Boris.FELD', 'python-dev']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = None
    status = 'closed'
    superseder = None
    type = 'performance'
    url = 'https://bugs.python.org/issue13623'
    versions = ['Python 3.2', 'Python 3.3']

    @Lothiraldan
    Mannequin Author

    Lothiraldan mannequin commented Dec 17, 2011

    Hello everyone, I just ran stringbench on the Python 3.2 and Python 3.3 development versions, and some bytes tests run slower on Python 3.3 than on Python 3.2.

    I have attached the raw output of both runs. I also extracted the most interesting data (every test with a performance regression of more than 20%):

    • (b"A"*1000).rfind(b"A") (*1000): -70.103093%
    • (b"A"*1000).find(b"B") (*1000): -48.372093%
    • (b"A"*1000).rindex(b"A") (*1000): -68.888889%
    • s=b"ABC"*33; (s+b"E"+(b"D"+s)*500).rfind(s+b"E") (*100): -28.982301%
    • (b"C"+b"AB"*300).rfind(b"CA") (*1000): -29.565217%
    • (b"AB"*1000).index(b"AB") (*1000): -68.539326%
    • b"Andrew".endswith(b"w") (*1000): -21.212121%
    • (b"A"*1000).index(b"A") (*1000): -71.111111%
    • (b"BC"+b"AB"*300).rfind(b"BC") (*1000): -42.788462%
    • b"Andrew".startswith(b"Andrew") (*1000): -20.588235%
    • (b"AB"*1000).find(b"AB") (*1000): -69.318182%
    • (b"AB"*1000).rfind(b"AB") (*1000): -69.791667%
    • (b"A"*1000).rfind(b"B") (*1000): -37.988827%
    • (b"AB"*300+"C").index(b"BC") (*1000): -28.750000%
    • b"B" in b"A"*1000 (*1000): -24.479167%
    • (b"AB"*300+"CA").find(b"CA") (*1000): -33.673469%
    • (b"AB"*1000).rindex(b"AB") (*1000): -67.777778%
    • (b"C"+"AB"*300).rindex(b"CA") (*1000): -29.017857%
    • (b"AB"*300+"C").find(b"BC") (*1000): -28.451883%
    • b"Andrew".startswith(b"A") (*1000): -21.212121%
    • b"Andrew".startswith(b"Anders") (*1000): -21.212121%
    • (b"A"*1000).partition(b"B") (*1000): -30.656934%
    • (b"AB"*1000).rfind(b"CA") (*1000): -20.603015%
    • (b"AB"*1000).rfind(b"BC") (*1000): -35.645472%
    • (b"A"*1000).find(b"A") (*1000): -70.454545%

    My environment is:
    Mac OS X 10.6.8
    GCC i686-apple-darwin10-gcc-4.2.1 (GCC) 4.2.1 (Apple Inc. build 5666) (dot 3)
    CPython3.3 revision ea421c534305
    CPython3.2 revision 0b86da9d6964
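
For readers who want to reproduce this kind of comparison, here is a minimal sketch of how two benchmark runs could be diffed and filtered at the 20% threshold. It is not the attached compare.py: the function names, the formula (change measured relative to the newer run's time), and the timings are all assumptions made for illustration.

```python
def percent_change(old_time, new_time):
    """Relative speed change from the old run to the new run, in percent.

    Negative values mean the new run is slower.
    Assumption: the change is measured relative to the new time; the
    compare.py attached to the issue may use a different formula.
    """
    return (old_time - new_time) / new_time * 100.0


def regressions(old_run, new_run, threshold=20.0):
    """Return {test name: change} for tests slower by more than `threshold` percent."""
    slow = {}
    for name, old_time in old_run.items():
        new_time = new_run.get(name)
        if new_time is None or new_time <= 0:
            continue  # test missing from the new run, or unusable timing
        change = percent_change(old_time, new_time)
        if change < -threshold:
            slow[name] = change
    return slow


# Made-up timings in milliseconds, for illustration only (not from the logs).
run_32 = {"find": 0.25, "split": 1.00, "join": 0.90}
run_33 = {"find": 1.00, "split": 1.00, "join": 1.00}

for name, change in regressions(run_32, run_33).items():
    print(f"{name}: {change:.6f}%")
```

Only the hypothetical "find" entry crosses the threshold here; "join" is slower by about 10% and is filtered out.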

    Lothiraldan mannequin added the performance label Dec 17, 2011
    @vstinner
    Member

    Grouped results.

    find (first):

    • (b"A"*1000).find(b"A") : -70%
    • (b"A"*1000).rfind(b"A") : -70%
    • (b"A"*1000).index(b"A") : -71%
    • (b"A"*1000).rindex(b"A") : -68%
    • (b"AB"*1000).index(b"AB") : -68%
    • (b"AB"*1000).rindex(b"AB"): -67%
    • (b"AB"*1000).find(b"AB") : -69%
    • (b"AB"*1000).rfind(b"AB") : -69%
    • b"Andrew".startswith(b"Andrew"): -20%
    • b"Andrew".startswith(b"A") : -21%
    • b"Andrew".startswith(b"Anders"): -21%
    • b"Andrew".endswith(b"w"): -21%

    find (last):

    • (b"AB"*300+"CA").find(b"CA") : -33%
    • (b"C"+"AB"*300).rindex(b"CA") : -29%
    • (b"AB"*300+"C").find(b"BC") : -28%
    • (b"AB"*300+"C").index(b"BC") : -28%
    • (b"C"+b"AB"*300).rfind(b"CA") : -29%
    • (b"BC"+b"AB"*300).rfind(b"BC"): -42%
    • s=b"ABC"*33; (s+b"E"+(b"D"+s)*500).rfind(s+b"E"): -28%

    find (not found):

    • (b"A"*1000).find(b"B") : -48%
    • (b"A"*1000).rfind(b"B") : -37%
    • (b"AB"*1000).rfind(b"CA") : -20%
    • (b"AB"*1000).rfind(b"BC") : -35%

    others:

    • b"B" in b"A"*1000 : -24%
    • (b"A"*1000).partition(b"B") : -30%
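
The three find groups above (match found immediately, match found only at the far end, no match) can be timed individually with the standard timeit module. This is a rough sketch, not stringbench itself; the `cases` table and the repeat counts are arbitrary choices. The "last" case uses b"C" rather than "C", because concatenating bytes and str raises TypeError in Python 3.

```python
import timeit

# One representative expression per group, with the result it should return.
cases = {
    "find (first)": (lambda: (b"A" * 1000).find(b"A"), 0),
    "find (last)": (lambda: (b"AB" * 300 + b"C").find(b"BC"), 599),
    "find (not found)": (lambda: (b"A" * 1000).find(b"B"), -1),
}

for label, (func, expected) in cases.items():
    # Sanity-check the expression before timing it.
    assert func() == expected, label
    # Best of 5 runs of 1000 calls each, to reduce noise.
    best = min(timeit.repeat(func, number=1000, repeat=5))
    print(f"{label}: {best * 1e3:.3f} ms per 1000 calls")
```

Running the same script under two interpreter builds gives per-group timings that can then be compared.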

    @vstinner
    Member

    See also the issue bpo-13621 for results on Unicode.

    @vstinner
    Member

    (b"A"*1000).find(b"A") : -70%

    This one is a performance regression introduced by bpo-12170. The attached patch checks the object type before attempting the conversion to size_t, instead of attempting the conversion first and catching the exception.
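
As a rough Python-level analogy of that change (the real patch is in C and is not reproduced here), the two strategies can be sketched as follows. The helper names `to_byte_eafp` and `to_byte_checked` are hypothetical. The point is that when the argument is a bytes object, which is the common case, the first version pays for raising and catching an exception on every call, while the second never raises.

```python
import operator
import timeit


def to_byte_eafp(arg):
    """Try the integer conversion first; fall back when it raises."""
    try:
        return operator.index(arg)
    except TypeError:
        return None  # not an integer: the caller treats arg as bytes-like


def to_byte_checked(arg):
    """Check the type first, so a bytes argument never triggers an exception."""
    if isinstance(arg, int):
        return arg
    return None  # not an integer: the caller treats arg as bytes-like


needle = b"A"  # the common case: searching for a bytes object, not an int
for func in (to_byte_eafp, to_byte_checked):
    elapsed = timeit.timeit(lambda: func(needle), number=100_000)
    print(f"{func.__name__}: {elapsed:.4f} s per 100000 calls")
```

Note that `isinstance(arg, int)` is the analogue of the first patch only; it does not accept arbitrary objects with an `__index__` method.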

    @vstinner
    Member

    bytes_find.patch only works for Python int objects, not for objects that implement the __index__ method. My new patch (bytes_find-2.patch) uses PyNumber_Check() instead of PyLong_Check() to be more generic. It also fixes a different issue: on overflow, it raises the same ValueError as bytes.find(-1).
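
The behavior described here can be observed from Python on a current CPython 3 interpreter. The sketch below is illustrative only: `ByteValue` and `find_error` are hypothetical helpers, and the script prints the exception type for a very large integer instead of assuming it.

```python
class ByteValue:
    """Hypothetical helper: not an int, but convertible through __index__."""

    def __init__(self, value):
        self.value = value

    def __index__(self):
        return self.value


def find_error(data, needle):
    """Return the name of the exception raised by data.find(needle), or None."""
    try:
        data.find(needle)
    except Exception as exc:
        return type(exc).__name__
    return None


data = b"abc"
print(data.find(98))                # plain int: the byte value of b"b"
print(data.find(ByteValue(98)))     # object implementing __index__
print(find_error(data, -1))         # out of byte range
print(find_error(data, 256))        # out of byte range
print(find_error(data, 2 ** 100))   # too large for a C integer
```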

    @python-dev
    Mannequin

    python-dev mannequin commented Dec 18, 2011

    New changeset 75648db1b3f3 by Victor Stinner in branch 'default':
    Issue bpo-13623: Fix a performance regression introduced by issue bpo-12170 in
    http://hg.python.org/cpython/rev/75648db1b3f3

    @vstinner
    Member

    I re-ran stringbench: no test shows a performance regression of more than 20% anymore.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022