optimization for append-only StringIO #57358

Closed
pitrou opened this issue Oct 11, 2011 · 6 comments
Labels
performance, topic-IO

Comments

@pitrou
Member

pitrou commented Oct 11, 2011

BPO 13149
Nosy @loewis, @terryjreedy, @pitrou, @vstinner
Files
  • stringio.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = None
    closed_at = <Date 2011-11-10.21:52:54.559>
    created_at = <Date 2011-10-11.01:17:34.010>
    labels = ['expert-IO', 'performance']
    title = 'optimization for append-only StringIO'
    updated_at = <Date 2011-11-10.21:52:54.558>
    user = 'https://github.com/pitrou'

    bugs.python.org fields:

    activity = <Date 2011-11-10.21:52:54.558>
    actor = 'pitrou'
    assignee = 'none'
    closed = True
    closed_date = <Date 2011-11-10.21:52:54.559>
    closer = 'pitrou'
    components = ['IO']
    creation = <Date 2011-10-11.01:17:34.010>
    creator = 'pitrou'
    dependencies = []
    files = ['23373']
    hgrepos = []
    issue_num = 13149
    keywords = ['patch']
    message_count = 6.0
    messages = ['145322', '145400', '145404', '145571', '147411', '147412']
    nosy_count = 5.0
    nosy_names = ['loewis', 'terry.reedy', 'pitrou', 'vstinner', 'python-dev']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'performance'
    url = 'https://bugs.python.org/issue13149'
    versions = ['Python 3.3']

    @pitrou
    Member Author

    pitrou commented Oct 11, 2011

    io.StringIO is considerably slower than ''.join() when used for mass concatenation (around 5x slower). This patch brings it to similar performance by deferring construction of the internal buffer until it is needed.

    The problem is that it's very easy to disable the optimization by calling a method other than write() and getvalue().
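For readers who want to reproduce the comparison, here is a minimal timing sketch. The helper names `concat_with_join` and `concat_with_stringio` and the chunk sizes are assumptions made for illustration; they are not taken from the patch. The StringIO path deliberately calls only write() and getvalue(), the two methods that keep the optimization enabled. No timings are quoted here because the ratio depends on the interpreter version.

```python
import io
import timeit


def concat_with_join(chunks):
    """Build one string from many pieces with str.join()."""
    return "".join(chunks)


def concat_with_stringio(chunks):
    """Build one string with io.StringIO, using only write() and getvalue()."""
    buf = io.StringIO()
    for chunk in chunks:
        buf.write(chunk)
    return buf.getvalue()


# Hypothetical workload: many short strings, as in mass concatenation.
chunks = ["line %d\n" % i for i in range(10000)]

# Both approaches must produce the same text before timing means anything.
assert concat_with_join(chunks) == concat_with_stringio(chunks)

for func in (concat_with_join, concat_with_stringio):
    elapsed = timeit.timeit(lambda: func(chunks), number=20)
    print("%-22s %.4f s" % (func.__name__, elapsed))
```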

    @pitrou pitrou added the topic-IO and performance labels Oct 11, 2011
    @loewis
    Mannequin

    loewis mannequin commented Oct 12, 2011

    It would be interesting to see how often the "bad" case triggers, i.e. that a write-only stringio sees any of the other methods invoked at all.
    As a special case, you may consider that .truncate(0) doesn't really need to realize the buffer first.

    I also wonder how much StringIO will be used in practice, as opposed to BytesIO.

    @pitrou
    Member Author

    pitrou commented Oct 12, 2011

    Yes, these are things I've been wondering about. The use-case for an append-only StringIO is obviously overlapping with the use-case for using ''.join(). However, the implementation I'm proposing is better than ''.join() when writing very small strings, since there's a periodic consolidation.
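The periodic consolidation mentioned above can be pictured with a small pure-Python sketch. This is an illustration only: the class name `AppendOnlyText`, the `consolidate_every` threshold, and the list-based storage are all assumptions, and the actual patch is C code in the io module that may work differently. Collapsing the pending pieces into one string from time to time keeps the list short when the writes are very small.

```python
class AppendOnlyText:
    """Hypothetical append-only text accumulator; not the code from the patch."""

    def __init__(self, consolidate_every=1024):
        self._pieces = []
        # Assumed threshold: how many pending pieces trigger a consolidation.
        self._consolidate_every = consolidate_every

    def write(self, s):
        if not isinstance(s, str):
            raise TypeError("string argument expected")
        self._pieces.append(s)
        if len(self._pieces) >= self._consolidate_every:
            self._consolidate()
        return len(s)

    def _consolidate(self):
        # Join all pending pieces into a single string so the list stays short.
        self._pieces = ["".join(self._pieces)]

    def getvalue(self):
        self._consolidate()
        return self._pieces[0]


buf = AppendOnlyText(consolidate_every=4)
for i in range(10):
    buf.write(str(i))
print(buf.getvalue())
```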

    > As a special case, you may consider that .truncate(0) doesn't really need to realize the buffer first.

    True. Also, seek(0) then read() could use the same optimization.

    @terryjreedy
    Member

    Like parts of the Python test suite, I use StringIO to capture print/write output for testing in an output...output/getvalue/reset(seek(0),truncate(0)) cycle. While this enhancement would not currently affect me (as I only do a few prints each cycle), I can easily imagine other cases where it would.
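The capture cycle described above might look roughly like the following sketch. The loop, the variable names, and the use of contextlib.redirect_stdout (available from Python 3.4) are assumptions chosen for illustration, not the test suite's actual code. Each round writes some output, reads it with getvalue(), then resets the buffer with seek(0) and truncate(0).

```python
import io
from contextlib import redirect_stdout

buf = io.StringIO()
captured = []

for round_number in range(3):
    # Output phase: anything printed here lands in the buffer.
    with redirect_stdout(buf):
        print("round", round_number)

    # Read phase: fetch everything written during this round.
    captured.append(buf.getvalue())

    # Reset phase: rewind and discard the contents so the buffer can be reused.
    buf.seek(0)
    buf.truncate(0)

print(captured)
```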

    @python-dev
    Mannequin

    python-dev mannequin commented Nov 10, 2011

    New changeset 8d9a869db675 by Antoine Pitrou in branch 'default':
    Issue bpo-13149: Speed up append-only StringIO objects.
    http://hg.python.org/cpython/rev/8d9a869db675

    @pitrou
    Member Author

    pitrou commented Nov 10, 2011

    I've committed an improved version (which also optimizes seek(0); read()).

    @pitrou pitrou closed this as completed Nov 10, 2011
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022