Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BaseHTTPRequestHandler innefficient when sending HTTP header #47959

Closed
yangman mannequin opened this issue Aug 27, 2008 · 16 comments
Closed

BaseHTTPRequestHandler innefficient when sending HTTP header #47959

yangman mannequin opened this issue Aug 27, 2008 · 16 comments
Assignees
Labels
performance Performance or resource usage stdlib Python modules in the Lib dir

Comments

@yangman
Copy link
Mannequin

yangman mannequin commented Aug 27, 2008

BPO 3709
Nosy @terryjreedy, @gpshead, @orsenthil, @bitdancer
Files
  • issue3709.diff
  • issue3709-from86665.diff
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/orsenthil'
    closed_at = <Date 2011-05-09.15:26:38.582>
    created_at = <Date 2008-08-27.23:43:46.604>
    labels = ['library', 'performance']
    title = 'BaseHTTPRequestHandler innefficient when sending HTTP header'
    updated_at = <Date 2011-05-09.15:37:56.668>
    user = 'https://bugs.python.org/yangman'

    bugs.python.org fields:

    activity = <Date 2011-05-09.15:37:56.668>
    actor = 'Arfrever'
    assignee = 'orsenthil'
    closed = True
    closed_date = <Date 2011-05-09.15:26:38.582>
    closer = 'orsenthil'
    components = ['Library (Lib)']
    creation = <Date 2008-08-27.23:43:46.604>
    creator = 'yangman'
    dependencies = []
    files = ['19704', '19764']
    hgrepos = []
    issue_num = 3709
    keywords = ['patch']
    message_count = 16.0
    messages = ['72051', '73551', '107973', '107989', '121717', '121719', '121721', '121750', '121759', '121815', '121933', '121937', '121943', '122066', '135598', '135599']
    nosy_count = 8.0
    nosy_names = ['terry.reedy', 'gregory.p.smith', 'orsenthil', 'brunogola', 'yangman', 'r.david.murray', 'endian', 'python-dev']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'performance'
    url = 'https://bugs.python.org/issue3709'
    versions = ['Python 3.3']

    @yangman
    Copy link
    Mannequin Author

    yangman mannequin commented Aug 27, 2008

    send_header() in BaseHTTPRequestHandler currently does a write to socket
    every time send_header() is called. This results in excessive number of
    TCP packets being regenerated. Ideally, as much of the HTTP packet is
    buffered as possible, but, at minimum, the header should be sent with a
    single write as there is a convenient end_header() functional available.

    Behaviour is observed under python 2.5, but the related code looks
    identical in SVN trunk.

    Will contribute patch if request is deemed reasonable but no one is
    available to work on it; I just need a few days.

    @yangman yangman mannequin added stdlib Python modules in the Lib dir performance Performance or resource usage labels Aug 27, 2008
    @gpshead
    Copy link
    Member

    gpshead commented Sep 22, 2008

    Buffering up header IO and sending it all at once is always a good thing
    to do. A patch and unit test would be greatly appreciated.

    @gpshead gpshead self-assigned this Sep 22, 2008
    @terryjreedy
    Copy link
    Member

    Does this issue apply to 3.1/2?

    @orsenthil
    Copy link
    Member

    Yes, it is applicable to 3.1 and 3.2 as well. This is definitely a good to have performance improvement.

    @endian
    Copy link
    Mannequin

    endian mannequin commented Nov 20, 2010

    How about this patch?

    @bitdancer
    Copy link
    Member

    It looks good, but as mentioned on IRC it would be nice to have a unit test that confirmed that the headers were being correctly buffered.

    @endian
    Copy link
    Mannequin

    endian mannequin commented Nov 20, 2010

    Working on it...

    @endian
    Copy link
    Mannequin

    endian mannequin commented Nov 20, 2010

    Ready for review. Added test_header_buffering which fails without the BaseHTTPRequestHandler patch and passes with it.

    @brunogola
    Copy link
    Mannequin

    brunogola mannequin commented Nov 20, 2010

    applied the patch on py3k on trunk and everything seems fine (works here)

    @bitdancer
    Copy link
    Member

    Thanks for the patch and test. They look good.

    A doc update is also needed, since the docs are currently written in such a way that buffering the header lines makes the documentation no longer true.

    Also, send_response_only writes directly to the output stream, it doesn't call send_header. This breaks the buffering. It also means that more test cases are needed, since the added test didn't catch this case.

    @orsenthil
    Copy link
    Member

    Fixed this revision 86640.

    • Even though this is an internal optimization,I don't think back-porting is a good idea, because it changes the behavior of certain methods like send_header and end_headers

    • Added the Documentation and other test scenario which was suggested by RDM.

    'endian': In the NEWS entry, I have added your handle. If you want me to mention your Real Name, let me know.

    @orsenthil orsenthil assigned orsenthil and unassigned gpshead Nov 21, 2010
    @bitdancer
    Copy link
    Member

    Senthil, I didn't clearly express my concern about send_response_only. It doesn't look to me like, with buffering in place, that it *should* write directly, it looks to me like it should write to the buffer. Consider specifically the fact that send_response_only is called from send_response_only, which is in turn called from send_error.

    So IMO the unit test should at a minimum test that send_response_only buffers, and ideally it should test that send_error buffers (but that would require a somewhat different test harness since it would need to report that its write method was called only once).

    I think the code you committed is still valid, it just doesn't quite do full header buffering.

    And I agree that this should not be backported.

    @bitdancer bitdancer reopened this Nov 21, 2010
    @orsenthil
    Copy link
    Member

    On Sun, Nov 21, 2010 at 03:15:06PM +0000, R. David Murray wrote:

    Senthil, I didn't clearly express my concern about
    send_response_only. It doesn't look to me like, with buffering in
    place, that it *should* write directly, it looks to me like it
    should write to the buffer.

    Correct. Now that I re-looked at the code, keeping your above
    statement in mind", I tend to agree with you and notice the problem. I
    think, the headers_buffer should be moved higher up in the class so
    that it buffer the headers from all these methods which atm write
    directly to the output stream.

    @endian
    Copy link
    Mannequin

    endian mannequin commented Nov 22, 2010

    This patch (to 86665)...

    • adds buffering to send_response_only
    • adds flush_headers and uses it in end_headers and CGIHTTPRequestHandler.run_cgi
    • tests {send_error,send_response_only,send_header} using a write-counting wfile
    • improves the docs

    Also, I've added my name to this account.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented May 9, 2011

    New changeset 25298224cb25 by Senthil Kumaran in branch 'default':
    Issue bpo-3709: a flush_headers method to BaseHTTPRequestHandler which manages the
    http://hg.python.org/cpython/rev/25298224cb25

    @orsenthil
    Copy link
    Member

    Added the flush_headers method and the test function. this issue can be closed now. Thanks, Andrew Schaaf.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    performance Performance or resource usage stdlib Python modules in the Lib dir
    Projects
    None yet
    Development

    No branches or pull requests

    4 participants