Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simple HTTP Request Handler in http.server does not set a content-length and does not close connections on 301s #88138

Closed
sirosen mannequin opened this issue Apr 29, 2021 · 9 comments
Assignees
Labels
3.9 only security fixes 3.10 only security fixes 3.11 only security fixes stdlib Python modules in the Lib dir

Comments

@sirosen
Copy link
Mannequin

sirosen mannequin commented Apr 29, 2021

BPO 43972
Nosy @orsenthil, @sirosen, @miss-islington
PRs
  • bpo-43972: Set content-length to 0 for http.server.SimpleHTTPRequestHandler 301s #25705
  • [3.9] bpo-43972: Set content-length to 0 for http.server.SimpleHTTPRequestHandler 301s (GH-25705) #25952
  • [3.10] bpo-43972: Set content-length to 0 for http.server.SimpleHTTPRequestHandler 301s (GH-25705) #25953
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/orsenthil'
    closed_at = <Date 2021-05-06.19:57:43.295>
    created_at = <Date 2021-04-29.01:06:37.015>
    labels = ['library', '3.9', '3.10', '3.11']
    title = 'Simple HTTP Request Handler in http.server does not set a content-length and does not close connections on 301s'
    updated_at = <Date 2021-05-06.19:57:43.294>
    user = 'https://github.com/sirosen'

    bugs.python.org fields:

    activity = <Date 2021-05-06.19:57:43.294>
    actor = 'orsenthil'
    assignee = 'orsenthil'
    closed = True
    closed_date = <Date 2021-05-06.19:57:43.295>
    closer = 'orsenthil'
    components = ['Library (Lib)']
    creation = <Date 2021-04-29.01:06:37.015>
    creator = 'sirosen'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 43972
    keywords = ['patch']
    message_count = 9.0
    messages = ['392272', '392610', '392650', '393054', '393057', '393106', '393122', '393140', '393143']
    nosy_count = 3.0
    nosy_names = ['orsenthil', 'sirosen', 'miss-islington']
    pr_nums = ['25705', '25952', '25953']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue43972'
    versions = ['Python 3.9', 'Python 3.10', 'Python 3.11']

    @sirosen
    Copy link
    Mannequin Author

    sirosen mannequin commented Apr 29, 2021

    If you use the http.server simple server and handler to serve a directory, navigating to a directory name without a trailing slash will trigger a 301 to add the trailing slash.

    For example, if "foo/" is a directory under the file server, a GET for "/foo" will receive a 301 Moved Permanently response with a Location header pointing at "/foo/".

    However, the response is sent without a "Content-Length: 0" and the connection is not closed. Unfortunately, certain clients will hang indefinitely and wait under these conditions, without processing the redirect. In my testing, curl 7.68 and Firefox 87 both exhibted this behavior.

    If a Content-Length header is set, these clients behave correctly.
    For example, subclass the handler and add

        def send_response(self, code):
            super().send_response(code)
            if code == HTTPStatus.MOVED_PERMANENTLY:
                self.send_header("Content-Length", "0")

    @sirosen sirosen mannequin added 3.7 (EOL) end of life 3.8 only security fixes 3.9 only security fixes stdlib Python modules in the Lib dir labels Apr 29, 2021
    @orsenthil
    Copy link
    Member

    Hi Stephen,

    Could you give a brief demo of using curl to see the problematic behavior.

    I have testing with a version python and saw that without content length, the curl was behaving properly.

    $mkdir foo
    $#add index.html to directory foo
    $python -m http.server foo
    $ curl -I -L http://0.0.0.0:8082/foo
    HTTP/1.0 301 Moved Permanently
    Server: SimpleHTTP/0.6 Python/3.6.13+
    Date: Sat, 01 May 2021 17:16:14 GMT
    Location: /foo/
    
    HTTP/1.0 200 OK
    Server: SimpleHTTP/0.6 Python/3.6.13+
    Date: Sat, 01 May 2021 17:16:14 GMT
    Content-type: text/html
    Content-Length: 171
    Last-Modified: Sat, 01 May 2021 14:34:48 GMT
    

    And

    curl --version
    curl 7.65.3
    

    @sirosen
    Copy link
    Mannequin Author

    sirosen mannequin commented May 2, 2021

    Ach! Sorry! I didn't even realize this but the issue only arises when you are modifying the handler to set the protocol to HTTP/1.1 .

    In HTTP/1.0 , there's no notion of persistent connections, so the issue does not arise.

    But when the protocol version changes to 1.1 , persistent connections are the norm, and curl will wait indefinitely.

    The following short script is sufficient to reproduce:

    import http.server
    
    
    class CustomRequestHandler(http.server.SimpleHTTPRequestHandler):
        protocol_version = "HTTP/1.1"
    
    
    with http.server.HTTPServer(("", 8000), CustomRequestHandler) as httpd:
        try:
            httpd.serve_forever()
        except KeyboardInterrupt:
            print("\nKeyboard interrupt received, exiting.")
    

    After double-checking the docs, the current doc for protocol_version [1] is quite clear about this:
    "your server must then include an accurate Content-Length header (using send_header()) in all of its responses to clients"

    I still think the fix I proposed is an improvement. Setting a Content-Length isn't forbidden in HTTP/1.0 , and it guarantees good behavior when HTTP/1.1 is used.

    [1] https://docs.python.org/3/library/http.server.html#http.server.BaseHTTPRequestHandler.protocol_version

    @orsenthil
    Copy link
    Member

    Hi Stephen,

    With the example, I couldn't reproduce the problem with curl 7.65.3

    That said, I do recognize that this change is a positive improvement, but I cannot see this a bug-fix (and for client misbehavior, which I couldn't verify).

    To take a call, I think, this change could go into main branch as an "improvement" change than a bug-fix.

    Note: the existing behavior is 10+ year old and don't want to introduce changes if it is not a bug.

    Thanks

    @orsenthil orsenthil added 3.11 only security fixes and removed 3.7 (EOL) end of life 3.8 only security fixes 3.9 only security fixes labels May 5, 2021
    @orsenthil
    Copy link
    Member

    I am also hesitant to fix something that is not broken. So, please share the broken behavior that could be verified, and this will give us greater confidence to commit this patch.

    @sirosen
    Copy link
    Mannequin Author

    sirosen mannequin commented May 6, 2021

    Thanks for working with me to reproduce and understand the issue. I'm a little surprised that with the sample which sets the protocol version you're still not seeing the issue.

    If I create a directory tree, e.g.

    repro
    ├── foo/
    └── server.py

    where server.py is the sample I gave, and run server.py, I find that curl localhost:8000/foo hangs. curl -v includes a message as part of its output which states that it's waiting for the connection to close.

    Full verbose output:

    $ curl localhost:8000/foo -v
    *   Trying 127.0.0.1:8000...
    * TCP_NODELAY set
    * Connected to localhost (127.0.0.1) port 8000 (#0)
    > GET /foo HTTP/1.1
    > Host: localhost:8000
    > User-Agent: curl/7.68.0
    > Accept: */*
    >
    * Mark bundle as not supporting multiuse
    < HTTP/1.1 301 Moved Permanently
    < Server: SimpleHTTP/0.6 Python/3.8.5
    < Date: Thu, 06 May 2021 15:53:13 GMT
    < Location: /foo/
    * no chunk, no close, no size. Assume close to signal end
    <
    ^C
    

    This holds over a few python versions: 3.6.12, 3.8.5, and 3.9.1 . That's probably a good enough sample since the relevant code hasn't changed in the stdlib.

    It's doubtful that the exact version of curl matters for this. I can also see the issue with Firefox opening localhost:8000/foo. It hangs without processing the redirect.

    Running the sample I gave, you're seeing curl exit cleanly? I wonder, with verbose output, maybe there's some useful message that will tell us why it's exiting. Does it not print the message, "no chunk, no close, no size. Assume close to signal end" ?

    Note: the existing behavior is 10+ year old and don't want to introduce changes if it is not a bug.

    I completely understand this stance. I believe it is a bug, but that it's rare enough that hasn't been filed or resolved, in spite of its age.

    Some browsers (e.g. Chrome) process redirects without waiting for a payload, so they would mask the issue. Plus, it only shows up when the protocol_version is set.

    I had a script at work with this issue for over a year without anyone running into the hangs. A coworker who prefers Firefox noticed the issue only recently, and I traced that back to this behavior.
    So even in my case, I didn't stumble across this issue until we'd been using the same test script with the bug in it for a long time.

    @orsenthil
    Copy link
    Member

    Hi Stephen,

    Thanks for the response and the details. I was able to verify the bug!
    I don't know exactly what I was doing previously, but I agree with you that this is a bug and will be fixed with your patch. :)

    Thanks,
    Senthil

    @orsenthil orsenthil added 3.9 only security fixes 3.10 only security fixes labels May 6, 2021
    @miss-islington
    Copy link
    Contributor

    New changeset 058f9b2 by Miss Islington (bot) in branch '3.10':
    bpo-43972: Set content-length to 0 for http.server.SimpleHTTPRequestHandler 301s (GH-25705)
    058f9b2

    @miss-islington
    Copy link
    Contributor

    New changeset b391b9b by Miss Islington (bot) in branch '3.9':
    bpo-43972: Set content-length to 0 for http.server.SimpleHTTPRequestHandler 301s (GH-25705)
    b391b9b

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.9 only security fixes 3.10 only security fixes 3.11 only security fixes stdlib Python modules in the Lib dir
    Projects
    None yet
    Development

    No branches or pull requests

    2 participants