classification
Title: Simple HTTP Request Handler in http.server does not set a content-length and does not close connections on 301s
Type: Stage: resolved
Components: Library (Lib) Versions: Python 3.11, Python 3.10, Python 3.9
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: orsenthil Nosy List: miss-islington, orsenthil, sirosen
Priority: normal Keywords: patch

Created on 2021-04-29 01:06 by sirosen, last changed 2021-05-06 19:57 by orsenthil. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 25705 merged sirosen, 2021-04-29 01:10
PR 25952 merged miss-islington, 2021-05-06 19:26
PR 25953 merged miss-islington, 2021-05-06 19:26
Messages (9)
msg392272 - (view) Author: Stephen Rosen (sirosen) * Date: 2021-04-29 01:06
If you use the `http.server` simple server and handler to serve a directory, navigating to a directory name without a trailing slash will trigger a 301 to add the trailing slash.

For example, if "foo/" is a directory under the file server, a GET for "/foo" will receive a 301 Moved Permanently response with a Location header pointing at "/foo/".

However, the response is sent without a "Content-Length: 0" and the connection is not closed. Unfortunately, certain clients will hang indefinitely and wait under these conditions, without processing the redirect. In my testing, curl 7.68 and Firefox 87 both exhibted this behavior.

If a Content-Length header is set, these clients behave correctly.
For example, subclass the handler and add

    def send_response(self, code):
        super().send_response(code)
        if code == HTTPStatus.MOVED_PERMANENTLY:
            self.send_header("Content-Length", "0")
msg392610 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2021-05-01 17:18
Hi Stephen,

Could you give a brief demo of using curl to see the problematic behavior.

I have testing with a version python and saw that without content length, the curl was behaving properly.

```
$mkdir foo
$#add index.html to directory foo
$python -m http.server foo
$ curl -I -L http://0.0.0.0:8082/foo
HTTP/1.0 301 Moved Permanently
Server: SimpleHTTP/0.6 Python/3.6.13+
Date: Sat, 01 May 2021 17:16:14 GMT
Location: /foo/

HTTP/1.0 200 OK
Server: SimpleHTTP/0.6 Python/3.6.13+
Date: Sat, 01 May 2021 17:16:14 GMT
Content-type: text/html
Content-Length: 171
Last-Modified: Sat, 01 May 2021 14:34:48 GMT
```

And 

```
curl --version
curl 7.65.3
```
msg392650 - (view) Author: Stephen Rosen (sirosen) * Date: 2021-05-02 01:13
Ach! Sorry! I didn't even realize this but the issue only arises when you are modifying the handler to set the protocol to HTTP/1.1 .

In HTTP/1.0 , there's no notion of persistent connections, so the issue does not arise.

But when the protocol version changes to 1.1 , persistent connections are the norm, and curl will wait indefinitely.

The following short script is sufficient to reproduce:
```
import http.server


class CustomRequestHandler(http.server.SimpleHTTPRequestHandler):
    protocol_version = "HTTP/1.1"


with http.server.HTTPServer(("", 8000), CustomRequestHandler) as httpd:
    try:
        httpd.serve_forever()
    except KeyboardInterrupt:
        print("\nKeyboard interrupt received, exiting.")
```

After double-checking the docs, the current doc for `protocol_version` [1] is quite clear about this:
"your server must then include an accurate Content-Length header (using send_header()) in all of its responses to clients"

I still think the fix I proposed is an improvement. Setting a Content-Length isn't forbidden in HTTP/1.0 , and it guarantees good behavior when HTTP/1.1 is used.

[1] https://docs.python.org/3/library/http.server.html#http.server.BaseHTTPRequestHandler.protocol_version
msg393054 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2021-05-05 23:41
Hi Stephen, 

With the example, I couldn't reproduce the problem with curl 7.65.3

That said, I do recognize that this change is a positive improvement, but I cannot see this a bug-fix (and for client misbehavior, which I couldn't verify).

To take a call, I think, this change could go into `main` branch as an "improvement" change than a bug-fix. 

Note: the existing behavior is 10+ year old and don't want to introduce changes if it is not a bug.


Thanks
msg393057 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2021-05-06 00:22
I am also hesitant to fix something that is not broken. So, please share the broken behavior that could be verified, and this will give us greater confidence to commit this patch.
msg393106 - (view) Author: Stephen Rosen (sirosen) * Date: 2021-05-06 16:28
Thanks for working with me to reproduce and understand the issue. I'm a little surprised that with the sample which sets the protocol version you're still not seeing the issue.

If I create a directory tree, e.g.

repro
├── foo/
└── server.py

where `server.py` is the sample I gave, and run `server.py`, I find that `curl localhost:8000/foo` hangs. `curl -v` includes a message as part of its output which states that it's waiting for the connection to close.

Full verbose output:
```
$ curl localhost:8000/foo -v
*   Trying 127.0.0.1:8000...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 8000 (#0)
> GET /foo HTTP/1.1
> Host: localhost:8000
> User-Agent: curl/7.68.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 301 Moved Permanently
< Server: SimpleHTTP/0.6 Python/3.8.5
< Date: Thu, 06 May 2021 15:53:13 GMT
< Location: /foo/
* no chunk, no close, no size. Assume close to signal end
<
^C
```


This holds over a few python versions: 3.6.12, 3.8.5, and 3.9.1 . That's probably a good enough sample since the relevant code hasn't changed in the stdlib.

It's doubtful that the exact version of curl matters for this. I can also see the issue with Firefox opening `localhost:8000/foo`. It hangs without processing the redirect.


Running the sample I gave, you're seeing curl exit cleanly? I wonder, with verbose output, maybe there's some useful message that will tell us why it's exiting. Does it not print the message, "no chunk, no close, no size. Assume close to signal end" ?


> Note: the existing behavior is 10+ year old and don't want to introduce changes if it is not a bug.

I completely understand this stance. I believe it is a bug, but that it's rare enough that hasn't been filed or resolved, in spite of its age.

Some browsers (e.g. Chrome) process redirects without waiting for a payload, so they would mask the issue. Plus, it only shows up when the protocol_version is set.

I had a script at work with this issue for over a year without anyone running into the hangs. A coworker who prefers Firefox noticed the issue only recently, and I traced that back to this behavior.
So even in my case, I didn't stumble across this issue until we'd been using the same test script with the bug in it for a long time.
msg393122 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2021-05-06 18:03
Hi Stephen, 

Thanks for the response and the details. I was able to verify the bug!
I don't know exactly what I was doing previously, but I agree with you that this is a bug and will be fixed with your patch. :) 

Thanks,
Senthil
msg393140 - (view) Author: miss-islington (miss-islington) Date: 2021-05-06 19:48
New changeset 058f9b27d3838f04bbb313074941e9f9946a33bc by Miss Islington (bot) in branch '3.10':
bpo-43972: Set content-length to 0 for http.server.SimpleHTTPRequestHandler 301s (GH-25705)
https://github.com/python/cpython/commit/058f9b27d3838f04bbb313074941e9f9946a33bc
msg393143 - (view) Author: miss-islington (miss-islington) Date: 2021-05-06 19:51
New changeset b391b9b9255697ce6028bb4e7a99c18080aa991c by Miss Islington (bot) in branch '3.9':
bpo-43972: Set content-length to 0 for http.server.SimpleHTTPRequestHandler 301s (GH-25705)
https://github.com/python/cpython/commit/b391b9b9255697ce6028bb4e7a99c18080aa991c
History
Date User Action Args
2021-05-06 19:57:43orsenthilsetstatus: open -> closed
resolution: fixed
stage: patch review -> resolved
2021-05-06 19:51:02miss-islingtonsetmessages: + msg393143
2021-05-06 19:48:43miss-islingtonsetmessages: + msg393140
2021-05-06 19:26:16miss-islingtonsetpull_requests: + pull_request24616
2021-05-06 19:26:11miss-islingtonsetnosy: + miss-islington
pull_requests: + pull_request24615
2021-05-06 18:03:25orsenthilsetversions: + Python 3.9, Python 3.10
2021-05-06 18:03:17orsenthilsetmessages: + msg393122
2021-05-06 16:28:50sirosensetmessages: + msg393106
2021-05-06 00:22:05orsenthilsetmessages: + msg393057
2021-05-05 23:41:22orsenthilsetmessages: + msg393054
versions: + Python 3.11, - Python 3.6, Python 3.7, Python 3.8, Python 3.9
2021-05-02 01:13:20sirosensetmessages: + msg392650
2021-05-01 17:18:26orsenthilsetmessages: + msg392610
2021-04-30 16:05:14orsenthilsetassignee: orsenthil

nosy: + orsenthil
2021-04-29 01:10:01sirosensetkeywords: + patch
stage: patch review
pull_requests: + pull_request24395
2021-04-29 01:06:37sirosencreate