classification
Title: http.server parse_request() bug and error reporting
Type:
Stage: resolved
Components: Library (Lib)
Versions: Python 3.7

process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: barry, martin.panter, python-dev
Priority: normal
Keywords: patch

Created on 2016-10-28 15:57 by barry, last changed 2017-03-31 16:36 by dstufft. This issue is now closed.

Files
File name Uploaded Description
parse-version.patch martin.panter, 2016-10-29 02:20
parse-version.v2.patch martin.panter, 2016-11-01 01:45
Pull Requests
URL Status Linked
PR 552 closed dstufft, 2017-03-31 16:36
Messages (9)
msg279611 - Author: Barry A. Warsaw (barry) * (Python committer) Date: 2016-10-28 15:57
This might also affect other Python versions; I haven't checked, but I know it affects Python 3.5.

In Mailman 3, we have a subclass of WSGIRequestHandler for our REST API, and we got a bug report about an error condition returning a 400 in the body of the response, but still returning an implicit 200 header.

https://gitlab.com/mailman/mailman/issues/288

This is pretty easily reproducible with the following recipe.

$ git clone https://gitlab.com/mailman/mailman.git
$ cd mailman
$ tox -e py35 --notest -r
$ .tox/py35/bin/python3.5 /home/barry/projects/mailman/trunk/.tox/py35/bin/runner --runner=rest:0:1 -C /home/barry/projects/mailman/trunk/var/etc/mailman.cfg 

(Note that you might need to run `.tox/py35/bin/mailman info` first, and of course you'll have to adjust the directories for your own local fork.)

Now, in another shell, do the following:

$ curl -v -i -u restadmin:restpass "http://localhost:8001/3.0/lists/list example.com"

Note specifically that there is a space right before the "example.com" bit.

Take note also that we're generating an HTTP/1.1 request as per curl default.

The response you get is:

*   Trying 127.0.0.1...
* Connected to localhost (127.0.0.1) port 8001 (#0)
* Server auth using Basic with user 'restadmin'
> GET /3.0/lists/list example.com HTTP/1.1
> Host: localhost:8001
> Authorization: Basic cmVzdGFkbWluOnJlc3RwYXNz
> User-Agent: curl/7.50.1
> Accept: */*
> 
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
        "http://www.w3.org/TR/html4/strict.dtd">
<html>
    <head>
        <meta http-equiv="Content-Type" content="text/html;charset=utf-8">
        <title>Error response</title>
    </head>
    <body>
        <h1>Error response</h1>
        <p>Error code: 400</p>
        <p>Message: Bad request syntax ('GET /3.0/lists/list example.com HTTP/1.1').</p>
        <p>Error code explanation: HTTPStatus.BAD_REQUEST - Bad request syntax or unsupported method.</p>
    </body>
</html>
* Connection #0 to host localhost left intact


Notice the lack of response headers, and thus the implicit 200 return even though the proper error code is in the body of the response.  Why does this happen?

Now look at http.server.BaseHTTPRequestHandler.  The default_request_version is "HTTP/0.9".  Given the request, you'd expect to see the version set to "HTTP/1.1", but that doesn't happen because the extra space messes up the request parsing.  parse_request() splits the request line by spaces and when this happens, the wrong number of words shows up.  We get 4 words, thus the 'else:' clause in parse_request() gets triggered.  So far so good.
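The word-count behaviour described above can be modelled in a few lines. This is a simplified sketch of the logic as described in this message, not the actual http.server source; the helper name `classify_request_line` and its return shape are hypothetical, and the blank-line case is left out.

```python
# Mirrors BaseHTTPRequestHandler.default_request_version.
DEFAULT_REQUEST_VERSION = "HTTP/0.9"


def classify_request_line(requestline: str):
    """Return (request_version, error_code) for a request line.

    Hypothetical helper: it models only the word-count branches described
    in the message above. error_code is None when the line is accepted.
    """
    version = DEFAULT_REQUEST_VERSION
    words = requestline.split()
    if len(words) == 3:
        # "GET /path HTTP/1.1": the third word is taken as the version.
        _command, _path, version = words
        return version, None
    if len(words) == 2:
        # "GET /path": an HTTP/0.9-style simple request.
        return version, None
    # Any other word count is a 400, and the version is still the default.
    return version, 400


# The unescaped space yields four words, so the version is never updated.
print(classify_request_line("GET /3.0/lists/list example.com HTTP/1.1"))
# -> ('HTTP/0.9', 400)
```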

This eventually leads us to send_error() and from there into send_response() with the error code (properly 400) and message.  From there we get to .send_response_only() and tracing into this function shows you where things go wrong.

send_response_only() has an explicit test for self.request_version != 'HTTP/0.9'; when that test fails, it adds nothing to the _headers_buffer.  Because the request parsing got an unexpected number of words, request_version *is* still the default HTTP/0.9.  Thus the status line and error headers are never added.
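A minimal model of that check, to make the consequence concrete. The function `status_line` is a hypothetical stand-in written for illustration, not the library method; it only mimics the version test described above.

```python
def status_line(request_version: str, code: int, message: str,
                protocol_version: str = "HTTP/1.0") -> bytes:
    """Return the status-line bytes a handler would buffer (hypothetical helper)."""
    if request_version != "HTTP/0.9":
        line = "%s %d %s\r\n" % (protocol_version, code, message)
        return line.encode("latin-1", "strict")
    # An HTTP/0.9 response has no status line or headers, only a body.
    return b""


# With the default version left in place, the 400 never reaches the wire.
print(status_line("HTTP/0.9", 400, "Bad request syntax"))  # -> b''
print(status_line("HTTP/1.1", 400, "Bad request syntax"))
# -> b'HTTP/1.0 400 Bad request syntax\r\n'
```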

There are several problems here.  Why are the headers never added when the request is HTTP/0.9?  (I haven't read the spec.)  Also, parse_request() isn't setting the request version to 'HTTP/1.1'; it should probably dig out words[-1] and see if that looks like a version string.

Clearly the request isn't properly escaped, but http.server should not be sending an implicit 200 when the request is bogus.
msg279634 - Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-10-28 22:42
I think you should be able to reproduce this without Mailman or tox, by just running “python -m http.server”.
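For a self-contained variant of that reproduction, the sketch below starts a throwaway http.server instance on a free local port and sends the malformed request line over a raw socket. The helper name `fetch_raw` is hypothetical. On an affected version the reply should begin directly with the HTML error body; on a version that parses the trailing version word it should begin with a status line instead.

```python
import socket
import threading
from http.server import HTTPServer, SimpleHTTPRequestHandler


def fetch_raw(request_line: bytes) -> bytes:
    """Send one raw request line to a temporary local server; return the raw reply."""
    # Port 0 asks the OS for any free port.
    server = HTTPServer(("127.0.0.1", 0), SimpleHTTPRequestHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    chunks = []
    try:
        with socket.create_connection(server.server_address, timeout=5) as sock:
            sock.sendall(request_line + b"\r\n\r\n")
            while True:
                try:
                    data = sock.recv(4096)
                except ConnectionResetError:
                    break
                if not data:
                    break  # the server closes the connection after the error
                chunks.append(data)
    finally:
        server.shutdown()
        server.server_close()
    return b"".join(chunks)


if __name__ == "__main__":
    reply = fetch_raw(b"GET /3.0/lists/list example.com HTTP/1.1")
    # Show just the start of the reply: a status line, or the HTML body.
    print(reply[:80])
```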

The problem is that the “HTTP/0.9” protocol Python is assuming does not include a header section, so there is no place to put a 400 status code or header fields. An HTTP 0.9 response is supposed to be only an HTML body; see <https://www.w3.org/Protocols/HTTP/AsImplemented.html> and <https://tools.ietf.org/html/rfc1945#section-6>.

I think we should drop HTTP 0.9 response support from Python’s HTTP server, as well as the attempted but buggy request support. But there was a bit of resistance; see Issue 10721.

Another possibility would be to change default_request_version so that error responses are sent as HTTP 1.0. But there may be more fixes needed for this to continue the buggy HTTP 0.9 support; see Issue 26578.
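An application can try the second option locally by overriding the class attribute in its own handler. This is a sketch of that workaround, not a change to the library; the class name `Http10DefaultHandler` is hypothetical. The presumable trade-off is that genuine HTTP/0.9-style two-word requests would then also receive a status line and headers.

```python
from http.server import BaseHTTPRequestHandler


class Http10DefaultHandler(BaseHTTPRequestHandler):
    """Handler that assumes HTTP/1.0 when the request line gives no usable version."""

    # Error responses for unparseable request lines now carry a status line
    # and headers instead of being sent as a bare HTTP/0.9 body.
    default_request_version = "HTTP/1.0"
```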
msg279637 - Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-10-28 22:57
Parsing the version from words[-1] rather than words[2] may be a minor improvement however.
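As an illustration of the idea, and not the patch that was eventually committed, a parser could take the version from the last word whenever there are at least three words. The helper `version_from_last_word` and its validation rules are assumptions made for this sketch.

```python
def version_from_last_word(requestline: str, default: str = "HTTP/0.9") -> str:
    """Guess the request version from the final word (hypothetical helper)."""
    words = requestline.split()
    # Only trust the last word when there are enough words for it to be a
    # version at all, and when it looks like "HTTP/<major>.<minor>".
    if len(words) >= 3 and words[-1].startswith("HTTP/"):
        major, _, minor = words[-1][len("HTTP/"):].partition(".")
        if major.isdigit() and minor.isdigit():
            return words[-1]
    return default


# The stray space no longer hides the version from the error path.
print(version_from_last_word("GET /3.0/lists/list example.com HTTP/1.1"))
# -> HTTP/1.1
```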
msg279639 - Author: Barry A. Warsaw (barry) * (Python committer) Date: 2016-10-28 23:04
On Oct 28, 2016, at 10:42 PM, Martin Panter wrote:

>I think you should be able to reproduce this without Mailman or tox, by just
>running “python -m http.server”.

Yep, indeed.

>The problem is that the “HTTP/0.9” protocol Python is assuming does not
>include a header section, so there is no place to put a 400 status code or
>header fields. An HTTP 0.9 response is supposed to be only an HTML body; see
><https://www.w3.org/Protocols/HTTP/AsImplemented.html> and
><https://tools.ietf.org/html/rfc1945#section-6>.
>
>I think we should drop HTTP 0.9 response support from Python’s HTTP server,
>as well as the attempted but buggy request support. But there was a bit of
>resistance; see Issue 10721.
>
>Another possibility would be to change default_request_version so that error
>responses are sent as HTTP 1.0. But there may be more fixes needed for this
>to continue the buggy HTTP 0.9 support; see Issue 26578.

I think it may indeed make sense to set default_request_version to 1.0.  It's
probably too late to do that for 3.6, but I do think it makes sense for 3.7.
I'm nosying on the issues you mentioned.

I've proposed essentially that workaround for Mailman.

https://gitlab.com/mailman/mailman/issues/288

We can get away with requiring at least HTTP/1.1 for our REST clients.
msg279640 - Author: Barry A. Warsaw (barry) * (Python committer) Date: 2016-10-28 23:05
On Oct 28, 2016, at 10:57 PM, Martin Panter wrote:

>Parsing the version from words[-1] rather than words[2] may be a minor
>improvement however.

Indeed; I thought about that too.
msg279646 - Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-10-29 02:20
Here is a patch to parse the version from the request in more cases. Since it is more of a cosmetic improvement for handling erroneous requests, I would probably only apply it to the next Python version (3.7 atm). Or do you think it should go into bug fix versions as well?
msg279673 - Author: Barry A. Warsaw (barry) * (Python committer) Date: 2016-10-29 13:53
On Oct 29, 2016, at 02:20 AM, Martin Panter wrote:

>Here is a patch to parse the version from the request in more cases. Since it
>is more of a cosmetic improvement for handling erroneous requests, I would
>probably only apply it to the next Python version (3.7 atm). Or do you think
>it should go into bug fix versions as well?

That would be a release manager (RM) decision.
msg281190 - Author: Roundup Robot (python-dev) Date: 2016-11-19 01:19
New changeset 7c98768368cb by Martin Panter in branch 'default':
Issue #28548: Parse HTTP request version even if too many words received
https://hg.python.org/cpython/rev/7c98768368cb
msg281194 - Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-11-19 03:06
Okay committed to 3.7 for the moment. I think that is all we can reasonably do unless we drop the pseudo-HTTP-0.9 support.
History
Date User Action Args
2017-03-31 16:36:25 dstufft set pull_requests: + pull_request986
2016-11-19 03:06:33 martin.panter set status: open -> closed; resolution: fixed; messages: + msg281194; stage: patch review -> resolved
2016-11-19 01:19:30 python-dev set nosy: + python-dev; messages: + msg281190
2016-11-01 01:45:31 martin.panter set files: + parse-version.v2.patch; versions: + Python 3.7, - Python 3.5
2016-10-29 13:53:40 barry set messages: + msg279673
2016-10-29 02:20:29 martin.panter set files: + parse-version.patch; keywords: + patch; messages: + msg279646; stage: patch review
2016-10-28 23:05:08 barry set messages: + msg279640
2016-10-28 23:04:38 barry set messages: + msg279639
2016-10-28 22:57:56 martin.panter set messages: + msg279637
2016-10-28 22:42:15 martin.panter set nosy: + martin.panter; messages: + msg279634
2016-10-28 15:57:09 barry create