Author barry
Recipients barry
Date 2016-10-28.15:57:08
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1477670229.81.0.419990644201.issue28548@psf.upfronthosting.co.za>
In-reply-to
Content
This might also affect other Python version; I haven't checked, but I know it affects Python 3.5.

In Mailman 3, we have a subclass of WSGIRequestHandler for our REST API, and we got a bug report about an error condition returning a 400 in the body of the response, but still returning an implicit 200 header.

https://gitlab.com/mailman/mailman/issues/288

This is pretty easily reproducible with the following recipe.

$ git clone https://gitlab.com/mailman/mailman.git
$ cd mailman
$ tox -e py35 --notest -r
$ .tox/py35/bin/python3.5 /home/barry/projects/mailman/trunk/.tox/py35/bin/runner --runner=rest:0:1 -C /home/barry/projects/mailman/trunk/var/etc/mailman.cfg 

(Note that you might need to run `.tox/py35/bin/mailman info` first, and of course you'll have to adjust the directories for your own local fork.)

Now, in another shell, do the following:

$ curl -v -i -u restadmin:restpass "http://localhost:8001/3.0/lists/list example.com"

Note specifically that there is a space right before the "example.com" bit.

Take note also that we're generating an HTTP/1.1 request as per curl default.

The response you get is:

*   Trying 127.0.0.1...
* Connected to localhost (127.0.0.1) port 8001 (#0)
* Server auth using Basic with user 'restadmin'
> GET /3.0/lists/list example.com HTTP/1.1
> Host: localhost:8001
> Authorization: Basic cmVzdGFkbWluOnJlc3RwYXNz
> User-Agent: curl/7.50.1
> Accept: */*
> 
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
        "http://www.w3.org/TR/html4/strict.dtd">
<html>
    <head>
        <meta http-equiv="Content-Type" content="text/html;charset=utf-8">
        <title>Error response</title>
    </head>
    <body>
        <h1>Error response</h1>
        <p>Error code: 400</p>
        <p>Message: Bad request syntax ('GET /3.0/lists/list example.com HTTP/1.1').</p>
        <p>Error code explanation: HTTPStatus.BAD_REQUEST - Bad request syntax or unsupported method.</p>
    </body>
</html>
* Connection #0 to host localhost left intact


Notice the lack of response headers, and thus the implicit 200 return even though the proper error code is in the body of the response.  Why does this happen?

Now look at http.server.BaseHTTPRequestHandler.  The default_request_version is "HTTP/0.9".  Given the request, you'd expect to see the version set to "HTTP/1.1", but that doesn't happen because the extra space messes up the request parsing.  parse_request() splits the request line by spaces and when this happens, the wrong number of words shows up.  We get 4 words, thus the 'else:' clause in parse_request() gets triggered.  So far so good.

This eventually leads us to send_error() and from there into send_response() with the error code (properly 400) and message.  From there we get to .send_response_only() and tracing into this function shows you where things go wrong.

send_response_only() has an explicit test on self.request_version != 'HTTP/0.9', in which case it adds nothing to the _header_buffer.  Well, because the request parsing got the unexpected number of words, in fact request_version *is* the default HTTP/0.9.  Thus the error headers are never added.

There are several problems here.  Why are the headers never added when the request is HTTP/0.9?  (I haven't read the spec.)  Also, since parse_request() isn't setting the request version to 'HTTP/1.1'.  It should probably dig out the words[-1] and see if that looks like a version string.

Clearly the request isn't properly escaped, but http.server should not be sending an implicit 200 when the request is bogus.
History
Date User Action Args
2016-10-28 15:57:09barrysetrecipients: + barry
2016-10-28 15:57:09barrysetmessageid: <1477670229.81.0.419990644201.issue28548@psf.upfronthosting.co.za>
2016-10-28 15:57:09barrylinkissue28548 messages
2016-10-28 15:57:08barrycreate