classification
Title: http.server.BaseHTTPRequestHandler.send_error , ability send a detailed response
Type: enhancement Stage: committed/rejected
Components: Library (Lib) Versions: Python 3.4
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: orsenthil Nosy List: Paul.Upchurch, karlcow, orsenthil, python-dev, r.david.murray
Priority: normal Keywords: patch

Created on 2011-09-06 18:41 by Paul.Upchurch, last changed 2013-03-15 15:03 by karlcow. This issue is now closed.

Files
File name Uploaded Description Edit
server.issue12921.patch karlcow, 2013-02-27 00:27
issue-12921-2.patch karlcow, 2013-03-04 16:26 review
issue-12921-3.patch karlcow, 2013-03-05 17:05 review
Messages (21)
msg143646 - (view) Author: Paul Upchurch (Paul.Upchurch) Date: 2011-09-06 18:41
Calling http.server.BaseHTTPRequestHandler.send_error(code,message) with a message that contains a trailing newline does not display properly in Firefox, Chrome.

Listing 1

#!/usr/bin/env python3.2

import http.server
import traceback

class httphandler(http.server.BaseHTTPRequestHandler):
  def do_GET(self):
    try:
      assert(False)
    except:
      self.send_error(500,traceback.format_exc())

if __name__=='__main__':
  addr=('',9000)
  http.server.HTTPServer(addr,httphandler).serve_forever()

Consider Listing 1. A typical programming idiom would be to wrap do_GET with a try/except block that reports an HTTP error with an HTML formatted stack trace. However, when I view this with Firefox and Chrome the error message is unformatted, i.e. raw HTML is displayed.

A simple workaround is to call strip() on the message. This could be suggested to the user in the docs, or added as an automatic preprocessing feature to the library. With strip(), the message is formatted.

Adding or documenting strip() resolves the bug. The remainder of this report is a feature request.

The default error_message_format is probably not what people want for a stack trace. It leaves formatting of whitespace to the HTML which removes the newlines. This is desirable for short, unformatted messages but undesirable for a preformatted traceback. In Listing 2 I give a working solution and suggest including it in the library if the community feels that preformatted messages are common enough to warrant extra attention. I feel it is since "try/except: print traceback" is almost mandatory for error prone internet operations.

Listing 2

#!/usr/bin/env python3.2

import http.server
import traceback

class httphandler(http.server.BaseHTTPRequestHandler):
  def content_aware_error_message_format(self,m):
    oldfmt='<p>Message: %(message)s.\n'
    if oldfmt in self.error_message_format:
      # use <pre> if the message contains a newline internally
      # otherwise let the html control line breaks
      self.error_message_format=self.error_message_format.replace(oldfmt,'<p><pre>%(message)s</pre>\n') if '\n' in m else self.error_message_format.replace(oldfmt,'<p>%(message)s\n')
  def do_GET(self):
    try:
      assert(False)
    except:
      m=traceback.format_exc().strip()
      self.content_aware_error_message_format(m)
      self.send_error(500,m)

if __name__=='__main__':
  addr=('',9000)
  http.server.HTTPServer(addr,httphandler).serve_forever()
msg183100 - (view) Author: karl (karlcow) Date: 2013-02-27 00:05
Testing your code in Listing 1

→ curl -sI http://localhost:9000/
HTTP/1.0 501 Unsupported method ('HEAD')
Server: BaseHTTP/0.6 Python/3.3.0
Date: Tue, 26 Feb 2013 23:38:32 GMT
Content-Type: text/html;charset=utf-8
Connection: close

So this is normal, http://tools.ietf.org/html/draft-ietf-httpbis-p2-semantics-22#section-6.6.2

except that it would be better to use "501 Not Implemented" through the prose is optional. The content-type is also kind of useless. That would deserve to open another bug.


And

→ curl http://localhost:9000/
Server: BaseHTTP/0.6 Python/3.3.0
Date: Tue, 26 Feb 2013 23:39:46 GMT
Content-Type: text/html;charset=utf-8
Connection: close

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
        "http://www.w3.org/TR/html4/strict.dtd">
<html>
    <head>
        <meta http-equiv="Content-Type" content="text/html;charset=utf-8">
        <title>Error response</title>
    </head>
    <body>
        <h1>Error response</h1>
        <p>Error code: 500</p>
        <p>Message: Traceback (most recent call last):
  File "server.py", line 9, in do_GET
    assert(False)
AssertionError
.</p>
        <p>Error code explanation: 500 - Server got itself in trouble.</p>
    </body>
</html>

OK. The server is answering with HTTP/1.0 and then a Traceback… which has nothing to do here.


We can see that in more details with a telnet

→ telnet localhost 9000
Trying ::1...
telnet: connect to address ::1: Connection refused
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
GET / HTTP/1.1
Host: localhost:9000

HTTP/1.0 500 Traceback (most recent call last):
  File "server.py", line 9, in do_GET
    assert(False)
AssertionError

Server: BaseHTTP/0.6 Python/3.3.0
Date: Tue, 26 Feb 2013 23:49:04 GMT
Content-Type: text/html;charset=utf-8
Connection: close

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
        "http://www.w3.org/TR/html4/strict.dtd">
<html>
    <head>
        <meta http-equiv="Content-Type" content="text/html;charset=utf-8">
        <title>Error response</title>
    </head>
    <body>
        <h1>Error response</h1>
        <p>Error code: 500</p>
        <p>Message: Traceback (most recent call last):
  File "server.py", line 9, in do_GET
    assert(False)
AssertionError
.</p>
        <p>Error code explanation: 500 - Server got itself in trouble.</p>
    </body>
</html>

Note that when not sending the traceback with the following code

#!/usr/bin/env python3.3

import http.server
import traceback

class httphandler(http.server.BaseHTTPRequestHandler):
  def do_GET(self):
    try:
      assert(False)
    except:
      self.send_error(500)

if __name__=='__main__':
  addr=('',9000)
  http.server.HTTPServer(addr,httphandler).serve_forever()

Everything is working well.

→ telnet localhost 9000
Trying ::1...
telnet: connect to address ::1: Connection refused
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
GET / HTTP/1.1
Host: localhost:9000

HTTP/1.0 500 Internal Server Error
Server: BaseHTTP/0.6 Python/3.3.0
Date: Tue, 26 Feb 2013 23:51:46 GMT
Content-Type: text/html;charset=utf-8
Connection: close

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
        "http://www.w3.org/TR/html4/strict.dtd">
<html>
    <head>
        <meta http-equiv="Content-Type" content="text/html;charset=utf-8">
        <title>Error response</title>
    </head>
    <body>
        <h1>Error response</h1>
        <p>Error code: 500</p>
        <p>Message: Internal Server Error.</p>
        <p>Error code explanation: 500 - Server got itself in trouble.</p>
    </body>
</html>
Connection closed by foreign host.

I'm looking at http://hg.python.org/cpython/file/3.3/Lib/http/server.py#l404

For the second part of your message. I don't think the two issues should be mixed. Maybe open another bug report.
msg183101 - (view) Author: karl (karlcow) Date: 2013-02-27 00:10
OK I had understand a bit better. 

self.send_error(code, msg) is used for

* The body
* The HTTP header
* and the log

That's bad, very bad.

I do not think it should be used for the HTTP Header at all.
msg183102 - (view) Author: karl (karlcow) Date: 2013-02-27 00:27
ok I modify the code of server.py so that the server doesn't send the private message but the one which is already assigned by the library as it should. If there is a need for customization, there should be two separate variables, but which could lead to the same issues.

After modifications this is what I get.

→ telnet localhost 9000
Trying ::1...
telnet: connect to address ::1: Connection refused
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
GET / HTTP/1.1
Host: localhost:9000

HTTP/1.0 500 Internal Server Error
Server: BaseHTTP/0.6 Python/3.3.0
Date: Wed, 27 Feb 2013 00:21:21 GMT
Content-Type: text/html;charset=utf-8
Connection: close

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
        "http://www.w3.org/TR/html4/strict.dtd">
<html>
    <head>
        <meta http-equiv="Content-Type" content="text/html;charset=utf-8">
        <title>Error response</title>
    </head>
    <body>
        <h1>Error response</h1>
        <p>Error code: 500</p>
        <p>Message: Traceback (most recent call last):
  File "server.py", line 11, in do_GET
    assert(False)
AssertionError
.</p>
        <p>Error code explanation: 500 - Server got itself in trouble.</p>
    </body>
</html>
Connection closed by foreign host.


I joined the patch: server.issue12921.patch
msg183471 - (view) Author: karl (karlcow) Date: 2013-03-04 16:26
orsenthil,

I made a proper patch for it with hg diff. It is very short.
See issue-12921-2.patch
msg183472 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2013-03-04 16:29
Karl,
Ack. The patch looks good. I shall go about with testing + committing this..
msg183475 - (view) Author: karl (karlcow) Date: 2013-03-04 16:46
orsenthil, 

When you have done a patch for testing I would love to see it. I could not find a proper way of doing it. I'm eager to learn. Thanks.
msg183514 - (view) Author: Roundup Robot (python-dev) Date: 2013-03-05 09:23
New changeset c31d700dea8b by Senthil Kumaran in branch '2.7':
Fix Issue #12921: BaseHTTPServer's send_error should send the correct error
http://hg.python.org/cpython/rev/c31d700dea8b

New changeset 5126e62c60af by Senthil Kumaran in branch '3.2':
Fix Issue #12921: BaseHTTPServer's send_error should send the correct error
http://hg.python.org/cpython/rev/5126e62c60af

New changeset 5d76a4746d9d by Senthil Kumaran in branch '3.3':
Fix Issue #12921: BaseHTTPServer's send_error should send the correct error
http://hg.python.org/cpython/rev/5d76a4746d9d

New changeset b87792757ee8 by Senthil Kumaran in branch 'default':
Fix Issue #12921: BaseHTTPServer's send_error should send the correct error
http://hg.python.org/cpython/rev/b87792757ee8
msg183515 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2013-03-05 09:31
Karl - thanks for your telnet debugging session output. Helped realized the problem better. So I had been thinking that sending message is okay. But had not realized that same variable name was used and was causing problem.

I have gone ahead with the fix for now. And for the tests, I think, will be written very similar to test_header_buffering_of_send_error of test_header_bufferring_of_send_error - when send_error is sent and then error asserted.

(Usually, I try to change test and patch together and that's our protocol too. In this case, I only tested it manually, thinking that there is not test coverage for this portion (yet). But later when I peeked into tests, I saw that it would be written for this scenario). Will include and then close this report. The other suggestion of Paul can be included in another report and so the change of the error status to proper 501 Not Implemented.

Thanks.
msg183516 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2013-03-05 09:54
There is a test failure. which leads me to believe that "Error message" may probably be relied upon by some applications.
 

FAIL: test_header_length (test.test_httpservers.BaseHTTPRequestHandlerTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/export/home/buildbot/64bits/3.2.cea-indiana-amd64/build/Lib/test/test_httpservers.py", line 605, in test_header_length
    self.assertEqual(result[0], b'HTTP/1.1 400 Line too long\r\n')
AssertionError: b'HTTP/1.1 400 Bad Request\r\n' != b'HTTP/1.1 400 Line too long\r\n'



I think, I am going revert the change I made first.

Because, looking at documentation and usage, leads me to believe that we are bit assuming much from a misleading example.

send_error(code, message) - message here is optional, but when sent, the server is responsible for displaying it. 

It is *never* advertised that the traceback can be sent as a message.

However, I do think there a chance for improvement and that is the reason, I went ahead with the change.

The improvement can be send_error - response header could be the uniform shortmsg and response body can be the message that is sent by send_error(code, message).  Now, this will be improvement and it should be discussed in that aspect rather than as a fix for anything.

Going ahead with the revert. Sorry for the mistake.
msg183518 - (view) Author: Roundup Robot (python-dev) Date: 2013-03-05 10:25
New changeset 4e6c46d5f77d by Senthil Kumaran in branch '2.7':
Reverting the changeset c31d700dea8b made for Issue #12921
http://hg.python.org/cpython/rev/4e6c46d5f77d

New changeset 637d7c953b10 by Senthil Kumaran in branch '3.2':
Reverting the changeset 5126e62c60af made for Issue #12921
http://hg.python.org/cpython/rev/637d7c953b10

New changeset 84e7a7f6ddb8 by Senthil Kumaran in branch '3.3':
Reverting the changeset 5d76a4746d9d made for Issue #12921
http://hg.python.org/cpython/rev/84e7a7f6ddb8

New changeset b0890674bc21 by Senthil Kumaran in branch 'default':
Reverting the changeset b87792757ee8  made for Issue #12921
http://hg.python.org/cpython/rev/b0890674bc21
msg183520 - (view) Author: karl (karlcow) Date: 2013-03-05 11:57
The culprit is here 
http://hg.python.org/cpython/file/3.3/Lib/http/server.py#l320

That an application or a person decides to send another message is ok. Designer choice. That the library is sending something optional as a test seems more uncomfortable.

The list of codes for 4xx is declared at 
http://tools.ietf.org/html/draft-ietf-httpbis-p2-semantics-22#section-6.5

In HTTP, only the code is mandatory, the text is optional. But indeed you reveal a loophole, which means that someone might want to send a message for the body. That said, I still do not think it should lend in the header for reasons explained previously (serious breakage).

I will come up with something which is fixing the issue without breaking the existent. 

That said, do I open another bug for the test? The test should be fixed too. It should test 400 only. and not the full status-line.

The previous test is also an issue, testing 414. Because the prose in the spec is "URI Too Long" and not "HTTP/1.1 414 Request-URI Too Long"
http://tools.ietf.org/html/draft-ietf-httpbis-p2-semantics-22#section-6.5.12

Hmmm… I see all tests are like this. It should be fixed. Opening bug? Thought?
msg183522 - (view) Author: karl (karlcow) Date: 2013-03-05 12:22
Senthil,


I created another bug reports for the tests which are fragile.
http://bugs.python.org/issue17355
msg183527 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2013-03-05 13:40
Which I closed, sorry.

I agree it should be possible to control the text produced with the error code separately from the text in the body.  I prefer the specific text produced by our http server to the generic message, but I would only be -0 on changing this so that the default was the generic message and the specific message only went in the body.  Since changing that would change the behavior of existing code, this is an enhacement request and not a bug fix.  (That is, it may be a design bug, but those get fixed as enhancements :)

Either way, the httpserver tests will need to be modified to test the API enhancement, and the possible default text change, proposed here.
msg183532 - (view) Author: karl (karlcow) Date: 2013-03-05 14:54
hehe. No hard feelings. I still do not think it is a good idea to test the "error code" and its associated message in the same test. :)

For example, in RFC2616, 414 is defined as 

    414 Request-URI Too Long

and in the HTTP1.1bis (which will not get a new version number) because the goal of the work was to just clarify and not make incompatible changes, 414 is defined as 

    414 URI Too Long

which is fine because the message is optional. With the current tests, it will make it hard to modify :)
http://hg.python.org/cpython/file/3.3/Lib/http/server.py#l627

# More about this specific issue

Right now, send_error groks everything, which is not very good in terms of security and side effects. 
http://hg.python.org/cpython/file/3.3/Lib/http/server.py#l404

def send_error(self, code, message=None):

Then later on:
try:
   shortmsg, longmsg = self.responses[code]

* shortmsg is supposed to bewhat is written in the spec.
* longmsg  is specific to the python project. 

When the message is not defined it takes the shortmsg
http://hg.python.org/cpython/file/3.3/Lib/http/server.py#l421

but if defined it sends everything whatever it is
http://hg.python.org/cpython/file/3.3/Lib/http/server.py#l428

Checking the status-line 
http://tools.ietf.org/html/draft-ietf-httpbis-p1-messaging-22#section-3.1.2

3.1.2. Status Line


   The first line of a response message is the status-line, consisting
   of the protocol version, a space (SP), the status code, another
   space, a possibly-empty textual phrase describing the status code,
   and ending with CRLF.

     status-line = HTTP-version SP status-code SP reason-phrase CRLF

   The status-code element is a 3-digit integer code describing the
   result of the server's attempt to understand and satisfy the client's
   corresponding request.  The rest of the response message is to be
   interpreted in light of the semantics defined for that status code.
   See Section 6 of [Part2] for information about the semantics of
   status codes, including the classes of status code (indicated by the
   first digit), the status codes defined by this specification,
   considerations for the definition of new status codes, and the IANA
   registry.

     status-code    = 3DIGIT

   The reason-phrase element exists for the sole purpose of providing a
   textual description associated with the numeric status code, mostly
   out of deference to earlier Internet application protocols that were
   more frequently used with interactive text clients.  A client SHOULD
   ignore the reason-phrase content.

     reason-phrase  = *( HTAB / SP / VCHAR / obs-text )


* So client SHOULD ignore the reason-phrase
* The reason-phrase can only contains 
** HTAB (horizontal tab)
** SP (space)
** VCHAR (any visible [USASCII] character)
** obs-text
*** As a convention, ABNF rule names prefixed with "obs-" denote "obsolete" grammar rules that appear for historical reasons.
*** obs-text = %x80-FF (range of characters)


A hacky way to mitigate the issue would be to 

1. extract the first line (stop after the first CRLF)
2. sanitize this line. (allow only "*(HTAB / SP / VCHAR / %x80-FF)" )
3. send only this.

Thoughts?
Honestly, I do not find very satisfying. ;) but at least it should not break everything.
msg183533 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2013-03-05 15:41
Your suggestion could probably be applied as a bug fix to maintenance releases, but is it worth the effort?  We don't generally try to protect programmers from themselves in Python.

On the other hand, there should clearly be a way to provide the 'explain' text as well as the message text.

In 3.4 we could add an 'explain' keyword to send_error.  If only explain is specified, we'd use the shortmsg for message.  That way, a programmer writing new code can make it work the way they want it to, while existing code will continue to work the way it used to.
msg183535 - (view) Author: karl (karlcow) Date: 2013-03-05 17:05
R.david

Trying another patch just for understanding if it's what you meant.

What it does:

1. adding an 'explain' keyword
2. escaping the explain message for the body
3. checking for injection of newline in the reason phrase.

For the part 3, TODO:
check if there are other parts in the library if the reason phrase is injected in the message.
msg184132 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2013-03-14 04:42
Karl - I reviewed the patch and like it. Here are some comments.

At first, I did not see the need for both "message" and "explain" in the API both having almost similar purposes. But given that "explain" is already used by send_error and at the moment, un-customizable and I think it is barely useful. Your 3rd patch may improve it's utility value a bit further.

One review comment:

-                   {'code': code, 'message': _quote_html(message), 'explain': explain})
+                   {'code': code, 'message': message, 'explain': _quote_html(explain)})

I would go with _quote_html(message) too. That was fix for XSS security issue.

Also. I will not confuse the newline checking with status line in the same patch /commit. I have a suspicion that RFC says Status Line should be a single line and you went ahead with including that in patch. Is that correct? If you referred to any specific section of RFC, could you point me to that? Thanks!
msg184231 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2013-03-15 14:53
I have added explain argument in 3.4, committed changeset 82669:59292f366b53

I think the short message no trailing line assertion should be taken by next. Either a new issue it if already open or create one just for the purpose.

Thanks Karl for the patch.
msg184232 - (view) Author: karl (karlcow) Date: 2013-03-15 15:00
Thanks! Yes we should open a separate issue. I will look at the current patch and see what I can propose for the next step.
msg184233 - (view) Author: karl (karlcow) Date: 2013-03-15 15:03
just a minor issue you made a mistake in the commit message. The issue number is 12921
History
Date User Action Args
2013-03-15 15:03:04karlcowsetmessages: + msg184233
2013-03-15 15:00:42karlcowsetmessages: + msg184232
2013-03-15 14:53:13orsenthilsetstatus: open -> closed
title: http.server.BaseHTTPRequestHandler.send_error and trailing newline -> http.server.BaseHTTPRequestHandler.send_error , ability send a detailed response
messages: + msg184231

resolution: fixed
stage: committed/rejected
2013-03-14 04:42:25orsenthilsetmessages: + msg184132
2013-03-05 17:05:31karlcowsetfiles: + issue-12921-3.patch

messages: + msg183535
2013-03-05 15:41:28r.david.murraysetmessages: + msg183533
2013-03-05 14:54:31karlcowsetmessages: + msg183532
2013-03-05 13:40:31r.david.murraysetversions: - Python 2.7, Python 3.2, Python 3.3
nosy: + r.david.murray

messages: + msg183527

type: behavior -> enhancement
2013-03-05 13:35:23r.david.murraylinkissue17355 superseder
2013-03-05 12:22:42karlcowsetmessages: + msg183522
2013-03-05 11:57:57karlcowsetmessages: + msg183520
2013-03-05 10:25:47python-devsetmessages: + msg183518
2013-03-05 09:54:15orsenthilsetmessages: + msg183516
2013-03-05 09:31:08orsenthilsetmessages: + msg183515
versions: + Python 2.7, Python 3.3, Python 3.4
2013-03-05 09:24:00python-devsetnosy: + python-dev
messages: + msg183514
2013-03-04 16:46:27karlcowsetmessages: + msg183475
2013-03-04 16:29:05orsenthilsetmessages: + msg183472
2013-03-04 16:26:25karlcowsetfiles: + issue-12921-2.patch

messages: + msg183471
2013-02-27 00:27:02karlcowsetfiles: + server.issue12921.patch
keywords: + patch
messages: + msg183102
2013-02-27 00:10:20karlcowsetmessages: + msg183101
2013-02-27 00:05:13karlcowsetnosy: + karlcow
messages: + msg183100
2011-09-06 23:17:00orsenthilsetassignee: orsenthil

nosy: + orsenthil
2011-09-06 18:41:49Paul.Upchurchcreate