Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

http.server.BaseHTTPRequestHandler.send_error , ability send a detailed response #57130

Closed
PaulUpchurch mannequin opened this issue Sep 6, 2011 · 21 comments
Closed

http.server.BaseHTTPRequestHandler.send_error , ability send a detailed response #57130

PaulUpchurch mannequin opened this issue Sep 6, 2011 · 21 comments
Assignees
Labels
stdlib Python modules in the Lib dir type-feature A feature request or enhancement

Comments

@PaulUpchurch
Copy link
Mannequin

PaulUpchurch mannequin commented Sep 6, 2011

BPO 12921
Nosy @orsenthil, @bitdancer, @karlcow
Files
  • server.issue12921.patch
  • issue-12921-2.patch
  • issue-12921-3.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/orsenthil'
    closed_at = <Date 2013-03-15.14:53:13.629>
    created_at = <Date 2011-09-06.18:41:49.629>
    labels = ['type-feature', 'library']
    title = 'http.server.BaseHTTPRequestHandler.send_error , ability send a detailed response'
    updated_at = <Date 2013-03-15.15:03:04.592>
    user = 'https://bugs.python.org/PaulUpchurch'

    bugs.python.org fields:

    activity = <Date 2013-03-15.15:03:04.592>
    actor = 'karlcow'
    assignee = 'orsenthil'
    closed = True
    closed_date = <Date 2013-03-15.14:53:13.629>
    closer = 'orsenthil'
    components = ['Library (Lib)']
    creation = <Date 2011-09-06.18:41:49.629>
    creator = 'Paul.Upchurch'
    dependencies = []
    files = ['29255', '29306', '29314']
    hgrepos = []
    issue_num = 12921
    keywords = ['patch']
    message_count = 21.0
    messages = ['143646', '183100', '183101', '183102', '183471', '183472', '183475', '183514', '183515', '183516', '183518', '183520', '183522', '183527', '183532', '183533', '183535', '184132', '184231', '184232', '184233']
    nosy_count = 5.0
    nosy_names = ['orsenthil', 'r.david.murray', 'karlcow', 'python-dev', 'Paul.Upchurch']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue12921'
    versions = ['Python 3.4']

    @PaulUpchurch
    Copy link
    Mannequin Author

    PaulUpchurch mannequin commented Sep 6, 2011

    Calling http.server.BaseHTTPRequestHandler.send_error(code,message) with a message that contains a trailing newline does not display properly in Firefox, Chrome.

    Listing 1

    #!/usr/bin/env python3.2

    import http.server
    import traceback
    
    class httphandler(http.server.BaseHTTPRequestHandler):
      def do_GET(self):
        try:
          assert(False)
        except:
          self.send_error(500,traceback.format_exc())
    
    if __name__=='__main__':
      addr=('',9000)
      http.server.HTTPServer(addr,httphandler).serve_forever()

    Consider Listing 1. A typical programming idiom would be to wrap do_GET with a try/except block that reports an HTTP error with an HTML formatted stack trace. However, when I view this with Firefox and Chrome the error message is unformatted, i.e. raw HTML is displayed.

    A simple workaround is to call strip() on the message. This could be suggested to the user in the docs, or added as an automatic preprocessing feature to the library. With strip(), the message is formatted.

    Adding or documenting strip() resolves the bug. The remainder of this report is a feature request.

    The default error_message_format is probably not what people want for a stack trace. It leaves formatting of whitespace to the HTML which removes the newlines. This is desirable for short, unformatted messages but undesirable for a preformatted traceback. In Listing 2 I give a working solution and suggest including it in the library if the community feels that preformatted messages are common enough to warrant extra attention. I feel it is since "try/except: print traceback" is almost mandatory for error prone internet operations.

    Listing 2

    #!/usr/bin/env python3.2

    import http.server
    import traceback
    
    class httphandler(http.server.BaseHTTPRequestHandler):
      def content_aware_error_message_format(self,m):
        oldfmt='<p>Message: %(message)s.\n'
        if oldfmt in self.error_message_format:
          # use <pre> if the message contains a newline internally
          # otherwise let the html control line breaks
          self.error_message_format=self.error_message_format.replace(oldfmt,'<p><pre>%(message)s</pre>\n') if '\n' in m else self.error_message_format.replace(oldfmt,'<p>%(message)s\n')
      def do_GET(self):
        try:
          assert(False)
        except:
          m=traceback.format_exc().strip()
          self.content_aware_error_message_format(m)
          self.send_error(500,m)
    
    if __name__=='__main__':
      addr=('',9000)
      http.server.HTTPServer(addr,httphandler).serve_forever()

    @PaulUpchurch PaulUpchurch mannequin added type-bug An unexpected behavior, bug, or error stdlib Python modules in the Lib dir labels Sep 6, 2011
    @orsenthil orsenthil self-assigned this Sep 6, 2011
    @karlcow
    Copy link
    Mannequin

    karlcow mannequin commented Feb 27, 2013

    Testing your code in Listing 1

    → curl -sI http://localhost:9000/
    HTTP/1.0 501 Unsupported method ('HEAD')
    Server: BaseHTTP/0.6 Python/3.3.0
    Date: Tue, 26 Feb 2013 23:38:32 GMT
    Content-Type: text/html;charset=utf-8
    Connection: close

    So this is normal, http://tools.ietf.org/html/draft-ietf-httpbis-p2-semantics-22#section-6.6.2

    except that it would be better to use "501 Not Implemented" through the prose is optional. The content-type is also kind of useless. That would deserve to open another bug.

    And

    → curl http://localhost:9000/
    Server: BaseHTTP/0.6 Python/3.3.0
    Date: Tue, 26 Feb 2013 23:39:46 GMT
    Content-Type: text/html;charset=utf-8
    Connection: close

    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
            "http://www.w3.org/TR/html4/strict.dtd">
    <html>
        <head>
            <meta http-equiv="Content-Type" content="text/html;charset=utf-8">
            <title>Error response</title>
        </head>
        <body>
            <h1>Error response</h1>
            <p>Error code: 500</p>
            <p>Message: Traceback (most recent call last):
      File "server.py", line 9, in do_GET
        assert(False)
    AssertionError
    .</p>
            <p>Error code explanation: 500 - Server got itself in trouble.</p>
        </body>
    </html>

    OK. The server is answering with HTTP/1.0 and then a Traceback… which has nothing to do here.

    We can see that in more details with a telnet

    → telnet localhost 9000
    Trying ::1...
    telnet: connect to address ::1: Connection refused
    Trying 127.0.0.1...
    Connected to localhost.
    Escape character is '^]'.
    GET / HTTP/1.1
    Host: localhost:9000

    HTTP/1.0 500 Traceback (most recent call last):
      File "server.py", line 9, in do_GET
        assert(False)
    AssertionError

    Server: BaseHTTP/0.6 Python/3.3.0
    Date: Tue, 26 Feb 2013 23:49:04 GMT
    Content-Type: text/html;charset=utf-8
    Connection: close

    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
            "http://www.w3.org/TR/html4/strict.dtd">
    <html>
        <head>
            <meta http-equiv="Content-Type" content="text/html;charset=utf-8">
            <title>Error response</title>
        </head>
        <body>
            <h1>Error response</h1>
            <p>Error code: 500</p>
            <p>Message: Traceback (most recent call last):
      File "server.py", line 9, in do_GET
        assert(False)
    AssertionError
    .</p>
            <p>Error code explanation: 500 - Server got itself in trouble.</p>
        </body>
    </html>

    Note that when not sending the traceback with the following code

    #!/usr/bin/env python3.3

    import http.server
    import traceback
    
    class httphandler(http.server.BaseHTTPRequestHandler):
      def do_GET(self):
        try:
          assert(False)
        except:
          self.send_error(500)
    
    if __name__=='__main__':
      addr=('',9000)
      http.server.HTTPServer(addr,httphandler).serve_forever()

    Everything is working well.

    → telnet localhost 9000
    Trying ::1...
    telnet: connect to address ::1: Connection refused
    Trying 127.0.0.1...
    Connected to localhost.
    Escape character is '^]'.
    GET / HTTP/1.1
    Host: localhost:9000

    HTTP/1.0 500 Internal Server Error
    Server: BaseHTTP/0.6 Python/3.3.0
    Date: Tue, 26 Feb 2013 23:51:46 GMT
    Content-Type: text/html;charset=utf-8
    Connection: close

    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
    "http://www.w3.org/TR/html4/strict.dtd"\>
    <html>
    <head>
    <meta http-equiv="Content-Type" content="text/html;charset=utf-8">
    <title>Error response</title>
    </head>
    <body>
    <h1>Error response</h1>
    <p>Error code: 500</p>
    <p>Message: Internal Server Error.</p>
    <p>Error code explanation: 500 - Server got itself in trouble.</p>
    </body>
    </html>
    Connection closed by foreign host.

    I'm looking at http://hg.python.org/cpython/file/3.3/Lib/http/server.py#l404

    For the second part of your message. I don't think the two issues should be mixed. Maybe open another bug report.

    @karlcow
    Copy link
    Mannequin

    karlcow mannequin commented Feb 27, 2013

    OK I had understand a bit better.

    self.send_error(code, msg) is used for

    • The body
    • The HTTP header
    • and the log

    That's bad, very bad.

    I do not think it should be used for the HTTP Header at all.

    @karlcow
    Copy link
    Mannequin

    karlcow mannequin commented Feb 27, 2013

    ok I modify the code of server.py so that the server doesn't send the private message but the one which is already assigned by the library as it should. If there is a need for customization, there should be two separate variables, but which could lead to the same issues.

    After modifications this is what I get.

    → telnet localhost 9000
    Trying ::1...
    telnet: connect to address ::1: Connection refused
    Trying 127.0.0.1...
    Connected to localhost.
    Escape character is '^]'.
    GET / HTTP/1.1
    Host: localhost:9000

    HTTP/1.0 500 Internal Server Error
    Server: BaseHTTP/0.6 Python/3.3.0
    Date: Wed, 27 Feb 2013 00:21:21 GMT
    Content-Type: text/html;charset=utf-8
    Connection: close

    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
            "http://www.w3.org/TR/html4/strict.dtd">
    <html>
        <head>
            <meta http-equiv="Content-Type" content="text/html;charset=utf-8">
            <title>Error response</title>
        </head>
        <body>
            <h1>Error response</h1>
            <p>Error code: 500</p>
            <p>Message: Traceback (most recent call last):
      File "server.py", line 11, in do_GET
        assert(False)
    AssertionError
    .</p>
            <p>Error code explanation: 500 - Server got itself in trouble.</p>
        </body>
    </html>
    Connection closed by foreign host.

    I joined the patch: server.bpo-12921.patch

    @karlcow
    Copy link
    Mannequin

    karlcow mannequin commented Mar 4, 2013

    orsenthil,

    I made a proper patch for it with hg diff. It is very short.
    See issue-12921-2.patch

    @orsenthil
    Copy link
    Member

    Karl,
    Ack. The patch looks good. I shall go about with testing + committing this..

    @karlcow
    Copy link
    Mannequin

    karlcow mannequin commented Mar 4, 2013

    orsenthil,

    When you have done a patch for testing I would love to see it. I could not find a proper way of doing it. I'm eager to learn. Thanks.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Mar 5, 2013

    New changeset c31d700dea8b by Senthil Kumaran in branch '2.7':
    Fix Issue bpo-12921: BaseHTTPServer's send_error should send the correct error
    http://hg.python.org/cpython/rev/c31d700dea8b

    New changeset 5126e62c60af by Senthil Kumaran in branch '3.2':
    Fix Issue bpo-12921: BaseHTTPServer's send_error should send the correct error
    http://hg.python.org/cpython/rev/5126e62c60af

    New changeset 5d76a4746d9d by Senthil Kumaran in branch '3.3':
    Fix Issue bpo-12921: BaseHTTPServer's send_error should send the correct error
    http://hg.python.org/cpython/rev/5d76a4746d9d

    New changeset b87792757ee8 by Senthil Kumaran in branch 'default':
    Fix Issue bpo-12921: BaseHTTPServer's send_error should send the correct error
    http://hg.python.org/cpython/rev/b87792757ee8

    @orsenthil
    Copy link
    Member

    Karl - thanks for your telnet debugging session output. Helped realized the problem better. So I had been thinking that sending message is okay. But had not realized that same variable name was used and was causing problem.

    I have gone ahead with the fix for now. And for the tests, I think, will be written very similar to test_header_buffering_of_send_error of test_header_bufferring_of_send_error - when send_error is sent and then error asserted.

    (Usually, I try to change test and patch together and that's our protocol too. In this case, I only tested it manually, thinking that there is not test coverage for this portion (yet). But later when I peeked into tests, I saw that it would be written for this scenario). Will include and then close this report. The other suggestion of Paul can be included in another report and so the change of the error status to proper 501 Not Implemented.

    Thanks.

    @orsenthil
    Copy link
    Member

    There is a test failure. which leads me to believe that "Error message" may probably be relied upon by some applications.

    FAIL: test_header_length (test.test_httpservers.BaseHTTPRequestHandlerTestCase)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "/export/home/buildbot/64bits/3.2.cea-indiana-amd64/build/Lib/test/test_httpservers.py", line 605, in test_header_length
        self.assertEqual(result[0], b'HTTP/1.1 400 Line too long\r\n')
    AssertionError: b'HTTP/1.1 400 Bad Request\r\n' != b'HTTP/1.1 400 Line too long\r\n'

    I think, I am going revert the change I made first.

    Because, looking at documentation and usage, leads me to believe that we are bit assuming much from a misleading example.

    send_error(code, message) - message here is optional, but when sent, the server is responsible for displaying it.

    It is *never* advertised that the traceback can be sent as a message.

    However, I do think there a chance for improvement and that is the reason, I went ahead with the change.

    The improvement can be send_error - response header could be the uniform shortmsg and response body can be the message that is sent by send_error(code, message). Now, this will be improvement and it should be discussed in that aspect rather than as a fix for anything.

    Going ahead with the revert. Sorry for the mistake.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Mar 5, 2013

    New changeset 4e6c46d5f77d by Senthil Kumaran in branch '2.7':
    Reverting the changeset c31d700dea8b made for Issue bpo-12921
    http://hg.python.org/cpython/rev/4e6c46d5f77d

    New changeset 637d7c953b10 by Senthil Kumaran in branch '3.2':
    Reverting the changeset 5126e62c60af made for Issue bpo-12921
    http://hg.python.org/cpython/rev/637d7c953b10

    New changeset 84e7a7f6ddb8 by Senthil Kumaran in branch '3.3':
    Reverting the changeset 5d76a4746d9d made for Issue bpo-12921
    http://hg.python.org/cpython/rev/84e7a7f6ddb8

    New changeset b0890674bc21 by Senthil Kumaran in branch 'default':
    Reverting the changeset b87792757ee8 made for Issue bpo-12921
    http://hg.python.org/cpython/rev/b0890674bc21

    @karlcow
    Copy link
    Mannequin

    karlcow mannequin commented Mar 5, 2013

    The culprit is here
    http://hg.python.org/cpython/file/3.3/Lib/http/server.py#l320

    That an application or a person decides to send another message is ok. Designer choice. That the library is sending something optional as a test seems more uncomfortable.

    The list of codes for 4xx is declared at
    http://tools.ietf.org/html/draft-ietf-httpbis-p2-semantics-22#section-6.5

    In HTTP, only the code is mandatory, the text is optional. But indeed you reveal a loophole, which means that someone might want to send a message for the body. That said, I still do not think it should lend in the header for reasons explained previously (serious breakage).

    I will come up with something which is fixing the issue without breaking the existent.

    That said, do I open another bug for the test? The test should be fixed too. It should test 400 only. and not the full status-line.

    The previous test is also an issue, testing 414. Because the prose in the spec is "URI Too Long" and not "HTTP/1.1 414 Request-URI Too Long"
    http://tools.ietf.org/html/draft-ietf-httpbis-p2-semantics-22#section-6.5.12

    Hmmm… I see all tests are like this. It should be fixed. Opening bug? Thought?

    @karlcow
    Copy link
    Mannequin

    karlcow mannequin commented Mar 5, 2013

    Senthil,

    I created another bug reports for the tests which are fragile.
    http://bugs.python.org/issue17355

    @bitdancer
    Copy link
    Member

    Which I closed, sorry.

    I agree it should be possible to control the text produced with the error code separately from the text in the body. I prefer the specific text produced by our http server to the generic message, but I would only be -0 on changing this so that the default was the generic message and the specific message only went in the body. Since changing that would change the behavior of existing code, this is an enhacement request and not a bug fix. (That is, it may be a design bug, but those get fixed as enhancements :)

    Either way, the httpserver tests will need to be modified to test the API enhancement, and the possible default text change, proposed here.

    @bitdancer bitdancer added type-feature A feature request or enhancement and removed type-bug An unexpected behavior, bug, or error labels Mar 5, 2013
    @karlcow
    Copy link
    Mannequin

    karlcow mannequin commented Mar 5, 2013

    hehe. No hard feelings. I still do not think it is a good idea to test the "error code" and its associated message in the same test. :)

    For example, in RFC2616, 414 is defined as

    414 Request-URI Too Long
    

    and in the HTTP1.1bis (which will not get a new version number) because the goal of the work was to just clarify and not make incompatible changes, 414 is defined as

    414 URI Too Long
    

    which is fine because the message is optional. With the current tests, it will make it hard to modify :)
    http://hg.python.org/cpython/file/3.3/Lib/http/server.py#l627

    # More about this specific issue

    Right now, send_error groks everything, which is not very good in terms of security and side effects.
    http://hg.python.org/cpython/file/3.3/Lib/http/server.py#l404

    def send_error(self, code, message=None):

    Then later on:
    try:
    shortmsg, longmsg = self.responses[code]

    • shortmsg is supposed to bewhat is written in the spec.
    • longmsg is specific to the python project.

    When the message is not defined it takes the shortmsg
    http://hg.python.org/cpython/file/3.3/Lib/http/server.py#l421

    but if defined it sends everything whatever it is
    http://hg.python.org/cpython/file/3.3/Lib/http/server.py#l428

    Checking the status-line
    http://tools.ietf.org/html/draft-ietf-httpbis-p1-messaging-22#section-3.1.2

    3.1.2. Status Line

    The first line of a response message is the status-line, consisting
    of the protocol version, a space (SP), the status code, another
    space, a possibly-empty textual phrase describing the status code,
    and ending with CRLF.

     status-line = HTTP-version SP status-code SP reason-phrase CRLF
    

    The status-code element is a 3-digit integer code describing the
    result of the server's attempt to understand and satisfy the client's
    corresponding request. The rest of the response message is to be
    interpreted in light of the semantics defined for that status code.
    See Section 6 of [Part2] for information about the semantics of
    status codes, including the classes of status code (indicated by the
    first digit), the status codes defined by this specification,
    considerations for the definition of new status codes, and the IANA
    registry.

     status-code    = 3DIGIT
    

    The reason-phrase element exists for the sole purpose of providing a
    textual description associated with the numeric status code, mostly
    out of deference to earlier Internet application protocols that were
    more frequently used with interactive text clients. A client SHOULD
    ignore the reason-phrase content.

     reason-phrase  = *( HTAB / SP / VCHAR / obs-text )
    
    • So client SHOULD ignore the reason-phrase
    • The reason-phrase can only contains
      ** HTAB (horizontal tab)
      ** SP (space)
      ** VCHAR (any visible [USASCII] character)
      ** obs-text
      *** As a convention, ABNF rule names prefixed with "obs-" denote "obsolete" grammar rules that appear for historical reasons.
      *** obs-text = %x80-FF (range of characters)

    A hacky way to mitigate the issue would be to

    1. extract the first line (stop after the first CRLF)
    2. sanitize this line. (allow only "*(HTAB / SP / VCHAR / %x80-FF)" )
    3. send only this.

    Thoughts?
    Honestly, I do not find very satisfying. ;) but at least it should not break everything.

    @bitdancer
    Copy link
    Member

    Your suggestion could probably be applied as a bug fix to maintenance releases, but is it worth the effort? We don't generally try to protect programmers from themselves in Python.

    On the other hand, there should clearly be a way to provide the 'explain' text as well as the message text.

    In 3.4 we could add an 'explain' keyword to send_error. If only explain is specified, we'd use the shortmsg for message. That way, a programmer writing new code can make it work the way they want it to, while existing code will continue to work the way it used to.

    @karlcow
    Copy link
    Mannequin

    karlcow mannequin commented Mar 5, 2013

    R.david

    Trying another patch just for understanding if it's what you meant.

    What it does:

    1. adding an 'explain' keyword
    2. escaping the explain message for the body
    3. checking for injection of newline in the reason phrase.

    For the part 3, TODO:
    check if there are other parts in the library if the reason phrase is injected in the message.

    @orsenthil
    Copy link
    Member

    Karl - I reviewed the patch and like it. Here are some comments.

    At first, I did not see the need for both "message" and "explain" in the API both having almost similar purposes. But given that "explain" is already used by send_error and at the moment, un-customizable and I think it is barely useful. Your 3rd patch may improve it's utility value a bit further.

    One review comment:

    •               {'code': code, 'message': \_quote_html(message), 'explain': explain})
      

    + {'code': code, 'message': message, 'explain': _quote_html(explain)})

    I would go with _quote_html(message) too. That was fix for XSS security issue.

    Also. I will not confuse the newline checking with status line in the same patch /commit. I have a suspicion that RFC says Status Line should be a single line and you went ahead with including that in patch. Is that correct? If you referred to any specific section of RFC, could you point me to that? Thanks!

    @orsenthil
    Copy link
    Member

    I have added explain argument in 3.4, committed changeset 82669:59292f366b53

    I think the short message no trailing line assertion should be taken by next. Either a new issue it if already open or create one just for the purpose.

    Thanks Karl for the patch.

    @orsenthil orsenthil changed the title http.server.BaseHTTPRequestHandler.send_error and trailing newline http.server.BaseHTTPRequestHandler.send_error , ability send a detailed response Mar 15, 2013
    @karlcow
    Copy link
    Mannequin

    karlcow mannequin commented Mar 15, 2013

    Thanks! Yes we should open a separate issue. I will look at the current patch and see what I can propose for the next step.

    @karlcow
    Copy link
    Mannequin

    karlcow mannequin commented Mar 15, 2013

    just a minor issue you made a mistake in the commit message. The issue number is 12921

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    stdlib Python modules in the Lib dir type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    2 participants