Web.py wsgiserver3.py raises TypeError when CSS file is not found #77844

Closed
lepeuvedic mannequin opened this issue May 27, 2018 · 10 comments
Labels
3.7 (EOL) end of life 3.8 only security fixes easy stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@lepeuvedic
Mannequin

lepeuvedic mannequin commented May 27, 2018

BPO 33663
Nosy @orsenthil, @pitrou, @serhiy-storchaka, @zooba, @miss-islington, @lepeuvedic, @ValeriyaSinevich
PRs
  • bpo-33663: Convert content length to string before putting to header #7754
  • [3.7] bpo-33663: Convert content length to string before putting to header (GH-7754) #7781
  • [3.6] bpo-33663: Convert content length to string before putting to header (GH-7754) #7782
  • Files
  • server.py: http/server.py from Python library
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = None
    closed_at = <Date 2018-06-19.14:45:10.995>
    created_at = <Date 2018-05-27.23:03:22.376>
    labels = ['3.7', '3.8', 'type-bug', 'library', 'easy']
    title = 'Web.py wsgiserver3.py raises TypeError when CSS file is not found'
    updated_at = <Date 2018-06-19.14:45:10.994>
    user = 'https://github.com/lepeuvedic'

    bugs.python.org fields:

    activity = <Date 2018-06-19.14:45:10.994>
    actor = 'steve.dower'
    assignee = 'none'
    closed = True
    closed_date = <Date 2018-06-19.14:45:10.995>
    closer = 'steve.dower'
    components = ['Library (Lib)']
    creation = <Date 2018-05-27.23:03:22.376>
    creator = 'jmlp'
    dependencies = []
    files = ['47616']
    hgrepos = []
    issue_num = 33663
    keywords = ['patch', 'easy']
    message_count = 10.0
    messages = ['317813', '318489', '318500', '319770', '319773', '319775', '319915', '319916', '319917', '319973']
    nosy_count = 7.0
    nosy_names = ['orsenthil', 'pitrou', 'serhiy.storchaka', 'steve.dower', 'miss-islington', 'jmlp', 'valer']
    pr_nums = ['7754', '7781', '7782']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue33663'
    versions = ['Python 3.6', 'Python 3.7', 'Python 3.8']

    @lepeuvedic
    Mannequin Author

    lepeuvedic mannequin commented May 27, 2018

    When running the built-in web server of web.py, the following error messages appear when the HTTP client fetches a nonexistent CSS file:

    TypeError('WSGI response header value 469 is not of type str.',)
    Traceback (most recent call last):
      File "/home/jm/miniconda3/envs/REST/lib/python3.6/site-packages/web/wsgiserver/wsgiserver3.py", line 1089, in communicate
        req.respond()
      File "/home/jm/miniconda3/envs/REST/lib/python3.6/site-packages/web/wsgiserver/wsgiserver3.py", line 877, in respond
        self.server.gateway(self).respond()
      File "/home/jm/miniconda3/envs/REST/lib/python3.6/site-packages/web/wsgiserver/wsgiserver3.py", line 1982, in respond
        for chunk in response:
      File "/home/jm/miniconda3/envs/REST/lib/python3.6/site-packages/web/httpserver.py", line 267, in __iter__
        self.start_response(self.status, self.headers)
      File "/home/jm/miniconda3/envs/REST/lib/python3.6/site-packages/web/httpserver.py", line 320, in xstart_response
        out = start_response(status, response_headers, *args)
      File "/home/jm/miniconda3/envs/REST/lib/python3.6/site-packages/web/wsgiserver/wsgiserver3.py", line 2029, in start_response
        "WSGI response header value %r is not of type str." % v)
    TypeError: WSGI response header value 469 is not of type str.

    The faulty header is added by the Python standard library, in http/server.py.
    According to the comments in that file, the code was added in version 3.4.

    Lines 467-471 in the attached file:
    body = content.encode('UTF-8', 'replace')
    self.send_header("Content-Type", self.error_content_type)
    self.send_header('Content-Length', int(len(body)))
    self.end_headers()

    The value for 'Content-Length' is passed as an 'int', but only a 'str' is acceptable.
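The failing check can be sketched as follows. This is a minimal illustration, not web.py's actual code: `check_header_values` is a hypothetical helper, and only the error message is taken from the traceback above.

```python
def check_header_values(headers):
    """Reject any header value that is not a str (hypothetical helper)."""
    for name, value in headers:
        if not isinstance(value, str):
            raise TypeError(
                "WSGI response header value %r is not of type str." % value)
    return headers


# An int Content-Length, like the one in the quoted lines, is rejected.
try:
    check_header_values([("Content-Type", "text/html"),
                         ("Content-Length", 469)])
    error = None
except TypeError as exc:
    error = str(exc)

# The same header passes once the value has been converted to str.
accepted = check_header_values([("Content-Length", str(469))])
```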

    In the latest revision of 'server.py', the same code appears at line 453.
    A possible correction is:

                body = content.encode('UTF-8', 'replace')
                self.send_header("Content-Type", self.error_content_type)
                self.send_header('Content-Length', str(int(len(body))))
            self.end_headers()

    @lepeuvedic lepeuvedic mannequin added stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels May 27, 2018
    @ned-deily ned-deily added easy 3.7 (EOL) end of life 3.8 only security fixes labels Jun 1, 2018
    @serhiy-storchaka
    Member

    This code was added in bpo-16088. Taking into account all other uses of send_header(), and that len() always returns an integer, it seems that "int" should be replaced with "str".

    @lepeuvedic
    Mannequin Author

    lepeuvedic mannequin commented Jun 2, 2018

    The exception is raised in the start_response function provided by web.py's WSGIGateway class in wsgiserver3.py:1997.

    # According to PEP-3333, when using Python 3, the response status
    # and headers must be bytes masquerading as unicode; that is, they
    # must be of type "str" but are restricted to code points in the
    # "latin-1" set.

    Therefore, header values must be strings whenever start_response is called. WSGI servers must accumulate headers in some data structure and must call the supplied "start_response" function only after they have gathered all the headers and converted all the values to strings.
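The PEP 3333 requirement quoted above (type "str", code points limited to latin-1) could be enforced with a small conversion step. This is only a sketch: `to_wsgi_header_value` is a hypothetical name, not part of web.py or the standard library.

```python
def to_wsgi_header_value(value):
    """Return value as a str that can be encoded as latin-1 (hypothetical helper)."""
    text = str(value)
    # Raises UnicodeEncodeError if any code point falls outside latin-1.
    text.encode("latin-1", "strict")
    return text


# An int such as a content length becomes an acceptable header value.
length = to_wsgi_header_value(469)

# A value containing a code point outside latin-1 is refused.
try:
    to_wsgi_header_value("caf\u00e9 \u2603")
    rejected = False
except UnicodeEncodeError:
    rejected = True
```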

    The fault I observed is not strictly speaking caused by a bug in Python lib "server.py". Rather, it is a component interaction failure caused by inadequately defined semantics. The interaction between web.py and server.py is quite complex, and no component is faulty when considered alone.

    Let me explain:

    Response and header management in server.py is handled by three methods of class BaseHTTPRequestHandler:

    • send_response: puts the response in the buffer
    • send_header: converts to string and adds to the buffer
      ("%s: %s\r\n" % (keyword, value)).encode('latin-1', 'strict')
    • end_headers: flushes the buffer to the socket

    This implementation is correct even if send_header is called with an int value.
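A quick way to see why an int is harmless here: the formatting expression quoted in the send_header bullet produces the same bytes for an int and for a str. The wrapper function `format_header` is a hypothetical name used only for this illustration.

```python
def format_header(keyword, value):
    # Same formatting expression as in the send_header bullet above;
    # "%s" applies str() to the value, whatever its type.
    return ("%s: %s\r\n" % (keyword, value)).encode("latin-1", "strict")


from_int = format_header("Content-Length", 469)
from_str = format_header("Content-Length", "469")
```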

    Now, web.py's application.py defines a "wsgi(env, start_resp)" function, which gets plugged into the CherryPy WSGI HTTP server.

    The server is an instance of class wsgiserver.CherryPyWSGIServer created in httpserver.py:169 (digging deeper, actually at line 195).
    This server is implemented as an HTTPServer configured to use gateways of class WSGIGateway_10 to handle requests.

    A gateway is basically an instance of a class that is initialized with an HTTPRequest instance and has a "respond" method. Of course, WSGIGateway implements "respond" as described in the WSGI standard: it calls the WSGI-compliant web app, which is a function(environ, start_response(status, headers)) returning an iterator (for chunked HTTP responses). The start_response function provided by class WSGIGateway is where the failure occurs.
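For readers unfamiliar with the calling convention described above, here is a minimal sketch of a WSGI application and of the callable a gateway passes to it. All names (`not_found_app`, `captured`, the stand-in `start_response`) are hypothetical and unrelated to web.py's code.

```python
def not_found_app(environ, start_response):
    """A tiny WSGI app that always answers 404 with str header values."""
    body = b"not found"
    headers = [("Content-Type", "text/plain"),
               ("Content-Length", str(len(body)))]
    start_response("404 Not Found", headers)
    return [body]  # an iterable of bytes chunks


captured = {}


def start_response(status, headers, exc_info=None):
    # Stand-in for the gateway's callable: it only records what it is given.
    captured["status"] = status
    captured["headers"] = headers


environ = {"REQUEST_METHOD": "GET", "PATH_INFO": "/static/missing.css"}
chunks = list(not_found_app(environ, start_response))
```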

    When the application calls web.py's app.run(), the function runwsgi in web.py's wsgi.py gets called. This function determines whether it receives requests via CGI or directly. In my case, it starts an HTTP server using web.py's runsimple function (file httpserver.py:158).

    This function never returns. It runs the CherryPyWSGIServer, but first wraps the wsgi function in two WSGI middleware callables. Both are defined in web.py's httpserver.py file. The interesting one is StaticMiddleware (line 281). Its role is to intercept URLs starting with /static, as is the case with my missing CSS file. In order to serve those static resources quickly, its implementation uses StaticApp (a WSGI function serving static content, defined at line 225), which extends Python's SimpleHTTPRequestHandler. That's where the two libraries connect.
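The routing idea behind such a middleware can be sketched as below. This is an assumption-laden illustration, not web.py's implementation: `PrefixMiddleware`, `make_app` and `call` are hypothetical names.

```python
class PrefixMiddleware:
    """Send requests under a URL prefix to a separate WSGI app (hypothetical)."""

    def __init__(self, app, static_app, prefix="/static/"):
        self.app = app
        self.static_app = static_app
        self.prefix = prefix

    def __call__(self, environ, start_response):
        path = environ.get("PATH_INFO", "")
        if path.startswith(self.prefix):
            return self.static_app(environ, start_response)
        return self.app(environ, start_response)


def make_app(label):
    """Build a trivial WSGI app whose body identifies which app answered."""
    def app(environ, start_response):
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [label]
    return app


def call(app, path):
    # Drive the app with a minimal environ and a start_response that does nothing.
    return b"".join(app({"PATH_INFO": path}, lambda status, headers: None))


wrapped = PrefixMiddleware(make_app(b"dynamic"), make_app(b"static"))
```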

    StaticApp changes the way headers are processed by overriding send_response, send_header and end_headers. This means that, when StaticApp calls SimpleHTTPRequestHandler.send_head() to send the HEAD part of the response, the headers are managed by the overridden methods. When send_head() finds out that my CSS file does not exist and calls send_error(), a Content-Length header gets written, but it is not converted to a string, because the overridden implementation just stores the header name and value in a list as they come.

    When StaticApp has finished gathering headers using Python's send_head(), it immediately calls the start_response provided by WSGIGateway, where the failure occurs.
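The interaction described above can be approximated with the standard library alone. The sketch below is not web.py's StaticApp: `CollectingHandler` and `headers_list` are hypothetical names, and the constructor deliberately skips the base class initializer (which expects a socket). It overrides the three header methods so that values are stored exactly as send_error() passes them. On an interpreter that includes the fix from this issue, every collected value is a str; on an affected version, the Content-Length value would be an int.

```python
import io
from http.server import BaseHTTPRequestHandler


class CollectingHandler(BaseHTTPRequestHandler):
    """Stores headers in a list as they come (hypothetical illustration)."""

    def __init__(self):
        # Deliberately not calling the base __init__, which needs a socket.
        self.command = "GET"
        self.path = "/static/missing.css"
        self.request_version = "HTTP/1.1"
        self.wfile = io.BytesIO()
        self.status = None
        self.headers_list = []

    def send_response(self, code, message=None):
        self.status = code

    def send_header(self, keyword, value):
        # No conversion here: the value keeps whatever type the caller used.
        self.headers_list.append((keyword, value))

    def end_headers(self):
        pass

    def log_message(self, format, *args):
        # Silence logging, which would otherwise need a client address.
        pass


handler = CollectingHandler()
handler.send_error(404)
value_types = {name: type(value).__name__ for name, value in handler.headers_list}
```

Inspecting `value_types` shows which Python type each header value had when it reached the overridden send_header.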

    The bug in Python is not strictly that send_header gets called with an int in send_error. Rather, it is a documentation bug which fails to mention that send_header/end_headers MUST CONVERT TO STRING and ENCODE IN LATIN-1.

    Therefore, the correction I proposed is still incomplete, because the combination of web.py and server.py still does not properly encode the headers after the correction.

    As a conclusion I would say that:

    • In the Python library, the bug is a documentation bug: the documentation fails to indicate that send_header and/or end_headers can receive header names or values which are not strings and not encoded in strict latin-1, and that it is their responsibility to convert and encode them.
    • In web.py, the bug is that the implementation of the overridden methods fails to properly encode the headers.

    Of course, changing int to str does no harm and makes everything more resilient, but does not fix the underlying bug.

    @ValeriyaSinevich
    Mannequin

    ValeriyaSinevich mannequin commented Jun 16, 2018

    Hello!

    I created a PR for this, but I am new to the process, so I don't know what to do about the "no news entry" error. Could someone please help me with the next steps?

    @tirkarthi
    Member

    Congratulations! All the changes and patches are collected in Misc/NEWS.d file. You can find more information and the process to create one here : https://devguide.python.org/committing/?highlight=blurb#what-s-new-and-news-entries .

    Happy hacking :)

    @zooba
    Member

    zooba commented Jun 16, 2018

    For the NEWS entry, I'd suggest putting "Library" for the category and using your commit message for the text at the end.

    You should also fill out the CLA form posted in your PR by the bot. While we *can* overlook it for very simple changes, best to get it done anyway.

    As to Jean-Marc's comment, what you are describing is basically a bug in a third-party library and should be reported to them. As far as I can tell, http.server handles the encoding just fine, and if the web package has overridden part of it then it needs to override the rest in order to send headers properly. We can be less surprising to subclasses by only sending str, but if they're handling the conversion to bytes we can't really help any more than that.

    @zooba
    Member

    zooba commented Jun 18, 2018

    New changeset b36b0a3 by Steve Dower (ValeriyaSinevich) in branch 'master':
    bpo-33663: Convert content length to string before putting to header (GH-7754)
    b36b0a3

    @miss-islington
    Contributor

    New changeset 53d1e9f by Miss Islington (bot) in branch '3.7':
    bpo-33663: Convert content length to string before putting to header (GH-7754)
    53d1e9f

    @miss-islington
    Contributor

    New changeset 8c19a44 by Miss Islington (bot) in branch '3.6':
    bpo-33663: Convert content length to string before putting to header (GH-7754)
    8c19a44

    @zooba
    Member

    zooba commented Jun 19, 2018

    Congratulations on your first contribution, Valeriya! Let us know when you find something else you'd like to work on (you can add my name to the "Nosy List" on the bug and it'll get my attention).

    @zooba zooba closed this as completed Jun 19, 2018
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022