classification
Title: Web.py wsgiserver3.py raises TypeError when CSS file is not found
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.8, Python 3.7, Python 3.6
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: jmlp, miss-islington, orsenthil, pitrou, serhiy.storchaka, steve.dower, valer
Priority: normal Keywords: easy, patch

Created on 2018-05-27 23:03 by jmlp, last changed 2018-06-19 14:45 by steve.dower. This issue is now closed.

Files
File name Uploaded Description
server.py jmlp, 2018-05-27 23:03 http/server.py from Python library
Pull Requests
URL Status Linked
PR 7754 merged valer, 2018-06-16 18:14
PR 7781 merged miss-islington, 2018-06-18 21:19
PR 7782 merged miss-islington, 2018-06-18 21:20
Messages (10)
msg317813 - (view) Author: Jean-Marc Le Peuvedic (jmlp) * Date: 2018-05-27 23:03
When running web.py's built-in web server, the following error appears when the HTTP client requests a nonexistent CSS file:

TypeError('WSGI response header value 469 is not of type str.',)
Traceback (most recent call last):
  File "/home/jm/miniconda3/envs/REST/lib/python3.6/site-packages/web/wsgiserver/wsgiserver3.py", line 1089, in communicate
    req.respond()
  File "/home/jm/miniconda3/envs/REST/lib/python3.6/site-packages/web/wsgiserver/wsgiserver3.py", line 877, in respond
    self.server.gateway(self).respond()
  File "/home/jm/miniconda3/envs/REST/lib/python3.6/site-packages/web/wsgiserver/wsgiserver3.py", line 1982, in respond
    for chunk in response:
  File "/home/jm/miniconda3/envs/REST/lib/python3.6/site-packages/web/httpserver.py", line 267, in __iter__
    self.start_response(self.status, self.headers)
  File "/home/jm/miniconda3/envs/REST/lib/python3.6/site-packages/web/httpserver.py", line 320, in xstart_response
    out = start_response(status, response_headers, *args)
  File "/home/jm/miniconda3/envs/REST/lib/python3.6/site-packages/web/wsgiserver/wsgiserver3.py", line 2029, in start_response
    "WSGI response header value %r is not of type str." % v)
TypeError: WSGI response header value 469 is not of type str.

The faulty header is added by the Python standard library module http/server.py.
According to the comments, the error was introduced in version 3.4.

Lines 467-471 in the attached file:
            body = content.encode('UTF-8', 'replace')
            self.send_header("Content-Type", self.error_content_type)
            self.send_header('Content-Length', int(len(body)))
        self.end_headers()

The value for 'Content-Length' is passed as an 'int', but only a 'str' is acceptable.

In the latest revision of 'server.py', the same code appears at line 453.
A possible correction is:

            body = content.encode('UTF-8', 'replace')
            self.send_header("Content-Type", self.error_content_type)
            self.send_header('Content-Length', str(int(len(body))))
        self.end_headers()
msg318489 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-06-02 04:45
This code was added in issue16088. Considering all other uses of send_header(), and that len() always returns an integer, it seems "int" should be replaced with "str".
msg318500 - (view) Author: Jean-Marc Le Peuvedic (jmlp) * Date: 2018-06-02 15:10
The exception is raised in the start_response function provided by web.py's WSGIGateway class in wsgiserver3.py:1997.

# According to PEP 3333, when using Python 3, the response status
# and headers must be bytes masquerading as unicode; that is, they
# must be of type "str" but are restricted to code points in the
# "latin-1" set.

Therefore, header values must be strings whenever start_response is called. WSGI servers must accumulate headers in some data structure and must call the supplied "start_response" function only once they have gathered all the headers and converted all the values to strings.
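As an illustration of the PEP 3333 requirement quoted above, here is a minimal sketch of the kind of check a WSGI server can apply inside start_response. The function name check_wsgi_headers is hypothetical and is not web.py's actual code; only the error message wording is taken from the traceback in this issue.

```python
def check_wsgi_headers(response_headers):
    """Raise TypeError unless every header name and value is a str.

    Hypothetical helper, not part of web.py or the standard library.
    """
    for name, value in response_headers:
        for label, item in (("name", name), ("value", value)):
            if not isinstance(item, str):
                raise TypeError(
                    "WSGI response header %s %r is not of type str."
                    % (label, item))
            # PEP 3333: the str must be restricted to latin-1 code points.
            # encode() raises UnicodeEncodeError if it is not.
            item.encode("latin-1")


# A str value passes silently.
check_wsgi_headers([("Content-Length", "469")])

# An int value is rejected, as in the traceback above.
try:
    check_wsgi_headers([("Content-Length", 469)])
except TypeError as exc:
    print(exc)  # WSGI response header value 469 is not of type str.
```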

The fault I observed is not strictly speaking caused by a bug in Python lib "server.py". Rather, it is a component interaction failure caused by inadequately defined semantics. The interaction between web.py and server.py is quite complex, and no component is faulty when considered alone.

Let me explain:

Response and header management in server.py is handled by three methods of class BaseHTTPRequestHandler:
- send_response : puts response in buffer
- send_header : converts to string and adds to buffer
    ("%s: %s\r\n" % (keyword, value)).encode('latin-1', 'strict'))
- end_headers : flushes buffer to socket

This implementation is correct even if send_header is called with an
int value.
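A small sketch of why an int value is harmless here: the formatting expression quoted in the list above uses "%s", which stringifies any value before encoding. The wrapper function format_header is a hypothetical name used only for this demonstration.

```python
def format_header(keyword, value):
    # Same formatting expression as the one quoted from send_header above:
    # "%s" converts the value to text, whatever its type.
    return ("%s: %s\r\n" % (keyword, value)).encode('latin-1', 'strict')


print(format_header("Content-Length", 469))    # b'Content-Length: 469\r\n'
print(format_header("Content-Length", "469"))  # b'Content-Length: 469\r\n'
```

Both calls produce the same bytes, so the stock implementation never exposes the int to anything downstream.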

Now, web.py's application.py defines a "wsgi(env, start_resp)" function, which gets plugged into the CherryPy WSGI HTTP server.

The server is an instance of class wsgiserver.CherryPyWSGIServer created in httpserver.py:169 (digging deeper, actually at line 195).
This server is implemented as an HTTPServer configured to use gateways of class WSGIGateway_10 to handle requests.

A gateway is basically an instance of a class that is initialized with an HTTPRequest instance and has a "respond" method. Of course, the WSGIGateway implements "respond" as described in the WSGI standard: it calls the WSGI-compliant web app, which is a function(environ, start_response(status, headers)) returning an iterator (for chunked HTTP responses). The start_response function provided by class WSGIGateway is where the failure occurs.

When the application calls web.py's app.run(), the function runwsgi in web.py's wsgi.py gets called. This function determines whether it receives requests via CGI or directly. In my case, it starts an HTTP server using web.py's runsimple function (file httpserver.py:158).

This function never returns, and runs the CherryPyWSGIServer, but it first wraps the wsgi function in two WSGI middleware callables. Both are defined in web.py's httpserver.py file. The interesting one is StaticMiddleWare (line 281). Its role is to hijack URLs starting with /static, as is the case with my missing CSS file. In order to serve those static resources quickly, its implementation uses StaticApp (a WSGI function serving static stuff, defined at line 225), which extends Python's SimpleHTTPRequestHandler. That's where the two libraries connect.

StaticApp changes the way headers are processed by overriding send_response, send_header and end_headers. This means that when StaticApp calls SimpleHTTPRequestHandler.send_head() to send the HEAD part of the response, the headers are managed by the overridden methods. When send_head() finds out that my CSS file does not exist and calls send_error(), a Content-Length header gets written, but it is not converted to a string, because the overridden implementation just stores the header name and value in a list as they come.

When StaticApp has finished gathering headers using Python's send_head(), it immediately calls the start_response function provided by WSGIGateway, where the failure occurs.
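The interaction described above can be reproduced with a simplified model. Both classes below are hypothetical stand-ins written for this illustration: BaseHandlerModel imitates only the relevant behavior of BaseHTTPRequestHandler before the fix, and CollectingHandlerModel imitates a subclass such as StaticApp that stores headers as they come. Neither is the real code of server.py or web.py.

```python
class BaseHandlerModel:
    """Hypothetical stand-in for the relevant parts of BaseHTTPRequestHandler."""

    def __init__(self):
        self._buffer = []

    def send_header(self, keyword, value):
        # Stock behavior: "%s" stringifies the value before buffering.
        self._buffer.append(
            ("%s: %s\r\n" % (keyword, value)).encode("latin-1", "strict"))

    def send_error_headers(self, body):
        # Mirrors the pre-fix call in send_error: the value is an int.
        self.send_header("Content-Length", int(len(body)))


class CollectingHandlerModel(BaseHandlerModel):
    """Hypothetical stand-in for a subclass that stores headers unconverted."""

    def __init__(self):
        super().__init__()
        self.headers = []

    def send_header(self, keyword, value):
        # The override bypasses the "%s" formatting, so the int survives.
        self.headers.append((keyword, value))


base = BaseHandlerModel()
base.send_error_headers(b"x" * 469)
print(base._buffer)   # [b'Content-Length: 469\r\n']

app = CollectingHandlerModel()
app.send_error_headers(b"x" * 469)
print(app.headers)    # [('Content-Length', 469)]
```

In this model, the int only becomes visible once send_header is overridden, which matches the observation that neither component fails when considered alone.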

The bug in Python is not strictly that send_header gets called with an int in send_error. Rather, it is a documentation bug: the documentation fails to mention that send_header/end_headers MUST CONVERT TO STRING and ENCODE IN LATIN-1.

Therefore, the correction I proposed is still invalid, because even after the correction, the combination of web.py and server.py still does not properly encode the headers.

In conclusion, I would say that:
- In the Python library, the bug is a documentation bug: the documentation fails to indicate that send_header and/or end_headers can receive header names or values which are not strings and not encoded in strict latin-1, and that it is their responsibility to convert and encode them.
- In web.py, the bug is that the implementation of the overridden methods fails to properly encode the headers.

Of course, changing int to str does no harm and makes everything more resilient, but does not fix the underlying bug.
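For illustration only, a code overriding send_header could normalize the collected headers before handing them to start_response. The sketch below is an assumption about what such a step might look like; normalize_headers is a hypothetical helper, not a function of web.py or of the standard library, and it assumes header names and values are str or int.

```python
def normalize_headers(headers):
    """Return headers as (str, str) pairs restricted to latin-1.

    Hypothetical helper; assumes names and values are str or int.
    """
    normalized = []
    for name, value in headers:
        name, value = str(name), str(value)
        # Raise UnicodeEncodeError early if a header cannot be
        # represented in latin-1, as PEP 3333 requires.
        name.encode("latin-1", "strict")
        value.encode("latin-1", "strict")
        normalized.append((name, value))
    return normalized


print(normalize_headers([
    ("Content-Type", "text/html;charset=utf-8"),
    ("Content-Length", 469),
]))
# [('Content-Type', 'text/html;charset=utf-8'), ('Content-Length', '469')]
```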
msg319770 - (view) Author: Valeriya Sinevich (valer) * Date: 2018-06-16 18:17
Hello!

I created a PR for this, but I am new to the process, so I don't know what to do about the "no news entry" error. Could someone please help me with the next steps?
msg319773 - (view) Author: Karthikeyan Singaravelan (xtreak) * (Python triager) Date: 2018-06-16 18:36
Congratulations! All the changes and patches are collected in the Misc/NEWS.d directory. You can find more information and the process to create an entry here: https://devguide.python.org/committing/?highlight=blurb#what-s-new-and-news-entries .

Happy hacking :)
msg319775 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2018-06-16 18:41
For the NEWS entry, I'd suggest putting "Library" for the category and using your commit message for the text at the end.

You should also fill out the CLA form posted in your PR by the bot. While we *can* overlook it for very simple changes, best to get it done anyway.

As to Jean-Marc's comment, what you are describing is basically a bug in a third-party library and should be reported to them. As far as I can tell, http.server handles the encoding just fine, and if the web package has overridden part of it then it needs to override the rest in order to send headers properly. We can be less surprising to subclasses by only sending str, but if they're handling the conversion to bytes we can't really help any more than that.
msg319915 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2018-06-18 21:18
New changeset b36b0a3765bcacb4dcdbf12060e9e99711855da8 by Steve Dower (ValeriyaSinevich) in branch 'master':
bpo-33663: Convert content length to string before putting to header (GH-7754)
https://github.com/python/cpython/commit/b36b0a3765bcacb4dcdbf12060e9e99711855da8
msg319916 - (view) Author: miss-islington (miss-islington) Date: 2018-06-18 21:38
New changeset 53d1e9fad316e1404535157fe21cab8919f707c9 by Miss Islington (bot) in branch '3.7':
bpo-33663: Convert content length to string before putting to header (GH-7754)
https://github.com/python/cpython/commit/53d1e9fad316e1404535157fe21cab8919f707c9
msg319917 - (view) Author: miss-islington (miss-islington) Date: 2018-06-18 21:41
New changeset 8c19a44b63033e9c70e9b552476baecb6e3e9451 by Miss Islington (bot) in branch '3.6':
bpo-33663: Convert content length to string before putting to header (GH-7754)
https://github.com/python/cpython/commit/8c19a44b63033e9c70e9b552476baecb6e3e9451
msg319973 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2018-06-19 14:45
Congratulations on your first contribution, Valeriya! Let us know when you find something else you'd like to work on (you can add my name to the "Nosy List" on the bug and it'll get my attention).
History
Date User Action Args
2018-06-19 14:45:10 steve.dower set status: open -> closed; resolution: fixed; messages: + msg319973; stage: patch review -> resolved
2018-06-18 21:41:10 miss-islington set messages: + msg319917
2018-06-18 21:38:00 miss-islington set nosy: + miss-islington; messages: + msg319916
2018-06-18 21:20:04 miss-islington set pull_requests: + pull_request7388
2018-06-18 21:19:19 miss-islington set pull_requests: + pull_request7387
2018-06-18 21:18:03 steve.dower set messages: + msg319915
2018-06-16 18:41:42 steve.dower set nosy: + steve.dower, - xtreak; messages: + msg319775
2018-06-16 18:36:06 xtreak set nosy: + xtreak; messages: + msg319773
2018-06-16 18:17:15 valer set nosy: + valer; messages: + msg319770
2018-06-16 18:14:21 valer set keywords: + patch; stage: patch review; pull_requests: + pull_request7361
2018-06-02 15:10:40 jmlp set messages: + msg318500
2018-06-02 04:45:36 serhiy.storchaka set nosy: + serhiy.storchaka, pitrou; messages: + msg318489
2018-06-01 19:02:27 ned.deily set keywords: + easy; nosy: + orsenthil; versions: + Python 3.7, Python 3.8
2018-05-27 23:03:22 jmlp create