classification
Title: wsgi.simple_server's wsgi.input read/readline waits forever in certain circumstances
Type: behavior
Stage:
Components: Library (Lib)
Versions: Python 3.3, Python 3.4, Python 3.5, Python 2.7
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: X-Istence, buzdelabuz2, pje, rschoon, tzickel
Priority: normal
Keywords:

Created on 2014-06-28 06:35 by rschoon, last changed 2017-06-02 20:21 by buzdelabuz2.

Messages (5)
msg221774 - Author: Robin Schoonover (rschoon) * Date: 2014-06-28 06:35
In the reference WSGI server in wsgiref.simple_server, wsgi.input's readline() hangs if the request body does not contain any newlines.

Consider the following (slightly silly) example:

    from wsgiref.simple_server import make_server

    def app(environ, start_response):
        result = environ['wsgi.input'].readline()

        # not reached...
        start_response("200 OK", [("Content-Type", "text/plain")])
        return []

    httpd = make_server('', 8000, app)
    httpd.serve_forever()

And the following equally silly request (the data kwarg makes it a
POST request):

    from urllib.request import urlopen

    req = urlopen("http://localhost:8000/", data=b'some bytes')
    print(req)

Normally this isn't a problem, as the reference server isn't intended
for production, and typically the only reason .readline() would be
used is with a request body formatted as multipart/form-data, which
contains ample newlines, including in the content boundaries.  However,
other types of request bodies (such as application/x-www-form-urlencoded)
often contain no newlines, and using .readline() on them waits forever for new input.
msg221814 - Author: Robin Schoonover (rschoon) * Date: 2014-06-28 19:09
Issue also occurs if .read() is used with no size.
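
A minimal sketch of an application-side workaround, assuming the client sends a Content-Length header (which, as noted later in this thread, is not always the case). The helper name read_request_body is hypothetical and not part of wsgiref. It passes an explicit size to .read() so the call returns once the declared body has arrived, instead of waiting for more input:

```python
def read_request_body(environ):
    """Read exactly CONTENT_LENGTH bytes from wsgi.input.

    Hypothetical helper: returns b"" when the header is missing,
    empty, or not a valid non-negative integer.
    """
    try:
        length = int(environ.get("CONTENT_LENGTH") or 0)
    except ValueError:
        length = 0
    if length <= 0:
        return b""
    # A sized read returns once `length` bytes have been received,
    # so it does not wait for the client to close the connection.
    return environ["wsgi.input"].read(length)


def app(environ, start_response):
    body = read_request_body(environ)
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]
```

With the POST request from the original report, app would echo b'some bytes' back instead of hanging.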
msg260528 - Author: (tzickel) * Date: 2016-02-19 18:58
Just encountered this issue as well.

It's not related to newlines, but to the server not supporting HTTP persistent connections (wsgi.input is the socket's I/O object directly, so if the client uses a persistent connection, .read() will block forever).

A simple solution is to use a saner WSGI server (gevent works nicely).
Here is their implementation of the socket I/O wrapper class (Input), and its read/readlines functions:
https://github.com/gevent/gevent/blob/a65501a1270c1763e9de336a9c3cf52081223ff6/gevent/pywsgi.py#L303
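
The general idea of such a wrapper can be sketched as follows. This is not gevent's Input class and not anything wsgiref provides; LimitedInput is a hypothetical name, and the sketch assumes the server has already parsed a valid Content-Length. It simply refuses to read past the declared body length, so unsized read() and readline() calls return instead of waiting on a connection that stays open:

```python
class LimitedInput:
    """Hypothetical wsgi.input wrapper that never reads past content_length."""

    def __init__(self, stream, content_length):
        self._stream = stream
        self._remaining = max(content_length, 0)

    def _clamp(self, size):
        # Treat None or a negative size as "everything that is left".
        if size is None or size < 0 or size > self._remaining:
            return self._remaining
        return size

    def read(self, size=-1):
        size = self._clamp(size)
        if size == 0:
            return b""
        data = self._stream.read(size)
        self._remaining -= len(data)
        return data

    def readline(self, size=-1):
        size = self._clamp(size)
        if size == 0:
            return b""
        # Stops at a newline or after `size` bytes, whichever comes first.
        line = self._stream.readline(size)
        self._remaining -= len(line)
        return line

    def __iter__(self):
        while True:
            line = self.readline()
            if not line:
                return
            yield line
```

A server could wrap its request stream this way before placing it in environ['wsgi.input'], leaving any following request on the same connection unread.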
msg277191 - Author: Bert JW Regeer (X-Istence) * Date: 2016-09-22 05:52
This is still very much an issue. It makes it more difficult to write generic Python request/response libraries, because we can't assume that a read() will return, and relying on Content-Length being set is unfortunately not always possible.
msg295052 - Author: Dom Cote (buzdelabuz2) Date: 2017-06-02 20:21
Just bumped into this issue today using bobo.

The first attempt to load a page is OK, because there is something to read. But if you hit the "reload" button in the browser, for some reason it connects to the server a second time after the request is completed, but nothing is sent, so readline() never comes back.

However, the documentation says that if the underlying object is set as non-blocking, then it shouldn't block.

So I first inspected the timeout value on the request's socket, and it comes back as 0.0, which according to the socket docs should mean non-blocking. That's weird.

So I decided to go ahead and call the setblocking(False) on the socket anyway, and this time, readline() came back with no data. The rest took care of itself.
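
The socket-level timeout semantics mentioned here can be checked with a short sketch. Per the socket documentation, gettimeout() returns None for a blocking socket, 0.0 for a non-blocking one, and a positive float for a socket with a timeout; setblocking(False) is equivalent to settimeout(0.0). The helper names below are hypothetical, and the sketch only exercises the raw socket calls on a local socket pair. It does not model what the buffered rfile in wsgiref does on top of them:

```python
import socket


def blocking_mode(sock):
    """Classify a socket by the value gettimeout() returns."""
    timeout = sock.gettimeout()
    if timeout is None:
        return "blocking"
    if timeout == 0.0:
        return "non-blocking"
    return "timeout"


def try_recv(sock, size=1024):
    """Return received bytes, or None if a non-blocking recv has no data."""
    try:
        return sock.recv(size)
    except BlockingIOError:
        return None
```

On a non-blocking socket with nothing to read, try_recv returns None right away rather than waiting.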

Here are my debug traces, as well as a small patch to show the workaround.

Notice how the timeout comes back as 0.0 despite the fact that the socket will block.

Also, notice the second connection request being '' after putting in the fix.

==============

> <socket.socket fd=5, family=AddressFamily.AF_INET, type=2049, proto=0, laddr=('192.168.1.113', 8085), raddr=('192.168.1.6', 59194)> 0.0
7> b'GET / HTTP/1.1\r\n'
192.168.1.6 - - [02/Jun/2017 16:01:39] "GET / HTTP/1.1" 200 690
5> <socket.socket fd=5, family=AddressFamily.AF_INET, type=2049, proto=0, laddr=('192.168.1.113', 8085), raddr=('192.168.1.6', 59195)> 0.0
6> <socket.socket fd=5, family=AddressFamily.AF_INET, type=2049, proto=0, laddr=('192.168.1.113', 8085), raddr=('192.168.1.6', 59195)> 0.0
7> b''

diff --git a/simple_server.py b/simple_server.py
index 7fddbe8..3df4ffa 100644
--- a/simple_server.py
+++ b/simple_server.py
@@ -115,9 +115,13 @@ class WSGIRequestHandler(BaseHTTPRequestHandler):

     def handle(self):
         """Handle a single HTTP request"""
-
+        print("5>",self.connection,self.connection.gettimeout())
+        self.connection.setblocking(False)
         self.raw_requestline = self.rfile.readline(65537)
-        if len(self.raw_requestline) > 65536:
+        print("6>",self.connection,self.connection.gettimeout())
+        print("7>",str(self.raw_requestline))
+
+        if False and len(self.raw_requestline) > 65536:
             self.requestline = ''
             self.request_version = ''
             self.command = ''
History
Date                 User           Action  Args
2017-06-02 20:21:09  buzdelabuz2    set     nosy: + buzdelabuz2; messages: + msg295052
2016-09-22 05:52:39  X-Istence      set     nosy: + X-Istence; messages: + msg277191; versions: + Python 2.7, Python 3.3, Python 3.4, Python 3.5
2016-02-19 18:58:06  tzickel        set     nosy: + tzickel; messages: + msg260528
2014-06-30 00:03:50  berker.peksag  set     nosy: + pje
2014-06-28 19:09:36  rschoon        set     messages: + msg221814; title: wsgi.simple_server's wsgi.input readline waits forever for non-multipart/form-data -> wsgi.simple_server's wsgi.input read/readline waits forever in certain circumstances
2014-06-28 06:35:01  rschoon        create