
Many servers (wsgiref, http.server, etc) can truncate large output blobs #68479

Closed
JonathanKamens mannequin opened this issue May 26, 2015 · 30 comments
Labels
stdlib Python modules in the Lib dir topic-IO type-bug An unexpected behavior, bug, or error

Comments


JonathanKamens mannequin commented May 26, 2015

BPO 24291
Nosy @pjeby, @pitrou, @benjaminp, @tarekziade, @vadmium, @serhiy-storchaka, @jamadden
Files
  • wsgi-partial.py: Demo
  • wfile-partial.patch
  • wfile-partial.v2.patch
  • wfile-partial.v3.patch
  • wsgiref-only.patch: wsgiref fix only
  • wsgiref-only.v2.patch
  • wfile-partial.v4.patch
  • wfile-partial.v5.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = <Date 2017-10-27.22:16:08.493>
    created_at = <Date 2015-05-26.20:18:01.494>
    labels = ['type-bug', 'library', 'expert-IO']
    title = 'Many servers (wsgiref, http.server, etc) can truncate large output blobs'
    updated_at = <Date 2017-10-27.22:16:08.491>
    user = 'https://bugs.python.org/JonathanKamens'

    bugs.python.org fields:

    activity = <Date 2017-10-27.22:16:08.491>
    actor = 'martin.panter'
    assignee = 'none'
    closed = True
    closed_date = <Date 2017-10-27.22:16:08.493>
    closer = 'martin.panter'
    components = ['Library (Lib)', 'IO']
    creation = <Date 2015-05-26.20:18:01.494>
    creator = 'Jonathan Kamens'
    dependencies = []
    files = ['42411', '42412', '42427', '42870', '43066', '43143', '43233', '43840']
    hgrepos = []
    issue_num = 24291
    keywords = ['patch']
    message_count = 30.0
    messages = ['244131', '244138', '244489', '244505', '244506', '244550', '260844', '262830', '262831', '262833', '262834', '262835', '262836', '262837', '263079', '263085', '263087', '263163', '263789', '265528', '265615', '265691', '266750', '266926', '266986', '267097', '267393', '267406', '271066', '305140']
    nosy_count = 12.0
    nosy_names = ['pje', 'pitrou', 'benjamin.peterson', 'tarek', 'stutzbach', 'python-dev', 'martin.panter', 'serhiy.storchaka', 'jmadden', 'Jonathan Kamens', 'marcjofre', 'Paolo Veglia']
    pr_nums = []
    priority = 'normal'
    resolution = 'out of date'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue24291'
    versions = ['Python 3.5']


    JonathanKamens mannequin commented May 26, 2015

    The _write method of wsgiref.handlers.SimpleHandler reads as follows:

        def _write(self,data):
            self.stdout.write(data)

    The problem here is that calling write() on a socket is not actually guaranteed to write all of the data in the buffer. If the data is large enough, the kernel will accept only part of it; in that case, the rest of the response data is silently discarded by wsgiref.

    _write needs to check the return value of self.stdout.write(data), and if it is less than the length of data, repeat the write starting from the unwritten remainder of the buffer, until all the data has been written.
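    A minimal sketch of such a retry loop (write_all and PartialWriter are hypothetical names; PartialWriter stands in for an unbuffered socket file whose write() may accept only part of the buffer):

```python
import io

def write_all(stream, data):
    """Call stream.write() repeatedly until the whole buffer is accepted."""
    view = memoryview(data)
    sent = 0
    while sent < len(view):
        written = stream.write(view[sent:])
        if written is None:
            # Writer reported no count; assume it handled everything.
            return len(view)
        sent += written
    return sent

class PartialWriter:
    """Test double: accepts at most 4 bytes per write() call."""
    def __init__(self):
        self.buffer = io.BytesIO()
    def write(self, b):
        chunk = bytes(b)[:4]
        self.buffer.write(chunk)
        return len(chunk)

w = PartialWriter()
write_all(w, b"hello world")  # every byte arrives despite partial writes
```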

    @JonathanKamens JonathanKamens mannequin added stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels May 26, 2015

    vadmium commented May 26, 2015

    [UTF-8 error workaround]

    What kind of object is “stdout”? Plain Python socket objects don’t have a write() method.

    Perhaps “stdout” is meant to implement the BufferedIOBase.write() interface, which guarantees everything is written, even if it takes multiple raw calls. Wrapping your output stream in BufferedWriter might be a solution.
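    That suggestion can be demonstrated with a hypothetical raw stream (PartialRawWriter below) whose write() accepts only a few bytes at a time; io.BufferedWriter drives it in a loop, so no data is lost:

```python
import io

class PartialRawWriter(io.RawIOBase):
    """Hypothetical raw stream mimicking a socket file:
    each write() call accepts at most 4 bytes."""
    def __init__(self):
        self.received = bytearray()
    def writable(self):
        return True
    def write(self, b):
        chunk = bytes(b)[:4]
        self.received += chunk
        return len(chunk)

raw = PartialRawWriter()
stream = io.BufferedWriter(raw)
stream.write(b"a fairly long response body")
stream.flush()  # flush() loops over raw.write() until everything is out
```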


    marcjofre mannequin commented May 30, 2015

    When executing, wsgiref produces the following error dump:

    D:\SixSensoCytometerWebBased\include\bin>D:\SixSensoCytometerWebBased\include\bin\manage.exe runserver 8765
    Validating models...

    0 errors found
    May 26, 2015 - 16:20:56
    Django version 1.5.8, using settings 'settings'
    Development server is running at http://127.0.0.1:8765/
    Quit the server with CTRL-BREAK.
    [26/May/2015 16:20:56] "GET / HTTP/1.1" 200 29661
    100
    [26/May/2015 16:20:59] "POST /login HTTP/1.1" 200 25980
    [26/May/2015 16:20:59] "GET /static/js/ready.js HTTP/1.1" 500 79563
    Traceback (most recent call last):
      File "D:\WinPython-64bit-2.7.6.3\python-2.7.6.amd64\Lib\wsgiref\handlers.py", line 86, in run
        self.finish_response()
      File "D:\WinPython-64bit-2.7.6.3\python-2.7.6.amd64\Lib\wsgiref\handlers.py", line 128, in finish_response
        self.write(data)
      File "D:\WinPython-64bit-2.7.6.3\python-2.7.6.amd64\Lib\wsgiref\handlers.py", line 217, in write
        self._write(data)
      File "D:\WinPython-64bit-2.7.6.3\python-2.7.6.amd64\lib\socket.py", line 324, in write
        self.flush()
    [26/May/2015 16:20:59] "GET /static/js/ClasePrincipalAutobahn.js HTTP/1.1" 500 79818
    [26/May/2015 16:20:59] "GET /static/img/initialGUIimage.png HTTP/1.1" 500 79777
      File "D:\WinPython-64bit-2.7.6.3\python-2.7.6.amd64\lib\socket.py", line 303, in flush
        self._sock.sendall(view[write_offset:write_offset+buffer_size])
    error: [Errno 10053] An established connection was aborted by the software in your host machine


    vadmium commented May 31, 2015

    Marc, if you think your connection aborted error is related, can you say what kind of object the “stdout” parameter is in the handler class? In other words, what class’s method is being called in the write() function at line 217?


    vadmium commented May 31, 2015

    Actually, looking closer at Marc’s traceback, it appears to be the ordinary socket._fileobject class. So I suspect it is unrelated to this bug because the Python 2 socket code appears to call sendall() in a loop.


    marcjofre mannequin commented May 31, 2015

    Hi all,

    I did more tests, identifying that the main issue I have resides in another script (multiprocessing), when using nuitka compiler.

    I will get back to you with more info when I solve the multiprocessing issue.

    The command prompt dump is:

    data: HTTP/1.0 200 OK
    self.stdout: <socket._fileobject object at 0x00000000042954F8>
    data: HTTP/1.0 200 OK
    self.stdout: <socket._fileobject object at 0x00000000042954F8>
    data: HTTP/1.0 200 OK
    self.stdout: <socket._fileobject object at 0x00000000042954F8>
    data: HTTP/1.0 200 OK
    self.stdout: <socket._fileobject object at 0x0000000004291A20>
    data: HTTP/1.0 200 OK
    self.stdout: <socket._fileobject object at 0x0000000004291B10>
    data: HTTP/1.0 200 OK
    self.stdout: <socket._fileobject object at 0x00000000042917C8>
    data: HTTP/1.0 200 OK
    self.stdout: <socket._fileobject object at 0x0000000004291930>
    data: HTTP/1.0 200 OK
    self.stdout: <socket._fileobject object at 0x00000000042954F8>
    data: HTTP/1.0 200 OK
    self.stdout: <socket._fileobject object at 0x00000000042954F8>
    data: HTTP/1.0 200 OK
    self.stdout: <socket._fileobject object at 0x00000000042919A8>
    data: HTTP/1.0 200 OK
    self.stdout: <socket._fileobject object at 0x0000000004291930>
    data: HTTP/1.0 200 OK
    self.stdout: <socket._fileobject object at 0x00000000042954F8>
    data: HTTP/1.0 500 INTERNAL SERVER ERROR
    self.stdout: <socket._fileobject object at 0x00000000042954F8>


    PaoloVeglia mannequin commented Feb 25, 2016

    I had the same issue as Jonathan with Python 3.5.1, big chunks of data get truncated (to ~85KB).

    The problem disappears when splitting original data into smaller chunks:

        def _write(self,data):
            chunk_size = 1024
            for chunk in (data[i:i+chunk_size] for i in range(0, len(data), chunk_size)):
                self.stdout.write(chunk)


    jamadden mannequin commented Apr 3, 2016

    gevent has another simple reproducer for this. I believe the fault is not gevent's but the standard library's: SimpleHandler._write needs to loop, accumulating sent += self.stdout.write(data), until sent == len(data).

    I have written up more on this at gevent/gevent#778 (comment). For convenience I'll reproduce the bulk of that comment here:

    The user supplied a django application that produced a very large response that was getting truncated when using gevent under Python 3.4. (I believe gevent's non-blocking sockets are simply running into a different buffering behaviour, making it more likely to be hit under those conditions simply because they are faster).

    This looks like a bug in the standard library's wsgiref implementation.

    I tracked this down to a call to socket.send(data). This method only sends whatever portion of the data it is possible to send at the time, and it returns the count of the data that was sent. The caller of socket.send() is responsible for looping to make sure the full len of the data is sent. This is clearly documented.
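    The looping contract can be shown without a real socket; send_all and FlakySocket below are hypothetical names, with FlakySocket modelling a socket whose send() transmits only part of the buffer (this loop is essentially what socket.sendall() does for you):

```python
def send_all(sock, data):
    """Loop until every byte has been handed to the kernel, as the
    socket.send() documentation requires of callers."""
    view = memoryview(data)
    while len(view):
        sent = sock.send(view)
        view = view[sent:]

class FlakySocket:
    """Test double: send() accepts at most 5 bytes per call."""
    def __init__(self):
        self.wire = bytearray()
    def send(self, b):
        chunk = bytes(b)[:5]
        self.wire += chunk
        return len(chunk)

s = FlakySocket()
send_all(s, b"x" * 23)  # arrives in 5-byte pieces, but completely
```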

    In this case, there is a call to send trying to send the full response, but only a portion of it is able to be immediately written. Here's a transcript of the first request (I modified gevent's socket.send method to print how much data is actually sent each time):

    Django version 1.9.5, using settings 'gdc.settings'
    Starting development server at http://127.0.0.1:8000/
    Quit the server with CONTROL-C.
    SENDING 17
    SENT 17 OF 17
    SENDING 37
    SENT 37 OF 37
    SENDING 38
    SENT 38 OF 38
    SENDING 71
    SENT 71 OF 71
    SENDING 1757905
    SENT 555444 OF 1757905
    [03/Apr/2016 19:48:31] "GET / HTTP/1.1" 200 1757905
    

    Note that there's no retry on the short send.

    Here's the stack trace for that short send; we can clearly see that there is no retry loop in place:

      //3.4/lib/python3.4/wsgiref/handlers.py(138)run()
        136             self.setup_environ()
        137             self.result = application(self.environ, self.start_response)
    --> 138             self.finish_response()
        139         except:
        140             try:
    
      //3.4/lib/python3.4/wsgiref/handlers.py(180)finish_response()
        178             if not self.result_is_file() or not self.sendfile():
        179                 for data in self.result:
    --> 180                     self.write(data)
        181                 self.finish_content()
        182         finally:
    
      //3.4/lib/python3.4/wsgiref/handlers.py(279)write()
        277
        278         # XXX check Content-Length and truncate if too many bytes written?
    --> 279         self._write(data)
        280         self._flush()
        281
    
      //3.4/lib/python3.4/wsgiref/handlers.py(453)_write()
        451
        452     def _write(self,data):
    --> 453         self.stdout.write(data)
        454
        455     def _flush(self):
    
      //3.4/lib/python3.4/socket.py(398)write()
        396         self._checkWritable()
        397         try:
    --> 398             return self._sock.send(b)
        399         except error as e:
        400             # XXX what about EINTR?
    
    > //gevent/_socket3.py(384)send()
        382                 from IPython.core.debugger import Tracer; Tracer()() ## DEBUG ##
        383
    --> 384             return count
    

    self.stdout is an instance of socket.SocketIO (which is returned from socket.makefile). This class is not covered in the online documentation, but its docstring also clearly states that callers of write() should loop to make sure all data gets sent.


    jamadden mannequin commented Apr 3, 2016

    Django uses a wsgiref.simple_server to serve requests, which in turn uses socketserver.StreamRequestHandler to implement its WSGIRequestHandler. That base class explicitly turns off buffering for writes (wbufsize = 0 is the class default which gets passed to socket.makefile). So that explains how there's no BufferedWriter wrapped around the SocketIO instance.


    vadmium commented Apr 3, 2016

    Okay now I see the conflict. The use of WSGIRequestHandler with wbufsize = 0 was the missing key. I see two possible solutions:

    1. Change SimpleHandler._write() to allow self.stdout to be a RawIOBase writer, and loop over stdout.write() until all the data is written. This seems to be what most people are suggesting here. But it requires that stdout.write() returns the number of bytes written, so could be a compatibility problem.

    2. Document that the SimpleHandler(stdout=...) parameter should be a BufferedIOBase writer, and fix WSGIRequestHandler to pass in a BufferedWriter (by overriding the wbufsize attribute with -1 or io.DEFAULT_BUFFER_SIZE).

    I am not super familiar with the wsgiref package, but the second option seems preferable to me, and more in line with my understanding of how it was intended to work.
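    The second option could be sketched as follows (BufferedWSGIRequestHandler is a hypothetical name; the real fix would change WSGIRequestHandler itself):

```python
import io
from wsgiref.simple_server import WSGIRequestHandler

class BufferedWSGIRequestHandler(WSGIRequestHandler):
    # socketserver passes wbufsize to socket.makefile(); the inherited
    # default of 0 yields an unbuffered SocketIO whose write() may be
    # partial, while a positive size yields a BufferedWriter whose
    # write() guarantees completeness.
    wbufsize = io.DEFAULT_BUFFER_SIZE
```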


    jamadden mannequin commented Apr 3, 2016

    Is there an expected self.stdout implementation that doesn't return the number of bytes it writes? sys.stdout does, as does io.RawIOBase. It doesn't seem clear to me that practically speaking there's a compatibility problem with requiring that for self.stdout.


    jamadden mannequin commented Apr 3, 2016

    self.stdin and self.stderr are documented to be wsgi.input and wsgi.errors, which are both described as "file-like" objects, meaning that the write method should return the number of bytes written. It seems like the same could reasonably be said of self.stdout, though it isn't strictly documented as such.

    The WSGI spec says that each chunk the application yields should be written immediately, with no buffering (https://www.python.org/dev/peps/pep-3333/#buffering-and-streaming), so I don't think having the default output stream be buffered would be in compliance.

    If there is a compatibility problem, writing the loop this way could bypass it (untested):

        def _write(self, data):
            written = self.stdout.write(data)
            while written is not None and written < len(data):
                written += self.stdout.write(data[written:])


    vadmium commented Apr 3, 2016

    My worry was that it is easy to make a write() method that does not return anything, but is still useful in most cases. Since BufferedIOBase.write() has to guarantee to write everything, it may not seem important to return a value. But we could explicitly check for None as you suggested.

    In the BaseHandler class, each chunk yielded by the application is passed to BaseHandler.write(). But that method calls self._flush(), which should avoid any buffering problem. SimpleHandler._flush() implements this by calling self.stdout.flush().
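    An explicit None check along those lines might look like this (write_with_none_check and ChunkyStream are illustrative names, not proposed API):

```python
def write_with_none_check(stream, data):
    """Retry loop that treats a write() returning None as having
    written everything, for backwards compatibility."""
    written = stream.write(data)
    if written is None:
        return len(data)
    while written < len(data):
        more = stream.write(data[written:])
        if more is None:  # writer stopped reporting a count
            return len(data)
        written += more
    return written

class ChunkyStream:
    """Test double whose write() accepts at most 3 bytes per call."""
    def __init__(self):
        self.got = bytearray()
    def write(self, b):
        chunk = bytes(b)[:3]
        self.got += chunk
        return len(chunk)

cs = ChunkyStream()
write_with_none_check(cs, b"0123456789")
```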


    jamadden mannequin commented Apr 4, 2016

    I'm sorry, I'm still not following the argument that write is likely to return nothing. RawIOBase and BufferedIOBase both document that write must return the number of bytes written; if you don't return that, you break anything that assumes you do, as documented (both patterns for checking whether to keep looping raise TypeError: written = 0; written += None, and None < len(data)); and if you ignore the return value, you fail when using any IOBase object that isn't buffered (exactly this case).

    But you are definitely right, explicitly checking for None can be done. It adds a trivial amount of overhead, but this isn't a production server. The only cost is code readability.

    Good point about the explicit calls to flush(), I thought flush was a no-op in some streams, but that's only the case for streams where it doesn't already matter.


    vadmium commented Apr 9, 2016

    I suspect this bug does not affect Python 2. See my demo script wsgi-partial.py as evidence, which interrupts a 20 MB write with a signal.

    Jason: If you look at Python 2’s file.write() API <https://docs.python.org/2/library/stdtypes.html#file.write>, perhaps that will convince you that it is easy for someone to write a write() method that has no return value.

    Anyway, I think this bug is bigger than first thought. It affects most servers based on socketserver.StreamRequestHandler, not just the “wsgiref” one. In Python 3.5 I propose to patch the existing servers to avoid the possibility of truncated writes (patch forthcoming). In 3.6, I am thinking we should change StreamRequestHandler.wfile to prevent write() doing partial writes.


    vadmium commented Apr 9, 2016

    wfile-partial.patch tries to avoid partial writes in every relevant wfile.write() call I could find. Usually I do this by building a BufferedWriter around wfile.

    I propose to apply my patch to 3.5 (if people think it is reasonable). For 3.6 I opened bpo-26721 to change the class used for wfile instead.


    vadmium commented Apr 9, 2016

    I did not write a test case for the test suite. Doing so would probably involve too many non-trivial things, so I thought it wouldn’t be worth it:

    1. A background thread or process to run a client in
    2. Signal handling to trigger a partial write
    3. Some way to synchronize the threads and the signal (sleep, sigwait, etc)


    vadmium commented Apr 11, 2016

    In patch v2 I integrated a version of the test into test_wsgiref. I swapped the sleep() calls for a threading.Event object.

    Because my patch is such a large change, it would be good to get other people’s opinions or reviews.


    vadmium commented Apr 20, 2016

    Widening the scope of the title to attract reviewers :)

    @vadmium vadmium changed the title wsgiref.handlers.SimpleHandler truncates large output blobs Many servers (wsgiref, http.server, etc) can truncate large output blobs Apr 20, 2016

    vadmium commented May 14, 2016

    Since there are three different people reporting the problem with wsgiref, but no other reports for other server modules, I might commit the change to Lib/wsgiref/simple_server.py soon, and leave the other changes until they can get a wider review.

    @serhiy-storchaka

    SimpleHTTPRequestHandler uses shutil.copyfileobj(), and shutil.copyfileobj() expects that write() writes all data.
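    In outline, shutil.copyfileobj() is the following loop (a simplified sketch, not the real implementation); note that the value returned by fdst.write() is never inspected, so fdst must guarantee complete writes:

```python
import io

def copyfileobj_sketch(fsrc, fdst, length=16 * 1024):
    """Simplified model of shutil.copyfileobj(): read in chunks and
    write each chunk once, ignoring write()'s return value."""
    while True:
        buf = fsrc.read(length)
        if not buf:
            break
        fdst.write(buf)

src = io.BytesIO(b"payload " * 100)
dst = io.BytesIO()
copyfileobj_sketch(src, dst)  # lossless here only because BytesIO never does partial writes
```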


    vadmium commented May 16, 2016

    Thanks for your comments Serhiy. The main change in v3 is that I added a socketserver._write_exactly() function instead of the _BaseHTTPRequestHandler__write_exactly() method.


    vadmium commented May 31, 2016

    Here is a patch for just the wsgiref module, which I plan to commit in time for 3.5.2. I also updated the test case to avoid hanging due to a (theoretical) race with how signal handling works.

    @serhiy-storchaka

    It looks to me like there are two opposite ways to resolve this issue.

    1. Document the requirement for write() and raise an exception if write() does not return the size of the full data. But I think it would be more user-friendly to first emit a warning and continue calling write() until the full data is written.

    2. Don't document any requirement, and just silently handle partial writes.

    wsgiref-only.patch contains parts of both approaches: it documents the new requirement, but doesn't deprecate partial writing. Which way should be chosen?


    vadmium commented Jun 2, 2016

    I prefer your first solution because it seems to fit in better with how things were intended.

    I can add in handling of partial writes with a deprecation warning when I get a chance. I guess the documentation would be something like “Since 3.5.2, partial writes are also handled, but this is deprecated.”


    vadmium commented Jun 3, 2016

    In wsgiref-only.v2.patch I added deprecated support for continuing when a partial write is detected. I don’t think it needs documenting though.


    python-dev mannequin commented Jun 5, 2016

    New changeset a57703119f40 by Martin Panter in branch '3.5':
    Issue bpo-24291: Avoid WSGIRequestHandler doing partial writes
    https://hg.python.org/cpython/rev/a57703119f40

    New changeset d8f021944e0b by Martin Panter in branch 'default':
    Issue bpo-24291: Merge wsgi partial write fix from 3.5
    https://hg.python.org/cpython/rev/d8f021944e0b


    vadmium commented Jun 5, 2016

    Here is an updated patch with the remaining fixes for other servers.

    I considered changing the behaviour of shutil.copyfileobj() to handle partial writes better. But I decided against it, because: (a) nobody has complained yet, and (b) it could potentially break stuff, e.g. if fsrc.read() returned a strange kind of object.


    vadmium commented Jul 23, 2016

    The bug should no longer be a problem in 3.6, as long as my change for bpo-26721 survives.

    Here is an updated patch against 3.5.


    vadmium commented Oct 27, 2017

    Closing because I understand it is too late to do anything for 3.5 now.

    @vadmium vadmium closed this as completed Oct 27, 2017
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022