Many servers (wsgiref, http.server, etc.) can truncate large output blobs #68479
Comments
The _write method of wsgiref.handlers.SimpleHandler reads as follows:

```python
def _write(self, data):
    self.stdout.write(data)
```

The problem here is that calling write() on a socket is not actually guaranteed to write all of the data in the buffer. If the length of data is large enough, the kernel will take only part of it. In that case, the rest of the response data will be silently discarded by wsgiref. _write needs to check the return value of self.stdout.write(data), and if it is less than the length of data, repeat the write from the middle of the data buffer, and so on, until all the data has been written. |
What kind of object is “stdout”? Plain Python socket objects don’t have a write() method. Perhaps “stdout” is meant to implement the BufferedIOBase.write() interface, which guarantees everything is written, even if it takes multiple raw calls. Wrapping your output stream in a BufferedWriter might be a solution. |
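A sketch of that suggestion: io.BufferedWriter loops over the underlying raw stream until everything is written, even when each raw write() is short. The ShortWriteRaw class below is a hypothetical stand-in for a socket's raw stream:

```python
import io

class ShortWriteRaw(io.RawIOBase):
    """Toy raw stream that accepts at most 10 bytes per write() call,
    mimicking a socket that performs partial writes."""
    def __init__(self):
        self.received = bytearray()
    def writable(self):
        return True
    def write(self, b):
        chunk = bytes(b)[:10]      # accept only part of the buffer
        self.received += chunk
        return len(chunk)

raw = ShortWriteRaw()
stream = io.BufferedWriter(raw)
stream.write(b"A" * 100)
stream.flush()                     # BufferedWriter retries until done
assert len(raw.received) == 100
```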
When executing, wsgiref produces the following error dump (the development-server log lines were interleaved with the traceback; they are separated here for readability):

```
0 errors found
May 26, 2015 - 16:20:56
Django version 1.5.8, using settings 'settings'
Development server is running at http://127.0.0.1:8765/
Quit the server with CTRL-BREAK.
[26/May/2015 16:20:56] "GET / HTTP/1.1" 200 29661
100
[26/May/2015 16:20:59] "POST /login HTTP/1.1" 200 25980
[26/May/2015 16:20:59] "GET /static/js/ready.js HTTP/1.1" 500 79563
Traceback (most recent call last):
  File "D:\WinPython-64bit-2.7.6.3\python-2.7.6.amd64\Lib\wsgiref\handlers.py", line 86, in run
    self.finish_response()
  File "D:\WinPython-64bit-2.7.6.3\python-2.7.6.amd64\Lib\wsgiref\handlers.py", line 128, in finish_response
    self.write(data)
  File "D:\WinPython-64bit-2.7.6.3\python-2.7.6.amd64\Lib\wsgiref\handlers.py", line 217, in write
    self._write(data)
  File "D:\WinPython-64bit-2.7.6.3\python-2.7.6.amd64\lib\socket.py", line 324, in write
    self.flush()
  File "D:\WinPython-64bit-2.7.6.3\python-2.7.6.amd64\lib\socket.py", line 303, in flush
    self._sock.sendall(view[write_offset:write_offset+buffer_size])
error: [Errno 10053] An established connection was aborted by the software in your host machine
[26/May/2015 16:20:59] "GET /static/js/ClasePrincipalAutobahn.js HTTP/1.1" 500 79818
[26/May/2015 16:20:59] "GET /static/img/initialGUIimage.png HTTP/1.1" 500 79777
```
|
Marc, if you think your connection aborted error is related, can you say what kind of object the “stdout” parameter is in the handler class? In other words, what class’s method is being called in the write() function at line 217? |
Actually, looking closer at Marc’s traceback, it appears to be the ordinary socket._fileobject class. So I suspect it is unrelated to this bug because the Python 2 socket code appears to call sendall() in a loop. |
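The distinction matters: socket.send() may transmit only part of a large buffer and returns the byte count, while sendall() loops internally until everything is sent (or an error occurs). A small sketch of the contract, using a socketpair (not from the thread):

```python
import socket

a, b = socket.socketpair()
a.sendall(b"x" * 1000)        # complete by contract: loops over short sends
data = bytearray()
while len(data) < 1000:
    data += b.recv(4096)      # drain everything the peer sent
assert bytes(data) == b"x" * 1000
a.close()
b.close()
```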
Hi all, I did more tests and found that the main issue I have resides in another script (multiprocessing), when using the Nuitka compiler. I will get back to you with more info when I solve the multiprocessing issue. The command prompt dump is: data: HTTP/1.0 200 OK |
I had the same issue as Jonathan with Python 3.5.1: big chunks of data get truncated (to ~85 KB). The problem disappears when splitting the original data into smaller chunks:

```python
def _write(self, data):
    chunk_size = 1024
    for chunk in (data[i:i + chunk_size] for i in range(0, len(data), chunk_size)):
        self.stdout.write(chunk)
```
|
gevent has another simple reproducer for this. I do believe it's not gevent's fault; the fault is in the standard library: SimpleHandler._write needs to loop until all the data has been written. I have written up more on this at gevent/gevent#778 (comment). For convenience I'll reproduce the bulk of that comment here: The user supplied a Django application that produced a very large response that was getting truncated when using gevent under Python 3.4. (I believe gevent's non-blocking sockets are simply running into a different buffering behaviour, making it more likely to be hit under those conditions simply because they are faster.) This looks like a bug in the standard library's SimpleHandler._write. I tracked this down to the call to self.stdout.write(data), whose return value (the number of bytes actually written) is ignored. Note that there's no retry on the short send. Here's the stack trace for that short send; we can clearly see that there is no retry loop in place:
|
Django uses a wsgiref.simple_server.WSGIRequestHandler, which inherits wbufsize = 0 from socketserver.StreamRequestHandler, so its wfile is unbuffered. |
Okay, now I see the conflict. The use of WSGIRequestHandler with wbufsize = 0 was the missing key. I see two possible solutions:
I am not super familiar with the wsgiref package, but the second option seems preferable to me and more in line with my understanding of how it was intended to work. |
Is there an expected |
The WSGI spec says that each chunk the application yields should be written immediately, with no buffering (https://www.python.org/dev/peps/pep-3333/#buffering-and-streaming), so I don't think having the default output stream be buffered would be in compliance. If there is a compatibility problem, writing the loop this way could bypass it (untested):

```python
def _write(self, data):
    written = self.stdout.write(data)
    while written is not None and written < len(data):
        written += self.stdout.write(data[written:])
```
|
My worry was that it is easy to make a write() method that does not return anything, but is still useful in most cases. Since BufferedIOBase.write() has to guarantee to write everything, it may not seem important to return a value. But we could explicitly check for None as you suggested. In the BaseHandler class, each chunk yielded by the application is passed to BaseHandler.write(). But that method calls self._flush(), which should avoid any buffering problem. SimpleHandler._flush() implements this by calling self.stdout.flush(). |
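A condensed paraphrase of the flow just described — each application chunk goes through write(), which calls _write() and then _flush(). This is a hypothetical simplification for illustration, not the wsgiref source:

```python
import io

class MiniHandler:
    # Simplified stand-in for the BaseHandler/SimpleHandler flow:
    # every chunk is written and then flushed, so nothing lingers
    # in an intermediate buffer between chunks.
    def __init__(self, stdout):
        self.stdout = stdout
    def write(self, data):
        self._write(data)
        self._flush()
    def _write(self, data):
        self.stdout.write(data)
    def _flush(self):
        self.stdout.flush()

out = io.BytesIO()
h = MiniHandler(out)
for chunk in (b"part1", b"part2"):    # chunks yielded by the application
    h.write(chunk)
assert out.getvalue() == b"part1part2"
```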
I'm sorry, I'm still not following the argument that a useful write() method might not return anything. But you are definitely right, explicitly checking for None can be done. It adds a trivial amount of overhead, but this isn't a production server; the only cost is code readability. Good point about the explicit calls to flush(); that should avoid any buffering problem. |
I suspect this bug does not affect Python 2. See my demo script wsgi-partial.py as evidence, which interrupts a 20 MB write with a signal. Jason: If you look at Python 2’s file.write() API <https://docs.python.org/2/library/stdtypes.html#file.write>, perhaps that will convince you that it is easy for someone to write a write() method that has no return value. Anyway, I think this bug is bigger than first thought. It affects most servers based on socketserver.StreamRequestHandler, not just the “wsgiref” one. In Python 3.5 I propose to patch the existing servers to avoid the possibility of truncated writes (patch forthcoming). In 3.6, I am thinking we should change StreamRequestHandler.wfile to prevent write() doing partial writes. |
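To illustrate the point about return values: a write() method in the Python 2 file.write() style is perfectly usable while returning nothing, which is why a caller cannot rely on the byte count. A hypothetical example:

```python
class NoReturnWriter:
    """Usable writer whose write() returns None, like Python 2's file.write()."""
    def __init__(self):
        self.buf = bytearray()
    def write(self, data):
        self.buf += data      # no return statement, so write() returns None

w = NoReturnWriter()
result = w.write(b"hello")
assert result is None         # a caller checking the count learns nothing
assert w.buf == b"hello"
```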
wfile-partial.patch tries to avoid partial writes in every relevant wfile.write() call I could find. Usually I do this by building a BufferedWriter around wfile. I propose to apply my patch to 3.5 (if people think it is reasonable). For 3.6 I opened bpo-26721 to change the class used for wfile instead. |
I did not write a test case for the test suite. Doing so would probably involve too many non-trivial things, so I thought it wouldn’t be worth it:
|
In patch v2 I integrated a version of the test into test_wsgiref. I swapped the sleep() calls for a threading.Event object. Because my patch is such a large change, it would be good to get other people’s opinions or reviews. |
Widening the scope of the title to attract reviewers :) |
Since there are three different people reporting the problem with wsgiref, but no other reports for other server modules, I might commit the change to Lib/wsgiref/simple_server.py soon, and leave the other changes until they can get a wider review. |
SimpleHTTPRequestHandler uses shutil.copyfileobj(), and shutil.copyfileobj() expects that write() writes all data. |
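For context, shutil.copyfileobj() reads fixed-size chunks and passes each to fdst.write() without checking the return value, so it relies on complete writes. A quick demonstration with in-memory streams:

```python
import io
import shutil

src = io.BytesIO(b"x" * 50000)
dst = io.BytesIO()
# copyfileobj loops over src.read(length) / dst.write(chunk),
# assuming every write() call consumes the whole chunk.
shutil.copyfileobj(src, dst, length=16 * 1024)
assert dst.getvalue() == b"x" * 50000
```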
Thanks for your comments Serhiy. The main change in v3 is that I added a socketserver._write_exactly() function instead of the _BaseHTTPRequestHandler__write_exactly() method. |
Here is a patch for just the wsgiref module, which I plan to commit in time for 3.5.2. I also updated the test case to avoid hanging due to a (theoretical) race with how signal handling works. |
It looks to me that there are two opposite ways to resolve this issue.
wsgiref-only.patch contains parts from both ways: it documents the new requirement, but doesn't deprecate partial writing. Which way is chosen? |
I prefer your first solution because it seems to fit in better with how things were intended. I can add in handling of partial writes with a deprecation warning when I get a chance. I guess the documentation would be something like “Since 3.5.2, partial writes are also handled, but this is deprecated.” |
In wsgiref-only.v2.patch I added deprecated support for continuing when a partial write is detected. I don’t think it needs documenting though. |
New changeset a57703119f40 by Martin Panter in branch '3.5':
New changeset d8f021944e0b by Martin Panter in branch 'default': |
Here is an updated patch with the remaining fixes for other servers. I considered changing the behaviour of shutil.copyfileobj() to handle partial writes better. But I decided against it, because: (a) nobody has complained yet, and (b) it could potentially break stuff, e.g. if fsrc.read() returned a strange kind of object. |
The bug should no longer be a problem in 3.6, as long as my change for bpo-26721 survives. Here is an updated patch against 3.5. |
Closing because I understand it is too late to do anything for 3.5 now. |