classification
Title: Avoid socketserver.StreamRequestHandler.wfile doing partial writes
Type: enhancement Stage: resolved
Components: Library (Lib) Versions: Python 3.6
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: haypo, martin.panter, python-dev
Priority: normal Keywords: patch

Created on 2016-04-09 12:19 by martin.panter, last changed 2016-06-29 12:06 by martin.panter. This issue is now closed.

Files
File name Uploaded Description Edit
buffered-wfile.patch martin.panter, 2016-04-10 02:41 review
buffered-wfile.v2.patch martin.panter, 2016-05-14 07:26 review
buffered-wfile.v3.patch martin.panter, 2016-06-12 14:16 review
buffered-wfile.v4.patch martin.panter, 2016-06-12 14:23 review
Messages (11)
msg263084 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-04-09 12:19
This is a follow-on from Issue 24291. Currently, for stream servers (as opposed to datagram servers), the wfile attribute is a raw SocketIO object. This means that wfile.write() is a simple wrapper around socket.send(), and can do partial writes.

There is a comment inherited from Python 2 that reads:

# . . . we make
# wfile unbuffered because (a) often after a write() we want to
# read and we need to flush the line; (b) big writes to unbuffered
# files are typically optimized by stdio even when big reads
# aren't.

Python 2 only has one kind of “file” object, and it seems partial writes are impossible: <https://docs.python.org/2/library/stdtypes.html#file.write>. But in Python 3, unbuffered mode means that the lower-level RawIOBase API is involved rather than the higher-level BufferedIOBase API.

I propose to change the “wfile” attribute to be a BufferedIOBase object, yet still be “unbuffered”. This could be implemented with a class that looks something like

class _SocketWriter(BufferedIOBase):
    """Simple writable BufferedIOBase implementation for a socket
    
    Does not hold data in a buffer, avoiding any need to call flush()."""
    def write(self, b):
        self._sock.sendall(b)
        return len(b)
msg263120 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-04-10 02:41
Here is a patch with my proposal.
msg263400 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2016-04-14 12:40
Hum, since long time ago, Python has issues with partial write. It's hard to guess if a write will always write all data, store the data on partial write, or simply ignore remaining data on partial write.

I recall a "write1" function which was well defined: limited to 1 syscall, don't try (or maybe only on the very specific case of EINTR). But I'm not sure that it still exists in the io module of Python 3.

asyncio has also issues with the definition of "partial write" in its API.

You propose to fix the issue in socketserver.

socket.makefile(bufsize=0).write() uses send() and so use partial write. Are you sure that users are prepared for that? Maybe SocketIO must be modified to use sendall() when bufsize=0?
msg263401 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2016-04-14 12:42
FYI I recently worked on a issue with partial write in eventlet on Python 3:
* https://github.com/eventlet/eventlet/issues/274
* https://github.com/eventlet/eventlet/issues/295

By the way, I opened #26292 "Raw I/O writelines() broken for non-blocking I/O" as a follow-up of the eventlet issue.
msg263402 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2016-04-14 12:45
> I recall a "write1" function which was well defined: limited to 1 syscall, don't try (or maybe only on the very specific case of EINTR). But I'm not sure that it still exists in the io module of Python 3.

Oops, in fact it is read1:
https://docs.python.org/dev/library/io.html#io.BufferedIOBase.read1
"Read and return up to size bytes, with at most one call to the underlying raw stream’s read()"
msg263437 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-04-15 00:50
If a user calls makefile(bufsize=0), they may have written that based on Python 2’s behaviour of always doing full writes. But in Python 3 this is indirectly documented as doing partial writes: makefile() args interpreted as for open, and open() buffering disabled returns a RawIOBase subclass.

People porting code from Python 2 may be unprepared for partial writes. Just another subtle Python 2 vs 3 incompatibility. People using only Python 3 might be unprepared if they are not familar with the intricacies of the Python API, but then why are they using bufsize=0? On the other hand, they might require partial writes if they are using select() or similar, so changing it would be a compatibility problem.

You could use the same arguments for socketserver. The difference is that wfile being in raw mode is not documented. Almost all of the relevant builtin library code that I reviewed expects full blocking writes, and I did not find any that requires partial writes. So I think making the change in socketserver is less likely to introduce compatibility problems, and is much more beneficial.
msg265507 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-05-14 07:26
Merged with current code, and copied the original wsgiref test case from Issue 24291, since this patch also fixes that bug.
msg268384 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-06-12 14:16
Merged with current code, and migrated the interrupted-write test from test_wsgiref into test_socketserver.
msg268386 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-06-12 14:23
Forgot to remove the workaround added to 3.5 for wsgiref in Issue 24291
msg269474 - (view) Author: Roundup Robot (python-dev) Date: 2016-06-29 10:36
New changeset 4ea79767ff75 by Martin Panter in branch 'default':
Issue #26721: Change StreamRequestHandler.wfile to BufferedIOBase
https://hg.python.org/cpython/rev/4ea79767ff75
msg269481 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-06-29 12:06
Committed for 3.6.
History
Date User Action Args
2016-06-29 12:06:14martin.pantersetstatus: open -> closed
resolution: fixed
messages: + msg269481

stage: patch review -> resolved
2016-06-29 10:36:51python-devsetnosy: + python-dev
messages: + msg269474
2016-06-12 14:23:42martin.pantersetfiles: + buffered-wfile.v4.patch

messages: + msg268386
2016-06-12 14:16:38martin.pantersetfiles: + buffered-wfile.v3.patch

messages: + msg268384
2016-05-14 07:26:55martin.pantersetfiles: + buffered-wfile.v2.patch

messages: + msg265507
2016-04-23 01:52:46martin.pantersetstage: patch review
2016-04-15 00:51:00martin.pantersetmessages: + msg263437
2016-04-14 12:45:58hayposetmessages: + msg263402
2016-04-14 12:42:31hayposetmessages: + msg263401
2016-04-14 12:40:18hayposetnosy: + haypo
messages: + msg263400
2016-04-10 02:41:19martin.pantersetfiles: + buffered-wfile.patch
keywords: + patch
messages: + msg263120
2016-04-09 12:19:55martin.pantercreate