classification
Title: Limit max sendfile chunk to 0x7ffff000
Type: Stage: resolved
Components: asyncio, Library (Lib) Versions: Python 3.8, Python 3.7
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: Nosy List: asvetlov, giampaolo.rodola, vstinner, yselivanov
Priority: normal Keywords:

Created on 2018-11-06 16:41 by asvetlov, last changed 2018-11-06 22:37 by vstinner. This issue is now closed.

Messages (7)
msg329366 - (view) Author: Andrew Svetlov (asvetlov) * (Python committer) Date: 2018-11-06 16:41
On Linux, the maximum data size for a single sendfile() call is 0x7ffff000:

sendfile() will transfer at most 0x7ffff000 (2,147,479,552) bytes, returning the number of bytes actually
transferred. (This is true on both 32-bit and 64-bit systems.)

Limiting the maximum block size to this value on all OSes makes sense: splitting the transfer of a very large file into several syscalls doesn't hurt performance anyway.

Windows uses DWORD for size in TransmitFile, so the size is limited as well.
msg329386 - (view) Author: Giampaolo Rodola' (giampaolo.rodola) * (Python committer) Date: 2018-11-06 20:59
Do you mean raising an exception if the "count" argument is passed and is > 2,147,479,552? In that case I think asyncio's sendfile() should simply do the math to transmit that many bytes, taking into account that os.sendfile() may return fewer bytes than requested. With non-blocking sockets in particular, that is true regardless of the size being passed.
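The "do the math" approach can be sketched as a plain loop over os.sendfile(). This is a minimal illustration, not asyncio's code: sendfile_all is a hypothetical helper name, and it assumes blocking file descriptors (a non-blocking socket would also need BlockingIOError handling).

```python
import os


def sendfile_all(out_fd, in_fd, offset, count):
    """Hypothetical helper: send up to `count` bytes from in_fd to out_fd.

    Loops because a single os.sendfile() call may transfer fewer bytes
    than requested. Returns the number of bytes actually sent, which is
    less than `count` if end-of-file is reached first.
    Assumes blocking file descriptors.
    """
    total = 0
    while total < count:
        sent = os.sendfile(out_fd, in_fd, offset + total, count - total)
        if sent == 0:
            break  # end of file: nothing more to send
        total += sent
    return total
```

The caller only looks at the return value, so any per-call ceiling imposed by the kernel simply shows up as one more loop iteration.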
msg329387 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-11-06 21:32
os.write(fd, data) can write fewer than len(data) bytes. That's by contract, just as socket.send(data) can send fewer than len(data) bytes.
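Because of that contract, callers that need every byte written have to loop on the return value. A minimal sketch, assuming a blocking file descriptor (write_all is a hypothetical name, not a standard library function):

```python
import os


def write_all(fd, data):
    """Hypothetical helper: keep calling os.write() until all of `data` is written.

    os.write() returns the number of bytes actually written, which may be
    smaller than the length of the buffer passed in.
    """
    view = memoryview(data)
    total = 0
    while total < len(view):
        # Pass only the part that has not been written yet.
        total += os.write(fd, view[total:])
    return total
```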

os.write() also truncates the length argument to INT_MAX so that it can be cast to an int on Windows. Extract of _Py_write():

#ifdef MS_WINDOWS
    if (count > 32767 && isatty(fd)) {
        /* Issue #11395: the Windows console returns an error (12: not
           enough space error) on writing into stdout if stdout mode is
           binary and the length is greater than 66,000 bytes (or less,
           depending on heap usage). */
        count = 32767;
    }
#endif
    if (count > _PY_WRITE_MAX) {
        count = _PY_WRITE_MAX;
    }

with:

#if defined(MS_WINDOWS) || defined(__APPLE__)
    /* On Windows, the count parameter of read() is an int (bpo-9015, bpo-9611).
       On macOS 10.13, read() and write() with more than INT_MAX bytes
       fail with EINVAL (bpo-24658). */
#   define _PY_READ_MAX  INT_MAX
#   define _PY_WRITE_MAX INT_MAX
#else
    /* write() should truncate the input to PY_SSIZE_T_MAX bytes,
       but it's safer to do it ourself to have a portable behaviour */
#   define _PY_READ_MAX  PY_SSIZE_T_MAX
#   define _PY_WRITE_MAX PY_SSIZE_T_MAX
#endif



Can we do something similar for sendfile()?
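For illustration only, the same clamping idea applied to sendfile could look like the sketch below. clamped_sendfile and SENDFILE_MAX are hypothetical names, and the constant is just the Linux figure quoted from the man page; the rest of this thread concludes that such a wrapper is not needed.

```python
import os

# Assumption: the per-call ceiling quoted from the Linux sendfile() man page.
SENDFILE_MAX = 0x7ffff000


def clamped_sendfile(out_fd, in_fd, offset, count):
    """Hypothetical wrapper: cap `count` the way _Py_write() caps its length."""
    if count > SENDFILE_MAX:
        count = SENDFILE_MAX
    # As with os.write(), the caller must still check the returned byte count.
    return os.sendfile(out_fd, in_fd, offset, count)
```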
msg329388 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-11-06 21:35
> In that case I think asyncio's sendfile() should simply do the math to transmit that many bytes, taking into account that os.sendfile() may return fewer bytes than requested.

The internal sendfile() implementation in asyncio already loops until all bytes are sent. Extract of unix_events.py:

    def _sock_sendfile_native_impl(self, fut, registered_fd, sock, fileno,
                                   offset, count, blocksize, total_sent):
        ...
        try:
            sent = os.sendfile(fd, fileno, offset, blocksize)
        except (BlockingIOError, InterruptedError):
            ...
        else:
            if sent == 0:
                # EOF
                self._sock_sendfile_update_filepos(fileno, offset, total_sent)
                fut.set_result(total_sent)
            else:
                offset += sent
                total_sent += sent
                self.add_writer(fd, self._sock_sendfile_native_impl, fut, ...)

asyncio doesn't need to be modified.
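From the caller's side, that loop is hidden behind loop.sock_sendfile(). The sketch below is only a usage illustration under assumptions: roundtrip is a hypothetical helper, and a local socket pair stands in for a real connection. It sends a temporary file and reads it back concurrently, so the sender never stalls on a full socket buffer.

```python
import asyncio
import socket
import tempfile


async def roundtrip(payload):
    """Hypothetical demo: send `payload` from a file over a socket pair."""
    loop = asyncio.get_running_loop()
    sender, receiver = socket.socketpair()
    sender.setblocking(False)    # asyncio socket methods need non-blocking sockets
    receiver.setblocking(False)
    with sender, receiver, tempfile.TemporaryFile() as f:
        f.write(payload)
        f.flush()

        async def recv_exactly(n):
            buf = bytearray()
            while len(buf) < n:
                chunk = await loop.sock_recv(receiver, 65536)
                if not chunk:
                    break  # peer closed the connection early
                buf += chunk
            return bytes(buf)

        # Run sender and receiver together; sock_sendfile() returns the
        # total number of bytes sent once the whole range is transferred.
        sent, received = await asyncio.gather(
            loop.sock_sendfile(sender, f, 0, len(payload)),
            recv_exactly(len(payload)),
        )
    return sent, received
```

Calling asyncio.run(roundtrip(data)) returns the byte count reported by sock_sendfile() together with the bytes read on the other end.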
msg329390 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-11-06 21:37
The manual page says:

       sendfile() will transfer at most 0x7ffff000 (2,147,479,552) bytes,
       returning the number of bytes actually transferred. (This is true on
       both 32-bit and 64-bit systems.)

I understand that you can pass a larger length, but a single sendfile() function call will never copy more than 0x7ffff000 bytes. I see nothing wrong here.

If you want to be sure, try to copy more bytes than 0x7ffff000 and see what happens :-) Use strace to check system calls.
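One way to set up that experiment without writing gigabytes to disk is a sparse source file. This is a rough sketch, assuming Linux (where os.sendfile() accepts a regular file or /dev/null as the destination); single_sendfile is a hypothetical helper name.

```python
import os
import tempfile


def single_sendfile(size, count, out_path=os.devnull):
    """Hypothetical helper: issue ONE os.sendfile() call and report its result.

    Creates a sparse temporary file of `size` bytes, asks the kernel to
    send `count` bytes of it to `out_path`, and returns the number of
    bytes the call says it transferred.
    """
    with tempfile.TemporaryFile() as src, open(out_path, "wb") as dst:
        src.truncate(size)  # sparse file: no data blocks are written
        return os.sendfile(dst.fileno(), src.fileno(), 0, count)
```

For example, calling single_sendfile(3 * 2**30, 3 * 2**30) asks for 3 GiB in one call; going by the man page, the returned value should not exceed 0x7ffff000. Running the script under strace shows the underlying system call and its return value.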
msg329393 - (view) Author: Andrew Svetlov (asvetlov) * (Python committer) Date: 2018-11-06 22:31
My initial thought was that os.sendfile() raises OSError when asked to send more than 0x7fff_f000 bytes, but I was wrong.
I've checked it, everything works as Victor described.

Thank you guys for the feedback.
msg329394 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-11-06 22:37
> I've checked it, everything works as Victor described.

Yeah! That's great when it just works!
History
Date                 User              Action  Args
2018-11-06 22:37:23  vstinner          set     messages: + msg329394
2018-11-06 22:31:55  asvetlov          set     status: open -> closed
                                               resolution: not a bug
                                               messages: + msg329393
                                               stage: resolved
2018-11-06 21:37:11  vstinner          set     messages: + msg329390
2018-11-06 21:35:06  vstinner          set     messages: + msg329388
2018-11-06 21:32:10  vstinner          set     nosy: + vstinner
                                               messages: + msg329387
2018-11-06 20:59:54  giampaolo.rodola  set     messages: + msg329386
2018-11-06 16:42:33  asvetlov          set     nosy: + giampaolo.rodola, - ronaldoussoren, ned.deily
2018-11-06 16:42:08  asvetlov          set     nosy: + yselivanov
                                               components: + Library (Lib), asyncio, - macOS
2018-11-06 16:41:31  asvetlov          create