classification
Title: shutil.copyfile(): os.sendfile() fails with OverflowError on 32-bit system
Type:
Stage: resolved
Components: Library (Lib)
Versions: Python 3.9, Python 3.8
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: giampaolo.rodola, lukasz.langa, nanjekyejoannah, pablogsal, serhiy.storchaka, vstinner
Priority: normal
Keywords: patch

Created on 2019-09-30 07:37 by vstinner, last changed 2019-10-02 14:39 by vstinner. This issue is now closed.

Pull Requests
URL       Status  Linked
PR 16491  merged  giampaolo.rodola, 2019-09-30 14:04
PR 16506  merged  miss-islington, 2019-10-01 04:16
Messages (7)
msg353549 - Author: STINNER Victor (vstinner) (Python committer) Date: 2019-09-30 07:37
The test fails on a 32-bit buildbot worker, where the ssize_t maximum is 2,147,483,647 (2**31-1) bytes, about 2.0 GiB.

test_largefile uses:

# size of file to create (>2 GiB; 2 GiB == 2,147,483,648 bytes)
size = 2_500_000_000

x86 Gentoo Installed with X 3.x:
https://buildbot.python.org/all/#/builders/103/builds/3162

======================================================================
ERROR: test_it (test.test_largefile.TestCopyfile)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/buildbot/buildarea/cpython/3.x.ware-gentoo-x86.installed/build/target/lib/python3.9/test/test_largefile.py", line 160, in test_it
    shutil.copyfile(TESTFN, TESTFN2)
  File "/buildbot/buildarea/cpython/3.x.ware-gentoo-x86.installed/build/target/lib/python3.9/shutil.py", line 266, in copyfile
    _fastcopy_sendfile(fsrc, fdst)
  File "/buildbot/buildarea/cpython/3.x.ware-gentoo-x86.installed/build/target/lib/python3.9/shutil.py", line 145, in _fastcopy_sendfile
    sent = os.sendfile(outfd, infd, offset, blocksize)
OverflowError: Python int too large to convert to C ssize_t

On Linux (Fedora 30), man sendfile shows me the prototype:

       ssize_t sendfile(int out_fd, int in_fd, off_t *offset, size_t count);

Extract of Lib/shutil.py:

    # Hopefully the whole file will be copied in a single call.
    # sendfile() is called in a loop 'till EOF is reached (0 return)
    # so a bufsize smaller or bigger than the actual file size
    # should not make any difference, also in case the file content
    # changes while being copied.
    try:
        blocksize = max(os.fstat(infd).st_size, 2 ** 23)  # min 8MB
    except Exception:
        blocksize = 2 ** 27  # 128MB

    offset = 0
    while True:
        try:
            sent = os.sendfile(outfd, infd, offset, blocksize)
        except OSError as err:
            ...
        else:
            if sent == 0:
                break  # EOF
            offset += sent
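
As a side note on the comment in this extract, here is a minimal sketch (not the shutil code) of why the block size should not change the result: the loop keeps calling until a call transfers 0 bytes. The helper name `copy_in_blocks` and the in-memory `io.BytesIO` objects are assumptions made for illustration; they stand in for `os.sendfile()` and real file descriptors.

```python
import io


def copy_in_blocks(src, dst, blocksize):
    """Copy src to dst in chunks of at most blocksize bytes.

    Hypothetical stand-in for the sendfile() loop: each iteration
    transfers one chunk starting at the current offset and stops
    when nothing more is transferred (EOF).
    """
    offset = 0
    while True:
        src.seek(offset)
        chunk = src.read(blocksize)
        sent = dst.write(chunk)  # writing b"" returns 0
        if sent == 0:
            break  # EOF
        offset += sent
    return offset


src = io.BytesIO(b"x" * 10_000)
dst = io.BytesIO()
total = copy_in_blocks(src, dst, blocksize=4096)
```

With a block size smaller than the data, the loop simply runs more iterations; the copied content is the same.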

Extract of the Linux implementation of os.sendfile():

    int in, out;
    Py_ssize_t ret;
    off_t offset;
    ...
    Py_ssize_t count;
    PyObject *offobj;
    static char *keywords[] = {"out", "in",
                                "offset", "count", NULL};
    if (!PyArg_ParseTupleAndKeywords(args, kwdict, "iiOn:sendfile",
            keywords, &out, &in, &offobj, &count))
        return NULL;
    ...
    if (!Py_off_t_converter(offobj, &offset))
        return NULL;

    do {
        Py_BEGIN_ALLOW_THREADS
        ret = sendfile(out, in, &offset, count);
        Py_END_ALLOW_THREADS
    } while (ret < 0 && errno == EINTR && !(async_err = PyErr_CheckSignals()));

with:

static int
Py_off_t_converter(PyObject *arg, void *addr)
{
#ifdef HAVE_LARGEFILE_SUPPORT
    *((Py_off_t *)addr) = PyLong_AsLongLong(arg);
#else
    *((Py_off_t *)addr) = PyLong_AsLong(arg);
#endif
    if (PyErr_Occurred())
        return 0;
    return 1;
}


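I understand that the error comes from the 4th sendfile() parameter, "count". shutil passes blocksize = max(st_size, 2**23), which is 2,500,000,000 for the test file. The C code (of the Linux implementation) parses "count" with the "n" format (Py_ssize_t), for which Python/getargs.c calls PyLong_AsSsize_t(), and that value does not fit in a 32-bit ssize_t.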

On a 64-bit system, the Py_ssize_t maximum (2**63-1) is unlikely to be reached, but on a 32-bit system (max = 2**31-1) it is easy to reach.
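
The arithmetic behind this can be checked with a small sketch. The helpers `ssize_t_max` and `fits_in_ssize_t` are hypothetical names introduced here for illustration; they only model the range of a signed integer of a given width and do not call into CPython's argument parsing.

```python
def ssize_t_max(bits):
    """Largest value of a signed integer that is `bits` wide."""
    return 2 ** (bits - 1) - 1


def fits_in_ssize_t(value, bits):
    """Return True if value is representable as a signed `bits`-wide integer."""
    return -(2 ** (bits - 1)) <= value <= ssize_t_max(bits)


# File size used by test_largefile, which shutil passes as the block size.
size = 2_500_000_000

too_big_on_32bit = not fits_in_ssize_t(size, 32)
ok_on_64bit = fits_in_ssize_t(size, 64)
```

Under this model the test file size exceeds the 32-bit limit by about 352 million bytes, while it is far below the 64-bit limit.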
msg353551 - Author: STINNER Victor (vstinner) (Python committer) Date: 2019-09-30 07:44
Oh, it's likely a regression caused by:

commit 5bcc6d89bcb622a6786fff632fabdcaf67dbb4e2
Author: Giampaolo Rodola <g.rodola@gmail.com>
Date:   Mon Sep 30 12:51:55 2019 +0800

    bpo-37096: Add large-file tests for modules using sendfile(2) (GH-13676)


> https://buildbot.python.org/all/#/builders/103/builds/3162

configure:

checking for sendfile... yes
checking whether to enable large file support... yes

pythoninfo:

platform.libc_ver: glibc 2.29
platform.platform: Linux-4.19.72-gentoo-i686-AMD_Athlon-tm-_64_X2_Dual_Core_Processor_5000+-with-glibc2.29
msg353569 - Author: STINNER Victor (vstinner) (Python committer) Date: 2019-09-30 12:58
Similar failure on ARMv7 Debian buster 3.x:
https://buildbot.python.org/all/#/builders/176/builds/1372

pythoninfo:
sys.maxsize: 2147483647
msg353633 - Author: Giampaolo Rodola' (giampaolo.rodola) (Python committer) Date: 2019-10-01 03:40
New changeset 94e165096fd65e8237e60de570fb609604ab94c9 by Giampaolo Rodola in branch 'master':
bpo-38319: Fix shutil._fastcopy_sendfile(): set sendfile() max block size (GH-16491)
https://github.com/python/cpython/commit/94e165096fd65e8237e60de570fb609604ab94c9
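
The changeset title says the fix sets a maximum block size for sendfile(). A minimal sketch of that idea follows. The helper name `choose_blocksize`, the `maxsize` parameter, and the 1 GiB cap on 32-bit builds are assumptions made for illustration, not a copy of the committed code.

```python
import sys


def choose_blocksize(file_size, maxsize=sys.maxsize):
    """Pick a sendfile() block size that always fits in Py_ssize_t.

    Hypothetical helper: keeps the 8 MiB minimum from the original
    code and, on builds where maxsize indicates a 32-bit Py_ssize_t,
    caps the value at 1 GiB (an assumed limit below 2**31-1).
    """
    blocksize = max(file_size, 2 ** 23)  # at least 8 MiB
    if maxsize < 2 ** 32:  # 32-bit build
        blocksize = min(blocksize, 2 ** 30)  # assumed cap: 1 GiB
    return blocksize


capped = choose_blocksize(2_500_000_000, maxsize=2 ** 31 - 1)
uncapped = choose_blocksize(2_500_000_000, maxsize=2 ** 63 - 1)
```

Because sendfile() is called in a loop until it returns 0, a capped block size only means more iterations for files larger than the cap.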
msg353637 - Author: Giampaolo Rodola' (giampaolo.rodola) (Python committer) Date: 2019-10-01 05:01
Looks like it worked:
https://buildbot.python.org/all/#/builders/176/builds/1383
msg353645 - Author: Łukasz Langa (lukasz.langa) (Python committer) Date: 2019-10-01 07:55
New changeset 938c00ca9e4207a2531041edff2e82490b02047f by Łukasz Langa (Miss Islington (bot)) in branch '3.8':
bpo-38319: Fix shutil._fastcopy_sendfile(): set sendfile() max block size (GH-16491) (#16506)
https://github.com/python/cpython/commit/938c00ca9e4207a2531041edff2e82490b02047f
msg353738 - Author: STINNER Victor (vstinner) (Python committer) Date: 2019-10-02 14:39
32-bit buildbots are back to green. Thanks for the fix!
History
Date                 User              Action  Args
2019-10-02 14:39:28  vstinner          set     status: open -> closed; resolution: fixed; messages: + msg353738; stage: patch review -> resolved
2019-10-01 07:55:07  lukasz.langa      set     nosy: + lukasz.langa; messages: + msg353645
2019-10-01 05:01:18  giampaolo.rodola  set     versions: + Python 3.8
2019-10-01 05:01:12  giampaolo.rodola  set     messages: + msg353637
2019-10-01 04:16:48  miss-islington    set     pull_requests: + pull_request16093
2019-10-01 03:40:58  giampaolo.rodola  set     messages: + msg353633
2019-09-30 14:04:16  giampaolo.rodola  set     keywords: + patch; stage: patch review; pull_requests: + pull_request16077
2019-09-30 12:58:52  vstinner          set     messages: + msg353569
2019-09-30 07:44:21  vstinner          set     messages: + msg353551
2019-09-30 07:37:51  vstinner          create