This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: Add splice() to the os module
Type: Stage: resolved
Components: Library (Lib) Versions: Python 3.10
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: pablogsal Nosy List: Michael.Felt, corona10, miss-islington, pablogsal, serhiy.storchaka, shihai1991, vstinner
Priority: normal Keywords: patch

Created on 2020-08-24 15:57 by pablogsal, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Files
File name Uploaded Description
socket.patch vstinner, 2020-11-26 14:51
Pull Requests
URL Status Linked
PR 21947 merged pablogsal, 2020-08-24 16:01
PR 23340 merged pablogsal, 2020-11-17 13:45
PR 23350 merged pablogsal, 2020-11-17 18:20
PR 23351 merged pablogsal, 2020-11-17 18:21
PR 23354 merged vstinner, 2020-11-17 21:42
PR 23608 merged pablogsal, 2020-12-02 03:14
Messages (30)
msg375851 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2020-08-24 15:57
The splice system call moves data between two file descriptors without copying between kernel address space and user address space.  This can be a very useful addition for libraries implementing low-level file management.
msg375852 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-08-24 16:02
I don't recall the subtle differences between sendfile() and splice(). I recall that in early Linux versions, one was limited to sockets, and only on one side. But later, it became possible to pass two sockets, or one file on disk and one socket, etc.

Python exposes sendfile() as os.sendfile() since Python 3.3:
https://docs.python.org/dev/library/os.html#os.sendfile
msg375857 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2020-08-24 17:30
> I don't recall the subtle differences between sendfile() and splice().

Basically, splice() is specialized for pipes:


splice() only works if one of the file descriptors refers to a pipe. So you can use it for, e.g., socket-to-pipe or pipe-to-file transfers without copying the data into userspace. But you can't do file-to-file copies with it.

sendfile() only works if the source file descriptor refers to something that can be mmap()ed (i.e. mostly normal files), and before Linux 2.6.33 the destination had to be a socket.
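As a rough illustration of the restriction described above, the sketch below copies one regular file to another by routing the data through a pipe, so that every os.splice() call has a pipe on one side. It assumes Linux and Python 3.10 or later. The names `copy_via_pipe` and `demo` are hypothetical helpers, not part of the os module, and the 64 KiB chunk size is an arbitrary choice.

```python
import os
import tempfile


def copy_via_pipe(src_fd, dst_fd, chunk=64 * 1024):
    """Copy src_fd to dst_fd through an intermediate pipe; return bytes copied."""
    read_fd, write_fd = os.pipe()
    total = 0
    try:
        while True:
            # file -> pipe: one side is a pipe, so splice() accepts the pair.
            queued = os.splice(src_fd, write_fd, chunk)
            if queued == 0:  # end of the source file
                return total
            # pipe -> file: drain exactly what was just queued.
            while queued > 0:
                moved = os.splice(read_fd, dst_fd, queued)
                queued -= moved
                total += moved
    finally:
        os.close(read_fd)
        os.close(write_fd)


def demo(payload):
    """Hypothetical helper: copy `payload` between two temporary files."""
    with tempfile.TemporaryFile() as src, tempfile.TemporaryFile() as dst:
        os.write(src.fileno(), payload)
        os.lseek(src.fileno(), 0, os.SEEK_SET)
        copy_via_pipe(src.fileno(), dst.fileno())
        os.lseek(dst.fileno(), 0, os.SEEK_SET)
        return os.read(dst.fileno(), len(payload))
```

Calling os.splice() directly with two regular files is expected to raise OSError instead, which is why the pipe is needed here.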
msg375872 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-08-25 07:17
The API of splice() looks complicated. How would you use it in Python?

Are off_in and off_out adjusted as in copy_file_range() and sendfile()? It is not clear from the man page. If they are, how would you return updated values?

Are you going to add vmsplice() and tee() too? Since it is a Linux-specific API, would it not be better to add a dedicated linux module?
msg375873 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-08-25 09:21
> Are you going to add vmsplice() and tee() too? Since it is a Linux-specific API, would it not be better to add a dedicated linux module?

It's not uncommon that a syscall added to the Linux kernel is later added to other platforms.

Example: getrandom() exists in Linux and Solaris.

Example: memfd_create() was designed in Linux, and added later to FreeBSD: https://github.com/freebsd/freebsd/commit/575e351fdd996f72921b87e71c2c26466e887ed2 (see bpo-41013).
msg375875 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-08-25 09:25
OpenBSD uses a different API:
https://man.openbsd.org/sosplice.9

int sosplice(struct socket *so, int fd, off_t max, struct timeval *tv);
int somove(struct socket *so, int wait);

"The function sosplice() is used to splice together a source and a drain socket."

"The function somove() transfers data from the source's receive buffer to the drain's send buffer."

"Socket splicing can be invoked from userland via the setsockopt(2) system-call at the SOL_SOCKET level with the socket option SO_SPLICE."
msg375876 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2020-08-25 09:47
> The API of splice() looks complicated. How would you use it in Python?

It has the same API as copy_file_range and other similar system calls that we already expose, so we just need to do the same thing we do there.

> Are off_in and off_out adjusted as in copy_file_range() and sendfile()? It is not clear from the man page. If they are, how would you return updated values?

It behaves the same as in copy_file_range(), with the exception that one of them has to be None (the one associated with the pipe file descriptor). We don't return the updated values (nor do we in copy_file_range()).
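A minimal sketch of the offset behaviour described above, assuming Linux and Python 3.10 or later. The keyword name `offset_src` is the one that appears in the test tracebacks later in this thread; `splice_from_offset` is a hypothetical helper name. The offset is given for the regular file, while the offset for the pipe end is left as None.

```python
import os
import tempfile


def splice_from_offset(data, skip, count):
    """Splice `count` bytes, starting at byte `skip`, from a file into a pipe."""
    read_fd, write_fd = os.pipe()
    try:
        with tempfile.TemporaryFile() as src:
            fd = src.fileno()
            os.write(fd, data)
            os.lseek(fd, 0, os.SEEK_SET)
            # offset_src applies to the regular file; offset_dst stays None
            # because the destination is the pipe.
            moved = os.splice(fd, write_fd, count, offset_src=skip)
        # Only the number of bytes moved is returned, not an updated offset.
        return os.read(read_fd, moved)
    finally:
        os.close(read_fd)
        os.close(write_fd)
```

According to the splice(2) man page, passing an explicit offset leaves the file descriptor's own position unchanged, so a caller that needs the new offset has to compute it from the return value.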

> Are you going to add vmsplice() and tee() too? Since it is a Linux-specific API, would it not be better to add a dedicated linux module?

We can certainly discuss adding vmsplice() and tee() (tee is probably more interesting), but in my humble opinion that would be a different discussion.
msg375877 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2020-08-25 09:49
> OpenBSD uses a different API:

The semantics are considerably different (splice() is about pipes, while sosplice() is about sockets in general). Also, the point of splice() is to skip copying out of kernel buffers, but the sosplice() documentation does not say that it avoids copying between userspace and kernel space.
msg375878 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2020-08-25 09:51
> Since it is a Linux-specific API, would it not be better to add a dedicated linux module?

This is an interesting point, but I think that right now it would be more confusing for users than not (people normally go to the os module for system calls) and, as Victor mentions, we would need to update the os module anyway if some other operating system adds the system call later.
msg378581 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2020-10-13 21:59
Heads up: I plan to land this next week, in case someone wants to do a review or has any objections.
msg381196 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2020-11-17 00:00
New changeset a57b3d30f66c90f42da751bf82256b9b22961ed0 by Pablo Galindo in branch 'master':
bpo-41625: Expose the splice() system call in the os module (GH-21947)
https://github.com/python/cpython/commit/a57b3d30f66c90f42da751bf82256b9b22961ed0
msg381228 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-11-17 12:23
> .. availability:: Linux kernel >= 2.6.17 or glibc >= 2.5 

Do you mean "Linux kernel >= 2.6.17 and glibc >= 2.5" ?

> .. data:: SPLICE_F_MOVE

Maybe also add "    .. versionadded:: 3.10" on these constants.
msg381233 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2020-11-17 13:44
> Do you mean "Linux kernel >= 2.6.17 and glibc >= 2.5" ?

My understanding is that glibc provides emulation for glibc >= 2.5

The section from the manpage says:

       The splice() system call first appeared in Linux 2.6.17; library
       support was added to glibc in version 2.5.

Not sure how to interpret that. You want to change the "or" to "and"?
msg381259 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-11-17 17:35
I am reopening the issue. This change broke Python compilation on AIX.

https://buildbot.python.org/all/#/builders/302/builds/377

configure: "checking for splice... yes"

"./Modules/posixmodule.c", line 15146.53: 1506-045 (S) Undeclared identifier SPLICE_F_MOVE.
"./Modules/posixmodule.c", line 15147.57: 1506-045 (S) Undeclared identifier SPLICE_F_NONBLOCK.
"./Modules/posixmodule.c", line 15148.53: 1506-045 (S) Undeclared identifier SPLICE_F_MORE.

make: 1254-004 The error code from the last command is 1.


The code:

/* constants for splice */
#ifdef HAVE_SPLICE
    if (PyModule_AddIntConstant(m, "SPLICE_F_MOVE", SPLICE_F_MOVE)) return -1;
    if (PyModule_AddIntConstant(m, "SPLICE_F_NONBLOCK", SPLICE_F_NONBLOCK)) return -1;
    if (PyModule_AddIntConstant(m, "SPLICE_F_MORE", SPLICE_F_MORE)) return -1;
#endif
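Because the constants are only defined where the build exposes them, Python code that wants to use them can probe for them rather than assume they exist. The following is a small sketch of that idea; `available_splice_flags` is a hypothetical helper name, and the three constant names are the ones added in the C code above.

```python
import os


def available_splice_flags():
    """Return the SPLICE_F_* constants that this build of Python exposes."""
    names = ("SPLICE_F_MOVE", "SPLICE_F_NONBLOCK", "SPLICE_F_MORE")
    # hasattr() keeps this safe on platforms where the os module
    # does not define the constants at all.
    return {name: getattr(os, name) for name in names if hasattr(os, name)}
```

On a platform without splice() support the function simply returns an empty dictionary.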
msg381261 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-11-17 17:41
> The splice() system call first appeared in Linux 2.6.17;
> library support was added to glibc in version 2.5.

There is no emulation. It's just a function which wraps the syscall:

https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/splice.c;h=fe21cf1988c48ce887a22c9e5e5f36cbd653a4c8;hb=HEAD

I understand that you need Linux kernel >= 2.6.17 *and* glibc >= 2.5.
msg381268 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2020-11-17 18:13
New changeset fa96608513b6eafe48777f1a5504134939dcbebc by Pablo Galindo in branch 'master':
bpo-41625: Add versionadded to os.splice() constants (GH-23340)
https://github.com/python/cpython/commit/fa96608513b6eafe48777f1a5504134939dcbebc
msg381281 - (view) Author: miss-islington (miss-islington) Date: 2020-11-17 19:57
New changeset e59958f8b6815f51f6c33b6a613cf8467ca18a11 by Pablo Galindo in branch 'master':
bpo-41625: Specify that Linux >= 2.6.17 *and* glibc >= 2.5 are requir… (GH-23351)
https://github.com/python/cpython/commit/e59958f8b6815f51f6c33b6a613cf8467ca18a11
msg381282 - (view) Author: miss-islington (miss-islington) Date: 2020-11-17 19:57
New changeset 2a9eddf070f72060f62db1856a0af2e08729a46c by Pablo Galindo in branch 'master':
bpo-41625: Add a guard for Linux for splice() constants in the os module (GH-23350)
https://github.com/python/cpython/commit/2a9eddf070f72060f62db1856a0af2e08729a46c
msg381290 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-11-17 21:43
Nice, AIX can build Python again. But now the 3 tests fail, since they use a pipe and a file, whereas on AIX it seems that splice() requires one end to be a socket.

I wrote PR 23354 to skip the 3 tests on AIX.

======================================================================
ERROR: test_splice (test.test_os.FileTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/buildbot/buildarea/3.x.aixtools-aix-power6/build/Lib/test/test_os.py", line 406, in test_splice
    i = os.splice(in_fd, write_fd, 5)
OSError: [Errno 57] Socket operation on non-socket

======================================================================
ERROR: test_splice_offset_in (test.test_os.FileTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/buildbot/buildarea/3.x.aixtools-aix-power6/build/Lib/test/test_os.py", line 440, in test_splice_offset_in
    i = os.splice(in_fd, write_fd, bytes_to_copy, offset_src=in_skip)
OSError: [Errno 57] Socket operation on non-socket

======================================================================
ERROR: test_splice_offset_out (test.test_os.FileTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/buildbot/buildarea/3.x.aixtools-aix-power6/build/Lib/test/test_os.py", line 479, in test_splice_offset_out
    i = os.splice(read_fd, out_fd, bytes_to_copy, offset_dst=out_seek)
OSError: [Errno 57] Socket operation on non-socket
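One way such a skip could be expressed with unittest is sketched below. This is an illustration only and is not taken from PR 23354, whose contents are not shown in this thread. The decorator name `requires_pipe_splice`, the test class, and `run_smoke_test` are all hypothetical.

```python
import io
import os
import sys
import tempfile
import unittest

# Hypothetical decorator: skip where os.splice() is missing or where, as
# reported for AIX above, it does not accept a pipe and a regular file.
requires_pipe_splice = unittest.skipUnless(
    hasattr(os, "splice") and not sys.platform.startswith("aix"),
    "requires os.splice() between a regular file and a pipe",
)


class SpliceSmokeTest(unittest.TestCase):
    @requires_pipe_splice
    def test_file_to_pipe(self):
        read_fd, write_fd = os.pipe()
        self.addCleanup(os.close, read_fd)
        self.addCleanup(os.close, write_fd)
        with tempfile.TemporaryFile() as src:
            os.write(src.fileno(), b"12345")
            os.lseek(src.fileno(), 0, os.SEEK_SET)
            moved = os.splice(src.fileno(), write_fd, 5)
        self.assertEqual(os.read(read_fd, moved), b"12345")


def run_smoke_test():
    """Run the test case quietly and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SpliceSmokeTest)
    return unittest.TextTestRunner(stream=io.StringIO()).run(suite)
```

A skipped test still counts as a successful run, so the result object reports success both where the test executes and where it is skipped.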
msg381295 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-11-17 22:08
New changeset 1de61d3923840b29e847d311f0c7d4c5821d98e6 by Victor Stinner in branch 'master':
bpo-41625: Skip os.splice() tests on AIX (GH-23354)
https://github.com/python/cpython/commit/1de61d3923840b29e847d311f0c7d4c5821d98e6
msg381311 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-11-18 02:30
FYI I checked and AIX is fixed. All tests pass again on POWER6 AIX 3.x buildbot.
msg381327 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2020-11-18 11:54
Thanks a lot Victor
msg381888 - (view) Author: Michael Felt (Michael.Felt) * Date: 2020-11-26 09:00
This is still broken.

Since this was included in master, the AIX buildbots have been failing to compile (https://buildbot.python.org/all/#/builders/438/builds/391 and https://buildbot.python.org/all/#/builders/302/builds/377).

Strangely enough, the first bot continues to fail to compile at the same location, while the second bot (running in a different environment) started compiling and passing all tests with https://buildbot.python.org/all/#/builders/302/builds/406.

Note: bot 1 is using what I call (personal opinion) a mixed environment, with some libraries coming from OSS packages and some from IBM AIX. Bot 2 relies on IBM AIX libraries.

++++++
Note: a manual build with gcc on the same system as bot 1 gives the same error:

aixtools@gcc119:[/home/aixtools/cpython/cpython-master]make V=1
        gcc -pthread -c -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall  -O  -std=c99 -Wextra -Wno-unused-result -Wno-unused-parameter -Wno-missing-field-initializers -Werror=implicit-function-declaration -fvisibility=hidden  -I./Include/internal  -I. -I./Include    -DPy_BUILD_CORE  -DABIFLAGS='""'    -o Python/sysmodule.o ./Python/sysmodule.c
        gcc -pthread -c -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall  -O  -std=c99 -Wextra -Wno-unused-result -Wno-unused-parameter -Wno-missing-field-initializers -Werror=implicit-function-declaration -fvisibility=hidden  -I./Include/internal  -I. -I./Include    -DPy_BUILD_CORE -o Modules/config.o Modules/config.c
        gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall  -O  -std=c99 -Wextra -Wno-unused-result -Wno-unused-parameter -Wno-missing-field-initializers -Werror=implicit-function-declaration -fvisibility=hidden  -I./Include/internal  -I. -I./Include    -DPy_BUILD_CORE_BUILTIN  -DPy_BUILD_CORE_BUILTIN -I./Include/internal -c ./Modules/posixmodule.c -o Modules/posixmodule.o
./Modules/posixmodule.c: In function 'os_splice_impl':
./Modules/posixmodule.c:10429:15: error: implicit declaration of function 'splice'; did you mean 'plock'? [-Werror=implicit-function-declaration]
         ret = splice(src, p_offset_src, dst, p_offset_dst, count, flags);
               ^~~~~~
               plock
cc1: some warnings being treated as errors
make: 1254-004 The error code from the last command is 1.

* On the same system, using xlc-v13, the build completes normally.
msg381902 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-11-26 14:51
> ./Modules/posixmodule.c:10429:15: error: implicit declaration of function 'splice'; did you mean 'plock'? [-Werror=implicit-function-declaration]

Is it possible that posixmodule.c lacks an #include to get the function on AIX?

On AIX 7.1, man splice says:

       #include <sys/types.h>
       #include <sys/socket.h>
       int splice(socket1, socket2, flags)
       int socket1, socket2;
       int flags;

posixmodule.c doesn't include it on AIX:

#if defined(__FreeBSD__) || defined(__DragonFly__) || defined(__APPLE__)
#  ifdef HAVE_SYS_SOCKET_H
#    include <sys/socket.h>
#  endif
#endif


Michael: Would you mind trying to build the master branch of Python with the attached socket.patch (on the worker where Python no longer builds)?
msg382290 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2020-12-02 03:17
Started custom build of PR 23608 in https://buildbot.python.org/all/#/buildrequests/84365
msg382291 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2020-12-02 03:17
https://buildbot.python.org/all/#/builders/526/builds/3
msg382292 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2020-12-02 03:32
Seems that adding #include <sys/socket.h> does not work so I am going to skip adding this function on AIX. If someone is interested in fixing it, they can remove the #ifdef and figure out what's going on with that buildbot.
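Since the function may be absent on some platforms, portable callers can check for it at runtime and fall back to an ordinary copy. The sketch below shows one way to do that. `send_to_pipe` and `demo` are hypothetical helper names, and the fallback gives up the zero-copy benefit.

```python
import os
import tempfile


def send_to_pipe(src_fd, pipe_write_fd, count):
    """Move up to `count` bytes from src_fd into a pipe; return bytes moved."""
    if hasattr(os, "splice"):
        # Zero-copy path, where the os module provides splice().
        return os.splice(src_fd, pipe_write_fd, count)
    # Portable fallback: copy through a userspace buffer.
    data = memoryview(os.read(src_fd, count))
    written = 0
    while written < len(data):
        written += os.write(pipe_write_fd, data[written:])
    return written


def demo(payload):
    """Hypothetical helper: push `payload` from a temp file through a pipe."""
    read_fd, write_fd = os.pipe()
    try:
        with tempfile.TemporaryFile() as src:
            os.write(src.fileno(), payload)
            os.lseek(src.fileno(), 0, os.SEEK_SET)
            moved = send_to_pipe(src.fileno(), write_fd, len(payload))
        return os.read(read_fd, moved)
    finally:
        os.close(read_fd)
        os.close(write_fd)
```

Both paths advance the source file position. With a blocking pipe and no reader, `count` should stay below the pipe's capacity, otherwise the write side can block.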
msg382302 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-12-02 10:55
> Seems that adding #include <sys/socket.h> does not work so I am going to skip adding this function on AIX.

I'm fine with not implementing the function on AIX for now.
msg382331 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2020-12-02 17:57
New changeset dedc2cd5f02a2e2fc2c995a36a3dccf9d93856bb by Pablo Galindo in branch 'master':
bpo-41625: Do not add os.splice on AIX due to compatibility issues (GH-23608)
https://github.com/python/cpython/commit/dedc2cd5f02a2e2fc2c995a36a3dccf9d93856bb
msg382768 - (view) Author: Michael Felt (Michael.Felt) * Date: 2020-12-08 21:54
Sorry Victor - family matters - so I was not watching for several days.
History
Date User Action Args
2022-04-11 14:59:35 admin set github: 85791
2020-12-08 21:54:24 Michael.Felt set messages: + msg382768
2020-12-02 17:57:39 pablogsal set status: open -> closed; resolution: fixed; stage: patch review -> resolved
2020-12-02 17:57:22 pablogsal set messages: + msg382331
2020-12-02 10:55:01 vstinner set messages: + msg382302
2020-12-02 03:32:39 pablogsal set messages: + msg382292
2020-12-02 03:17:44 pablogsal set messages: + msg382291
2020-12-02 03:17:32 pablogsal set messages: + msg382290
2020-12-02 03:14:21 pablogsal set stage: resolved -> patch review; pull_requests: + pull_request22476
2020-11-26 14:51:13 vstinner set status: closed -> open; resolution: fixed -> (no value)
2020-11-26 14:51:06 vstinner set files: + socket.patch; messages: + msg381902
2020-11-26 09:00:31 Michael.Felt set nosy: + Michael.Felt; messages: + msg381888
2020-11-18 11:54:30 pablogsal set messages: + msg381327
2020-11-18 02:30:41 vstinner set messages: + msg381311
2020-11-17 22:08:18 vstinner set messages: + msg381295
2020-11-17 21:43:07 vstinner set messages: + msg381290
2020-11-17 21:42:44 vstinner set pull_requests: + pull_request22247
2020-11-17 20:28:57 pablogsal set status: open -> closed; resolution: fixed; stage: patch review -> resolved
2020-11-17 19:57:54 miss-islington set messages: + msg381282
2020-11-17 19:57:11 miss-islington set nosy: + miss-islington; messages: + msg381281
2020-11-17 18:21:48 pablogsal set pull_requests: + pull_request22245
2020-11-17 18:20:16 pablogsal set stage: resolved -> patch review; pull_requests: + pull_request22244
2020-11-17 18:13:57 pablogsal set messages: + msg381268
2020-11-17 17:41:04 vstinner set messages: + msg381261
2020-11-17 17:35:03 vstinner set status: closed -> open; resolution: fixed -> (no value); messages: + msg381259
2020-11-17 13:45:19 pablogsal set pull_requests: + pull_request22231
2020-11-17 13:44:56 pablogsal set messages: + msg381233
2020-11-17 12:23:05 vstinner set messages: + msg381228
2020-11-17 00:01:57 pablogsal set status: open -> closed; resolution: fixed; stage: patch review -> resolved
2020-11-17 00:00:45 pablogsal set messages: + msg381196
2020-10-13 21:59:23 pablogsal set messages: + msg378581
2020-08-25 16:24:49 shihai1991 set nosy: + shihai1991
2020-08-25 09:51:34 pablogsal set messages: + msg375878
2020-08-25 09:49:50 pablogsal set messages: + msg375877
2020-08-25 09:47:04 pablogsal set messages: + msg375876
2020-08-25 09:25:17 vstinner set messages: + msg375875
2020-08-25 09:21:45 vstinner set messages: + msg375873
2020-08-25 07:17:35 serhiy.storchaka set nosy: + serhiy.storchaka; messages: + msg375872
2020-08-25 06:03:29 corona10 set nosy: + corona10
2020-08-24 17:30:18 pablogsal set messages: + msg375857
2020-08-24 16:02:11 vstinner set nosy: + vstinner; messages: + msg375852
2020-08-24 16:01:14 pablogsal set keywords: + patch; stage: patch review; pull_requests: + pull_request21057
2020-08-24 15:57:28 pablogsal set assignee: pablogsal; components: + Library (Lib); versions: + Python 3.10
2020-08-24 15:57:18 pablogsal create