This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python Developer Guide.

classification
Title: Use copy_file_range() in shutil.copyfile() (server-side copy)
Type: performance Stage:
Components: Library (Lib) Versions: Python 3.9
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Albert.Zeyer, StyXman, desbma, facundobatista, giampaolo.rodola, martin.panter, ncoghlan, neologix, pablogsal, petr.viktorin, vstinner
Priority: normal Keywords: patch

Created on 2019-06-05 05:24 by giampaolo.rodola, last changed 2022-04-11 14:59 by admin.

Files
File name Uploaded Description
patch.diff giampaolo.rodola, 2019-06-05 05:24
Messages (6)
msg344671 Author: Giampaolo Rodola' (giampaolo.rodola) * (Python committer) Date: 2019-06-05 05:24
This is a follow-up to issue33639 (zero-copy via sendfile()) and issue26828 (os.copy_file_range()). On Linux >= 4.5 with glibc >= 2.27, shutil.copyfile() will use os.copy_file_range() instead of os.sendfile(). According to my benchmarks, performance is the same, but on NFS copy_file_range() is supposed to attempt a server-side copy: no data is exchanged between client and server, which should make the copy an order of magnitude faster.

Before proceeding, unit tests for big-file support should be added (issue37096). We didn't hit the 3.8 deadline, but I actually prefer to land this in 3.9 because I want to experiment with it a bit (copy_file_range() is quite new, and issue26828 is still a WIP).
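
For readers following along, here is a rough sketch of the kind of copy loop described above. It is not the attached patch.diff: the function name copy_with_copy_file_range, the chunk size, and the set of errno values that trigger the fallback are all assumptions made for illustration. It only relies on os.copy_file_range() (Python 3.8+, Linux) and shutil.copyfileobj().

```python
import errno
import os
import shutil

# Assumption: errno values treated as "copy_file_range() is not usable here".
_FALLBACK_ERRNOS = {errno.ENOSYS, errno.EXDEV, errno.EINVAL, errno.EOPNOTSUPP}


def copy_with_copy_file_range(src_path, dst_path, chunk_size=8 * 1024 * 1024):
    """Copy src_path to dst_path (hypothetical helper, not shutil's code).

    Returns True if os.copy_file_range() copied the data, False if the
    plain read/write fallback was used instead.
    """
    with open(src_path, "rb") as fsrc, open(dst_path, "wb") as fdst:
        if hasattr(os, "copy_file_range"):
            try:
                while True:
                    # Without explicit offsets the kernel uses and advances
                    # the current file positions of both descriptors.
                    copied = os.copy_file_range(
                        fsrc.fileno(), fdst.fileno(), chunk_size
                    )
                    if copied == 0:  # end of file reached
                        return True
            except OSError as exc:
                if exc.errno not in _FALLBACK_ERRNOS:
                    raise
                # Start over from scratch with the userspace copy.
                fsrc.seek(0)
                fdst.seek(0)
                fdst.truncate()
        shutil.copyfileobj(fsrc, fdst)
        return False
```

Whether the kernel then performs a server-side copy on NFS depends on the kernel, the NFS version, and the server; the sketch cannot detect that, it only reports whether the syscall path was taken.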
msg344679 Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-06-05 07:29
Oh, I already created https://bugs.python.org/issue37157

Can we move the discussion there?
msg344680 Author: Giampaolo Rodola' (giampaolo.rodola) * (Python committer) Date: 2019-06-05 07:32
issue37157 is for reflink / CoW copy, this one is not.
msg344691 Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-06-05 09:53
> issue37157 is for reflink / CoW copy, this one is not.

Oh sorry, it seems like I misunderstood copy_file_range(). So it doesn't use/support CoW?
msg344693 Author: Giampaolo Rodola' (giampaolo.rodola) * (Python committer) Date: 2019-06-05 10:04
Nope, it doesn't (see man page). We can simply use FICLONE (cp does the same).
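
As an illustration of the FICLONE route mentioned above, a reflink attempt from Python could look roughly like this. The helper name try_reflink is hypothetical. The numeric fallback 0x40049409 is the value of FICLONE (_IOW(0x94, 9, int)) on x86-64 and most other Linux architectures; treat it as an assumption, since a few architectures encode ioctl numbers differently and older Python versions do not expose the constant in the fcntl module.

```python
import os

try:
    import fcntl
except ImportError:  # platform without fcntl (e.g. Windows)
    fcntl = None

# Assumption: use the constant from the fcntl module when present, otherwise
# the common Linux value of FICLONE, _IOW(0x94, 9, int).
FICLONE = getattr(fcntl, "FICLONE", 0x40049409)


def try_reflink(src_path, dst_path):
    """Try to make dst_path a CoW clone of src_path (hypothetical helper).

    Returns True on success. Returns False if the platform or filesystem
    refuses the request; dst_path then exists but is empty, and the caller
    is expected to fall back to a regular copy.
    """
    if fcntl is None:
        return False
    with open(src_path, "rb") as fsrc, open(dst_path, "wb") as fdst:
        try:
            # The ioctl is issued on the destination fd and takes the
            # source fd as its integer argument.
            fcntl.ioctl(fdst.fileno(), FICLONE, fsrc.fileno())
        except OSError:
            return False
    return True
```

On filesystems without reflink support (ext4 or tmpfs, for example) the call is expected to fail and the helper returns False, so it is only useful as a first attempt before a normal copy.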
msg383996 Author: Albert Zeyer (Albert.Zeyer) * Date: 2020-12-29 13:05
According to the man page of copy_file_range (https://man7.org/linux/man-pages/man2/copy_file_range.2.html), copy_file_range should also support copy-on-write:

>       copy_file_range() gives filesystems an opportunity to implement
>       "copy acceleration" techniques, such as the use of reflinks
>       (i.e., two or more inodes that share pointers to the same copy-
>       on-write disk blocks) or server-side-copy (in the case of NFS).

Is this wrong?

However, while researching more about FICLONE vs copy_file_range, I found e.g. this: https://debbugs.gnu.org/cgi/bugreport.cgi?bug=24399

Which suggests that there are other problems with copy_file_range?
History
Date User Action Args
2022-04-11 14:59:16 admin set github: 81340
2020-12-29 13:05:56 Albert.Zeyer set nosy: + Albert.Zeyer; messages: + msg383996
2019-06-05 10:04:13 giampaolo.rodola set messages: + msg344693
2019-06-05 09:53:20 vstinner set messages: + msg344691
2019-06-05 07:32:06 giampaolo.rodola set messages: + msg344680
2019-06-05 07:29:41 vstinner set messages: + msg344679
2019-06-05 05:29:23 giampaolo.rodola set title: Have shutil.copyfile() use copy_file_range() -> Use copy_file_range() in shutil.copyfile() (server-side copy)
2019-06-05 05:26:32 giampaolo.rodola set nosy: + facundobatista, ncoghlan, vstinner, StyXman, petr.viktorin, neologix, martin.panter, desbma, pablogsal
2019-06-05 05:24:51 giampaolo.rodola create