Author StyXman
Recipients StyXman, christian.heimes, martin.panter, neologix, pitrou, vstinner
Date 2016-04-26.15:09:16
Message-id <1461683357.44.0.272275015741.issue26826@psf.upfronthosting.co.za>
In-reply-to
Content
Ok, I have a preliminary version of the patch. It has several parts:

* Add the functionality to the os module, with a docstring.
* Make shutil.copyfileobj() use it if available.
* Modify the docs (this has to be done by hand, right?).
* Modify NEWS and ACKS.
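
To make the second part concrete, here is a rough sketch of how a copy loop could prefer an in-kernel copy and fall back to the usual read/write loop. It is not the code from the patch: the name copyfileobj_sketch is made up for illustration, and it assumes the os.copy_file_range(src, dst, count) signature that later shipped in Python 3.8, which has no flags argument. It also ignores the case where the source file object has already buffered data ahead of its file descriptor's offset.

```python
import os


def copyfileobj_sketch(fsrc, fdst, length=16 * 1024):
    """Copy fsrc to fdst, preferring an in-kernel copy when possible.

    Hypothetical helper, not the patched shutil.copyfileobj().
    """
    # os.copy_file_range is absent on many platforms and older Pythons.
    copy_range = getattr(os, "copy_file_range", None)
    if copy_range is not None:
        try:
            src_fd, dst_fd = fsrc.fileno(), fdst.fileno()
            fdst.flush()  # push pending buffered writes out before using the fd
        except (AttributeError, OSError):
            copy_range = None  # not backed by a real file descriptor

    copied = 0
    while copy_range is not None:
        try:
            n = copy_range(src_fd, dst_fd, length)
        except OSError:
            if copied:
                raise  # partial copy already happened; do not restart
            break      # nothing copied yet: use the userspace loop instead
        if n == 0:
            return     # end of file reached
        copied += n

    # Classic userspace copy loop.
    while True:
        buf = fsrc.read(length)
        if not buf:
            break
        fdst.write(buf)
```

Falling back only when nothing has been copied yet keeps the destination from ending up with duplicated data if the syscall fails midway.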

Several points:

* For the time being, flags must be 0, so I was not sure whether to include the argument or not. Just in case, I included it.

* I'm not sure how to test for availability, so configure defines HAVE_COPY_FILE_RANGE.

* No tests yet.
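
On the availability question, a compile-time define only says that the C library exposes the call; the running kernel or filesystem may still refuse it. One possible complement is a runtime probe from Python, sketched below. The helper name copy_file_range_usable is hypothetical, and the sketch assumes the os.copy_file_range signature from Python 3.8+, not the one in this patch.

```python
import os
import tempfile


def copy_file_range_usable(directory=None):
    """Return True if a tiny in-kernel copy succeeds in `directory`.

    Hypothetical probe; `directory` defaults to the system temp dir.
    """
    if not hasattr(os, "copy_file_range"):
        return False  # not built in, or platform without the syscall
    with tempfile.TemporaryFile(dir=directory) as src, \
            tempfile.TemporaryFile(dir=directory) as dst:
        src.write(b"probe")
        src.flush()
        try:
            # Explicit offsets so the probe does not depend on fd positions.
            n = os.copy_file_range(src.fileno(), dst.fileno(), 5,
                                   offset_src=0, offset_dst=0)
        except OSError:
            return False  # e.g. ENOSYS, EXDEV, EINVAL, EPERM
        return n == 5
```

A probe like this only answers for the filesystem holding `directory`, so the result should not be cached across different mount points.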

Talking about tests, I tried copying a 325MiB file on an SSD with f2fs. Here are the times:

Old user space copy:

$ time ./python -m timeit -n 10 -s 'import shutil' 'a = open ("a.mp4", "rb"); b = open ("b.mp4", "wb+"); shutil.copyfileobj (a, b, 16*1024*1024)'
10 loops, best of 3: 259 msec per loop
real    0m7.915s
user    0m0.104s
sys     0m7.792s


New copy_file_range:

$ time ./python -m timeit -n 10 -s 'import shutil' 'a = open ("a.mp4", "rb"); b = open ("b.mp4", "wb+"); shutil.copyfileobj (a, b, 16*1024*1024)'
10 loops, best of 3: 193 msec per loop
real    0m5.926s
user    0m0.080s
sys     0m5.836s

Roughly a 25% improvement (193 vs. 259 msec per loop), but notice that the buffer size is 1024 times Python's default size (16MiB vs. 16KiB).
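
For reference, the relative gain can be worked out directly from the numbers quoted above. This is plain arithmetic on the two runs shown, nothing more; the variable and function names are only for illustration.

```python
# Timings copied from the two runs above.
old_ms, new_ms = 259.0, 193.0      # best-of-3 msec per loop
old_real, new_real = 7.915, 5.926  # wall-clock seconds reported by `time`


def relative_gain(before, after):
    """Fraction of the original time that was saved."""
    return (before - after) / before


per_loop_gain = relative_gain(old_ms, new_ms)
wall_gain = relative_gain(old_real, new_real)

# Buffer used in the benchmark vs. the 16KiB default mentioned above.
buffer_ratio = (16 * 1024 * 1024) // (16 * 1024)

print(f"per loop: {per_loop_gain:.1%}, wall clock: {wall_gain:.1%}, "
      f"buffer ratio: {buffer_ratio}x")
```

Both measures come out at about a quarter of the original time saved, for this one file, filesystem, and buffer size.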

One difference in semantics that I noticed: if the file is opened in text mode but its contents are binary, you get no UnicodeDecodeError, because the data never reaches userspace and so is never decoded.
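
A small illustration of that difference, which runs without the patch: reading through a text-mode file object decodes the bytes and fails, while anything that works on the underlying file descriptor never decodes at all. Here os.read() on the descriptor is only a stand-in for the in-kernel copy, and the payload is made up.

```python
import os
import tempfile

payload = b"\xff\xfe\x00\x80 not valid UTF-8"  # made-up binary content

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
    path = tmp.name

try:
    # Userspace path: the text layer decodes the bytes and raises.
    with open(path, "r", encoding="utf-8") as f:
        try:
            f.read()
            decode_failed = False
        except UnicodeDecodeError:
            decode_failed = True

    # Descriptor-level path: the bytes pass through without being decoded,
    # even though the file object was opened in text mode.
    with open(path, "r", encoding="utf-8") as f:
        raw = os.read(f.fileno(), len(payload))
finally:
    os.unlink(path)

print(decode_failed, raw == payload)
```

So a copy that goes through the descriptor silently succeeds where the old read/write loop would have raised.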

Let me know what you think.