>> The transfer won't be faster mainly because it's really I/O bound.
>> But it will use less CPU, only because you're making less syscalls.
> Have you actually measured this?

vanilla over Gb/s:
real    0m9.035s
user    0m0.523s
sys     0m1.412s

block-sendfile over Gb/s:
real    0m9.683s
user    0m0.253s
sys     0m1.212s

full-sendfile over Gb/s:
real    0m9.014s
user    0m0.059s
sys     0m1.000s

As you can see, the throughput doesn't vary (the differences in "real"
time are within run-to-run variance).
However, the CPU usage (user+sys) is lower for block-sendfile than for
the vanilla send loop, and lower still for full-sendfile.
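
To make the user+sys comparison explicit, here is a small sketch that just
sums the figures pasted above. The dictionary layout and the name
`cpu_totals` are mine, not part of any benchmark script.

```python
# (user, sys) seconds copied from the `time` output in this message.
timings = {
    "Gb/s": {
        "vanilla": (0.523, 1.412),
        "block-sendfile": (0.253, 1.212),
        "full-sendfile": (0.059, 1.000),
    },
    "loopback": {
        "vanilla": (0.541, 0.702),
        "block-sendfile": (0.248, 0.197),
        "full-sendfile": (0.055, 0.082),
    },
}


def cpu_totals(runs):
    """Return {variant: user + sys} for one set of runs."""
    return {name: round(user + sys_, 3) for name, (user, sys_) in runs.items()}


for link, runs in timings.items():
    for name, total in cpu_totals(runs).items():
        print(f"{link:9s} {name:15s} user+sys = {total:.3f}s")
```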

vanilla over loopback:
real    0m3.200s
user    0m0.541s
sys     0m0.702s

block-sendfile over loopback:
real    0m2.713s
user    0m0.248s
sys     0m0.197s

full-sendfile over loopback:
real    0m1.718s
user    0m0.055s
sys     0m0.082s

The same holds for loopback, except that here zero-copy also improves
throughput, because we're no longer I/O bound but CPU/memory bound
(and here sendfile on the complete file clearly outperforms
block-sendfile).
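
For reference, this is roughly what I mean by the three variants. It is a
minimal sketch, not the exact code that was timed: the function names and the
8192-byte block size are assumptions, and it expects a blocking socket (no
timeout) plus a regular file opened in binary mode on a platform where
os.sendfile() is available.

```python
import os

BLOCKSIZE = 8192  # assumed value; the block size used in the runs is not stated


def send_vanilla(sock, f, blocksize=BLOCKSIZE):
    """Classic read()/sendall() loop: every block is copied through user space."""
    total = 0
    while True:
        chunk = f.read(blocksize)
        if not chunk:
            return total
        sock.sendall(chunk)
        total += len(chunk)


def send_block_sendfile(sock, f, blocksize=BLOCKSIZE):
    """Zero-copy, but still one os.sendfile() call per block."""
    offset = 0
    while True:
        sent = os.sendfile(sock.fileno(), f.fileno(), offset, blocksize)
        if sent == 0:  # end of file reached
            return offset
        offset += sent


def send_full_sendfile(sock, f):
    """Ask the kernel for the whole file at once; loop only on short writes."""
    size = os.fstat(f.fileno()).st_size
    offset = 0
    while offset < size:
        sent = os.sendfile(sock.fileno(), f.fileno(), offset, size - offset)
        if sent == 0:  # file shrank underneath us
            break
        offset += sent
    return offset
```

Each function returns the number of bytes sent. The only difference between
the two sendfile variants is how many times we cross into the kernel.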

I don't have access to a 10Gb/s network, but basic math suggests that
sendfile could make a difference to the overall throughput there.
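
The back-of-the-envelope reasoning, spelled out as a sketch. It rests on two
assumptions that I have not verified: that the Gb/s runs were limited by the
link (so wire time would shrink by 10x on a 10Gb/s link), and that the
loopback "real" time is a fair proxy for the host-side cost once the network
is no longer the bottleneck. This is a projection, not a measurement.

```python
# "real" seconds copied from the timings in this message.
real_gbit = {"vanilla": 9.035, "block-sendfile": 9.683, "full-sendfile": 9.014}
real_loopback = {"vanilla": 3.200, "block-sendfile": 2.713, "full-sendfile": 1.718}

SPEEDUP = 10  # 10Gb/s link versus the 1Gb/s link that was measured


def projected_times(gbit, loopback, speedup=SPEEDUP):
    """Estimate transfer time as the slower of wire time and host-side time."""
    return {
        name: max(gbit[name] / speedup, loopback[name])
        for name in gbit
    }


for name, seconds in projected_times(real_gbit, real_loopback).items():
    wire = real_gbit[name] / SPEEDUP
    print(f"{name:15s} wire ~{wire:.2f}s, host ~{real_loopback[name]:.2f}s "
          f"-> projected ~{seconds:.2f}s")
```

Under those assumptions the wire time drops below the host-side time for all
three variants, so the transfer would be CPU/memory bound, which is where
sendfile helps.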
