Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

shutil.copyfileobj should internally use os.sendfile when possible #69249

Closed
desbma mannequin opened this issue Sep 10, 2015 · 7 comments
Closed

shutil.copyfileobj should internally use os.sendfile when possible #69249

desbma mannequin opened this issue Sep 10, 2015 · 7 comments
Labels
stdlib Python modules in the Lib dir type-feature A feature request or enhancement

Comments

@desbma
Copy link
Mannequin

desbma mannequin commented Sep 10, 2015

BPO 25063
Nosy @vstinner, @giampaolo, @bitdancer, @vadmium, @desbma

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

Show more details

GitHub fields:

assignee = None
closed_at = <Date 2015-09-14.12:02:59.888>
created_at = <Date 2015-09-10.18:38:50.069>
labels = ['type-feature', 'library']
title = 'shutil.copyfileobj should internally use os.sendfile when possible'
updated_at = <Date 2018-05-19.11:27:44.211>
user = 'https://github.com/desbma'

bugs.python.org fields:

activity = <Date 2018-05-19.11:27:44.211>
actor = 'giampaolo.rodola'
assignee = 'none'
closed = True
closed_date = <Date 2015-09-14.12:02:59.888>
closer = 'r.david.murray'
components = ['Library (Lib)']
creation = <Date 2015-09-10.18:38:50.069>
creator = 'desbma'
dependencies = []
files = []
hgrepos = []
issue_num = 25063
keywords = []
message_count = 7.0
messages = ['250401', '250406', '250426', '250503', '250578', '250586', '250657']
nosy_count = 6.0
nosy_names = ['vstinner', 'giampaolo.rodola', 'r.david.murray', 'python-dev', 'martin.panter', 'desbma']
pr_nums = []
priority = 'normal'
resolution = 'rejected'
stage = 'resolved'
status = 'closed'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue25063'
versions = []

@desbma
Copy link
Mannequin Author

desbma mannequin commented Sep 10, 2015

Sorry if it has already been discussed, or if this is a stupid idea.

By looking at the signature, my thought was that the only use of shutil.copyfileobj was to wrap the use of sendfile, and use a fallback if it is not available on the system or not usable with "fake" Python files (not having a file descriptor, eg. like gzip.GzipFile).

By looking at the implementation, I was surprised that it does not try to call os.sendfile.

@desbma desbma mannequin added the stdlib Python modules in the Lib dir label Sep 10, 2015
@bitdancer
Copy link
Member

No, it's to copy file like objects that aren't necessarily true file system objects. The only requirement is that they support a minimal file-like object interface. There is no suggestion or requirement that the object supports file descriptors (which would make it a real os file object).

I think changing it to use os.sendfile if possible would be too risky in terms of causing unexpected breakage (objects that look like they have a file descriptor but really don't), especially since handling file system files is explicitly not the purpose of copyfileobj.

@vadmium
Copy link
Member

vadmium commented Sep 10, 2015

I agree it is a nice idea, but have the same concern as David. Many pseudo file objects wrap real file descriptors. E.g. BufferedReader hides unread data in a buffer, GzipFile decompresses data, HTTPResponse decodes chunks. Detecting when sendfile() is appropriate would be hard to do. And I understand some platforms only support sockets.

Also I notice that 3.5 has a new socket.sendfile() method (bpo-17552), which partly overlaps with this proposal. BTW the 3.4 os documentation should not mention the socket method, but it does!

@vadmium vadmium added the type-feature A feature request or enhancement label Sep 10, 2015
@python-dev
Copy link
Mannequin

python-dev mannequin commented Sep 11, 2015

New changeset 3d9cbbad8a04 by Martin Panter <vadmium> in branch '3.4':
Issue bpo-25063: socket.sendfile() does not exist in 3.4
https://hg.python.org/cpython/rev/3d9cbbad8a04

@desbma
Copy link
Mannequin Author

desbma mannequin commented Sep 13, 2015

If I understand you correctly, a naive implementation like this:

use_chunk_copy = False
try:
  fdsrc = fsrc.fileno()
  fddst = fdst.fileno()
except OSError:
  # fallback on current chunk based Python copy
  use_chunk_copy = True
else:
  # we have 2 fd, try sendfile
  try:
    os.sendfile(fdsrc, fddst)
  except AttributeError:
    # sendfile not available
    use_chunk_copy = True
if use_chunk_copy:
  # use current Python based chunk copy

...would not work because some file like objects expose fileno(), even though it does not accurately represent the file because they wrap it to add special behavior (like GzipFile).

I don't understand, if they alter the file behavior, why do they expose fileno()?

@vadmium
Copy link
Member

vadmium commented Sep 13, 2015

I don’t know the exact reasoning, but fileno() could be useful for other purposes, e.g. passing to stat() to find the last modification time, passing to select() to see if data is available.

@bitdancer
Copy link
Member

I'm going to reject this. If you think there is enough value in this to warrant the complications that would be required to ensure backward compatibility (assuming that is even possible), you could bring it up again on python-ideas.

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
stdlib Python modules in the Lib dir type-feature A feature request or enhancement
Projects
None yet
Development

No branches or pull requests

2 participants