
classification
Title: tarfile unnecessarily requires seekable files
Type: behavior
Stage:
Components: Library (Lib)
Versions: Python 2.6

process
Status: closed
Resolution:
Dependencies:
Superseder:
Assigned To: lars.gustaebel
Nosy List: johnsonm, lars.gustaebel
Priority: normal
Keywords:

Created on 2009-06-10 15:46 by johnsonm, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Messages (6)
msg89206 - Author: Michael K Johnson (johnsonm) Date: 2009-06-10 15:46
In Python 2.6 (but not 2.4; I haven't checked 2.5), the __init__() method
of the TarFile class calls the tell() method on the tar file, which doesn't
work if you are reading from standard input or writing to standard
output, two very reasonable things to do with a tar file.

While there are cases where it is logical to seek within a tar file,
supporting those cases should not preclude the normal design case for
tar archives of streaming reads/writes, including tar files being
streamed between processes via pipes.  If the tell() method is not
implemented for the file object, then the seek() method of TarFile (and
any other methods that can be implemented only for seekable files) can
raise a reasonable exception.  Note that this also means that the next()
method should not need to seek() for non-seekable files; it should
assume that it is at the correct block and read from there.
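
For illustration, the failure described above can be reproduced without tarfile at all. The following is a minimal sketch, written for Python 3 rather than the 2.6 release the report is about; supports_tell() and pipe_supports_tell() are hypothetical helper names, not part of any library. It assumes a POSIX-like platform where asking a pipe for its position fails with an OSError.

```python
import io
import os


def supports_tell(fileobj):
    """Return True if fileobj.tell() works, False if the stream has no position."""
    try:
        fileobj.tell()
    except (OSError, AttributeError):
        # io.UnsupportedOperation is a subclass of OSError, so it is covered too.
        return False
    return True


def pipe_supports_tell():
    """Check both ends of an OS pipe, the same kind of object as stdin/stdout in a pipeline."""
    read_fd, write_fd = os.pipe()
    with os.fdopen(read_fd, "rb") as reader, os.fdopen(write_fd, "wb") as writer:
        return supports_tell(reader), supports_tell(writer)


if __name__ == "__main__":
    print("BytesIO:", supports_tell(io.BytesIO()))
    print("pipe (read end, write end):", pipe_supports_tell())
```

Any code path that calls tell() unconditionally on such an object fails in the same way, which is the situation the report describes for the TarFile constructor.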
msg89224 - Author: Lars Gustäbel (lars.gustaebel) * (Python committer) Date: 2009-06-10 19:26
If I am not mistaken, the functionality you are looking for is the
streaming mode of tarfile.open():

tar = tarfile.open(fileobj=sys.stdin, mode="r|*")

Does this solve your problem?
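
A minimal sketch of the streaming read mode mentioned above, written for Python 3 (where the binary standard input is sys.stdin.buffer rather than sys.stdin). PipeLikeReader, build_archive() and read_stream() are hypothetical names used only for this illustration; PipeLikeReader stands in for a pipe by offering read() and nothing else.

```python
import io
import tarfile


class PipeLikeReader:
    """Hypothetical stand-in for a pipe: it offers read() but no seek() or tell()."""

    def __init__(self, data):
        self._buf = io.BytesIO(data)

    def read(self, size=-1):
        return self._buf.read(size)


def build_archive(members):
    """Build an uncompressed tar archive in memory from a {name: bytes} mapping."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, payload in members.items():
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def read_stream(fileobj):
    """Read every member from a non-seekable stream using mode "r|*"."""
    contents = {}
    with tarfile.open(fileobj=fileobj, mode="r|*") as tar:
        for member in tar:
            # In stream mode each member must be read before moving to the next.
            extracted = tar.extractfile(member)
            contents[member.name] = extracted.read()
    return contents


if __name__ == "__main__":
    archive = build_archive({"a.txt": b"hello", "b.txt": b"world"})
    print(read_stream(PipeLikeReader(archive)))
```

The members are consumed strictly in archive order, which is the trade-off of stream mode: no random access, but no seeking on the underlying file object either.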
msg89226 - Author: Michael K Johnson (johnsonm) Date: 2009-06-10 20:05
We are doing output, and mode='w|' works.  We were using
tarfile.TarFile, not realizing that the default constructor was an
unsupported and deprecated interface (!?!)
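
For the output case, a minimal sketch of writing with mode "w|" to an object that only supports write(), again for Python 3. PipeLikeWriter and write_stream() are hypothetical names for this illustration; in a real pipeline the file object would be something like sys.stdout.buffer.

```python
import io
import tarfile


class PipeLikeWriter:
    """Hypothetical stand-in for a pipe or stdout: write() only, no seek() or tell()."""

    def __init__(self):
        self.chunks = []

    def write(self, data):
        self.chunks.append(bytes(data))
        return len(data)

    def flush(self):
        # Nothing to flush for an in-memory list of chunks.
        pass


def write_stream(fileobj, members):
    """Write a {name: bytes} mapping as an uncompressed tar stream using mode "w|"."""
    with tarfile.open(fileobj=fileobj, mode="w|") as tar:
        for name, payload in members.items():
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))


if __name__ == "__main__":
    writer = PipeLikeWriter()
    write_stream(writer, {"a.txt": b"hello"})
    written = b"".join(writer.chunks)
    # Read the result back from memory to confirm it is a valid archive.
    with tarfile.open(fileobj=io.BytesIO(written), mode="r") as tar:
        print(tar.getnames())
```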
msg89241 - Author: Lars Gustäbel (lars.gustaebel) * (Python committer) Date: 2009-06-11 08:54
tarfile.TarFile is neither unsupported nor deprecated. It is just too
low-level for everyday use.
msg89253 - Author: Michael K Johnson (johnsonm) Date: 2009-06-11 15:10
OK, not intended for "everyday use"; I understand this as meaning that
it is considered primarily an internal interface, and thus one that has
an explicitly unstable API.  It is hard for me to guess that this would
be the case, since this intent is not documented in the docstrings or
comments of either TarFile.__init__() or TarFile.open().

If I'm understanding you correctly, this could be considered a
documentation bug; perhaps the docstring for TarFile.__init__() could
suggest using the open() method, except possibly within TarFile subclasses?

Sorry to be so confused here.  I hope I'm finally converging on
understanding...

Anyway, thanks for the help!
msg89275 - Author: Lars Gustäbel (lars.gustaebel) * (Python committer) Date: 2009-06-12 12:35
It is not a documentation bug either: tarfile.open() is featured
prominently at the top of the first page of the tarfile module's online
documentation. tarfile.TarFile() follows right after it, with a short
note that tarfile.open() should be used instead.
History
Date                 User            Action  Args
2022-04-11 14:56:49  admin           set     github: 50503
2009-06-12 12:35:51  lars.gustaebel  set     messages: + msg89275
2009-06-11 15:10:11  johnsonm        set     messages: + msg89253
2009-06-11 08:54:04  lars.gustaebel  set     messages: + msg89241
2009-06-10 20:05:12  johnsonm        set     status: open -> closed; messages: + msg89226
2009-06-10 19:26:48  lars.gustaebel  set     assignee: lars.gustaebel; messages: + msg89224; nosy: + lars.gustaebel
2009-06-10 15:46:51  johnsonm        create