classification
Title: tarfile.ExFileObject can't be wrapped using io.TextIOWrapper
Type: enhancement Stage: resolved
Components: Library (Lib) Versions: Python 3.3
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: lars.gustaebel Nosy List: amaury.forgeotdarc, barry, cjwatson, eric.araujo, lars.gustaebel, pitrou, python-dev, r.david.murray, terry.reedy
Priority: normal Keywords: patch

Created on 2012-01-18 13:16 by cjwatson, last changed 2012-05-14 11:22 by lars.gustaebel. This issue is now closed.

Files
File name Uploaded Description Edit
tarfile-exfileobject-flush.patch cjwatson, 2012-01-18 13:16 Add flush method to tarfile.ExFileObject
tarfile-exfileobj.diff lars.gustaebel, 2012-05-10 09:44 review
Messages (18)
msg151536 - (view) Author: Colin Watson (cjwatson) Date: 2012-01-18 13:16
The file-like object returned by TarFile.extractfile can't be wrapped in an io.TextIOWrapper (which would be rather convenient in some cases to get something that reads str rather than bytes).

The attached patch demonstrates the problem by way of a test case, and fixes it.  It's just a matter of adding a no-op flush method so that TextIOWrapper.close is happy with it.
msg151719 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2012-01-21 04:20
Based on other examples in the doc, I think the note
"... and also supports iteration over its lines."
should be extended with 
" It also has a dummy `flush` method, so that it can be wrapped using :class:`io.TextIOWrapper`."

Then just add
".. versionchanged:: 3.3
      Added the `flush` method."

I leave the test to Lars.
msg152517 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2012-02-03 14:21
Please always use explicit roles in reST, i.e. :meth:`flush` instead of `flush` (use ``flush`` when you don’t want a ton of identical links).

In the test, using assertEqual instead of assertTrue will certainly give more useful output in case of failure.
msg159999 - (view) Author: Roundup Robot (python-dev) Date: 2012-05-05 16:16
New changeset 254cb4f5d0ff by Lars Gustäbel in branch 'default':
Issue #13815: TarFile.extractfile() now returns io.BufferedReader objects.
http://hg.python.org/cpython/rev/254cb4f5d0ff
msg160002 - (view) Author: Lars Gustäbel (lars.gustaebel) * (Python committer) Date: 2012-05-05 16:29
I did some tarfile spring cleaning: I removed the ExFileObject class completely as it was more or less a leftover from the old days. io.BufferedReader now does the job. So, as a side-effect, I close this issue as fixed.

(BTW, this makes tarfile.py smaller by about 100 lines.)
msg160217 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2012-05-08 18:16
I think it would have been better to keep the ExFileObject class, and base it on io.BufferedReader:

class ExFileObject(io.BufferedReader):
    def __init__(self, tarfile, tarinfo):
        raw = _FileInFile(tarfile.fileobj,
                          tarinfo.offset_data,
                          tarinfo.size,
                          tarinfo.sparse)
        io.BufferedReader.__init__(self, raw)

The result is the same of course, but there is no need to special-case the pre-3.3 API.
In addition, _FileInFile could probably inherit from io.RawIOBase.
msg160234 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2012-05-08 23:48
Indeed, even though it is not a documented API, our backward compatibility policy pretty much requires that something named ExFileObject still exist, just in case.  And in this case it probably should still be the thing returned.
msg160245 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-05-09 06:22
Well, if it's not documented, it's technically a private API.

Also, there doesn't seem to be any explicit use of ExFileObject outside of tarfile.py:
http://code.google.com/codesearch#search&q=lang:python+exfileobject
msg160276 - (view) Author: Lars Gustäbel (lars.gustaebel) * (Python committer) Date: 2012-05-09 11:44
In an earlier draft of my patch, I had kept ExFileObject as a subclass of BufferedReader, but I later decided against it. To use BufferedReader directly is in my opinion the cleaner solution.

I admit that the change is not fully backward compatible. But a user can still write code that works for both 3.3 and the versions before. If he didn't subclass ExFileObject his code doesn't even need a change. If he subclassed ExFileObject, he might have a problem in either case: either the ExFileObject class is missing, or he may be unable to use it the way he did before, because all that's left of it is a stub subclass of BufferedReader.

I am well aware that backward compatibility is most important, but I think it must still be allowed to change internal (and undocumented) APIs every now and then to clean things up a little.
And of course, I did a code search before too, and found no code using ExFileObject. This actually doesn't surprise me, as there is really not much you can do with it.
msg160287 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2012-05-09 13:33
Yeah, I know it is technically private.  We still tend to keep names around unless there's a good reason to delete them (like using them leads to broken code anyway).  The code search is some evidence this deletion would be OK, but why *not* follow Amaury's suggestion?

I'm OK if you reclose this, but I unfortunately I don't think simple cleanliness is a good argument (even though I would like it to be).  The other arguments are better :)
msg160289 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-05-09 13:41
> Yeah, I know it is technically private.  We still tend to keep names
> around unless there's a good reason to delete them (like using them
> leads to broken code anyway).  The code search is some evidence this
> deletion would be OK, but why *not* follow Amaury's suggestion?

I don't see the point of maintaining a private API that's proven to be
unused :) It's an unwarranted maintenance burden (though admittedly a
light one here).
msg160294 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2012-05-09 14:18
Code search is not proof, I'm afraid.  It is evidence, though, and I thought I indicated I thought it was a good argument in favor of dropping the class.
msg160296 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-05-09 14:40
> Code search is not proof, I'm afraid.  It is evidence, though, and I
> thought I indicated I thought it was a good argument in favor of
> dropping the class.

Yes, sorry for the vocabulary mismatch :-)
msg160298 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2012-05-09 15:34
I came here when I saw this comment in the diff: "# Keep the traditional pre-3.3 API intact".  Why keep an internal API intact if we do it partially?

The ExFileObject class above will also simplify the code: simply "return self.fileobject(self, tarinfo)" in all cases.
msg160331 - (view) Author: Lars Gustäbel (lars.gustaebel) * (Python committer) Date: 2012-05-10 09:44
Okay, I attached a patch that I hope we can all agree upon. It restores the ExFileObject class as a small subclass of BufferedReader as Amaury suggested. Does the documentation have to be changed, too? It states that an io.BufferedReader object is returned by extractfile() not a subclass thereof.
msg160340 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2012-05-10 12:46
I don't think a doc change is needed.  An isinstance check based on the docs will succeed, and the rest is an implementation detail, I think.
msg160604 - (view) Author: Roundup Robot (python-dev) Date: 2012-05-14 11:20
New changeset ab6496b98ac4 by Lars Gustäbel in branch 'default':
Issue #13815: Resurrect the ExFileObject class.
http://hg.python.org/cpython/rev/ab6496b98ac4
msg160605 - (view) Author: Lars Gustäbel (lars.gustaebel) * (Python committer) Date: 2012-05-14 11:22
Okay, I close this issue now, as I think the problems are now resolved.
History
Date User Action Args
2012-05-14 11:22:44lars.gustaebelsetstatus: open -> closed

messages: + msg160605
2012-05-14 11:20:16python-devsetmessages: + msg160604
2012-05-10 12:46:39r.david.murraysetmessages: + msg160340
2012-05-10 09:44:19lars.gustaebelsetfiles: + tarfile-exfileobj.diff

messages: + msg160331
2012-05-09 15:34:51amaury.forgeotdarcsetmessages: + msg160298
2012-05-09 14:40:07pitrousetmessages: + msg160296
2012-05-09 14:18:17r.david.murraysetmessages: + msg160294
2012-05-09 13:41:46pitrousetmessages: + msg160289
2012-05-09 13:33:35r.david.murraysetmessages: + msg160287
2012-05-09 11:44:30lars.gustaebelsetmessages: + msg160276
2012-05-09 06:22:07pitrousetnosy: + pitrou
messages: + msg160245
2012-05-08 23:48:59r.david.murraysetstatus: closed -> open
nosy: + r.david.murray
messages: + msg160234

2012-05-08 18:16:40amaury.forgeotdarcsetnosy: + amaury.forgeotdarc
messages: + msg160217
2012-05-05 16:29:43lars.gustaebelsetstatus: open -> closed
resolution: fixed
messages: + msg160002

stage: patch review -> resolved
2012-05-05 16:16:19python-devsetnosy: + python-dev
messages: + msg159999
2012-02-03 14:21:37eric.araujosetnosy: + eric.araujo
messages: + msg152517
2012-01-21 04:20:33terry.reedysetnosy: + terry.reedy
messages: + msg151719

type: enhancement
stage: patch review
2012-01-18 20:20:11lars.gustaebelsetassignee: lars.gustaebel

nosy: + lars.gustaebel
2012-01-18 15:49:48barrysetnosy: + barry
2012-01-18 13:16:16cjwatsoncreate