Title: [doc] Clarify bytes vs text with non-seeking tarfile stream
Components: Documentation, Library (Lib) Versions: Python 3.11, Python 3.10, Python 3.9
Nosy List: chaica_, docs@python, martin.panter, slateny
Created on 2015-02-12 11:19 by chaica_, last changed 2022-04-11 14:58 by admin.

Messages (3)
msg235808 - (view) Author: Carl Chenet (chaica_) Date: 2015-02-12 11:19
I'm trying to feed a tar stream to a Python tarfile object, but each time I get a "TypeError: can't concat bytes to str" error.

Here is my test:

import tarfile
import sys

tarobj = tarfile.open(mode='r|', fileobj=sys.stdin)

$ tar cvf test.tar.gz tests/
$ tar -O -xvf test.tar | ./
Traceback (most recent call last):
  File "./", line 6, in <module>
    tarobj = tarfile.open(mode='r|', fileobj=sys.stdin)
  File "/usr/lib/python3.4/", line 1578, in open
    t = cls(name, filemode, stream, **kwargs)
  File "/usr/lib/python3.4/", line 1470, in __init__
    self.firstmember =
  File "/usr/lib/python3.4/", line 2249, in next
    tarinfo = self.tarinfo.fromtarfile(self)
  File "/usr/lib/python3.4/", line 1082, in fromtarfile
    buf =
  File "/usr/lib/python3.4/", line 535, in read
    buf = self._read(size)
  File "/usr/lib/python3.4/", line 543, in _read
    return self.__read(size)
  File "/usr/lib/python3.4/", line 569, in __read
    self.buf += buf
TypeError: can't concat bytes to str

Carl Chenet
msg235810 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-02-12 11:30
Using fileobj=sys.stdin.buffer instead should do the trick. The “tarfile” module would expect a binary stream, not a text stream.
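
As a rough illustration of the suggestion above, the following sketch reads a tar archive in non-seeking stream mode from any binary stream. The names list_members, make_sample_archive and main are hypothetical helpers introduced only for this example; the archive is assumed to be uncompressed.

```python
import io
import sys
import tarfile


def list_members(stream):
    """Return member names from an uncompressed tar read as a non-seeking stream."""
    # mode="r|" reads sequentially through stream.read(), which must return bytes.
    with tarfile.open(mode="r|", fileobj=stream) as tar:
        return [member.name for member in tar]


def make_sample_archive():
    """Build a tiny tar archive in memory (hypothetical helper for the demo)."""
    buf = io.BytesIO()
    with tarfile.open(mode="w", fileobj=buf) as tar:
        payload = b"hi"
        info = tarfile.TarInfo("hello.txt")
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def main():
    # sys.stdin is a text stream; sys.stdin.buffer is its underlying binary stream.
    for name in list_members(sys.stdin.buffer):
        print(name)
```

Calling main() in a script lets it sit at the end of a shell pipeline that writes a tar archive to standard output, while list_members(io.BytesIO(make_sample_archive())) exercises the same path without a pipe.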

Given the documentation currently says, “Use this variant in combination with e.g. sys.stdin, . . .”, I presume that is why you were using plain stdin. The documentation should be clarified.
msg264459 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-04-29 04:28
Looks like the _Stream docstring needs a similar fix regarding stdin and stdout. Also, it wouldn’t hurt to specify that the read() and write() methods should work with bytes, not text.
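
To make the bytes requirement concrete, here is a minimal sketch of file-like objects that expose only write() or only read(), both working with bytes, used with the stream modes "w|" and "r|". ByteSink, ByteSource and round_trip are hypothetical names invented for this example, not part of tarfile.

```python
import io
import tarfile


class ByteSink:
    """Hypothetical write-only target: exposes nothing but write(bytes)."""

    def __init__(self):
        self.chunks = []

    def write(self, data):
        # tarfile hands over bytes; a text stream's write() would reject them.
        self.chunks.append(bytes(data))
        return len(data)


class ByteSource:
    """Hypothetical read-only source: exposes nothing but read(size) -> bytes."""

    def __init__(self, data):
        self._data = data
        self._pos = 0

    def read(self, size):
        # Return at most `size` bytes, and b"" once the data is exhausted.
        chunk = self._data[self._pos:self._pos + size]
        self._pos += len(chunk)
        return chunk


def round_trip(name, payload):
    """Write one member through ByteSink, then read it back through ByteSource."""
    sink = ByteSink()
    with tarfile.open(mode="w|", fileobj=sink) as tar:
        info = tarfile.TarInfo(name)
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    source = ByteSource(b"".join(sink.chunks))
    with tarfile.open(mode="r|", fileobj=source) as tar:
        # In stream mode, each member must be read in order as it is yielded.
        return [(member.name, tar.extractfile(member).read()) for member in tar]
```

For instance, round_trip("hello.txt", b"hi") should give back the same name and payload, which suggests that no seek() or tell() is needed on the underlying object in these modes.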
