classification
Title: [doc] Clarify bytes vs text with non-seeking tarfile stream
Type: behavior Stage: needs patch
Components: Documentation, Library (Lib) Versions: Python 3.11, Python 3.10, Python 3.9
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: docs@python Nosy List: chaica_, docs@python, martin.panter
Priority: normal Keywords: easy

Created on 2015-02-12 11:19 by chaica_, last changed 2021-12-01 10:50 by iritkatriel.

Messages (3)
msg235808 - (view) Author: Carl Chenet (chaica_) Date: 2015-02-12 11:19
I'm trying to pass a tar stream to a Python tarfile object, but each time I get a "TypeError: can't concat bytes to str" error.

Here is my test:
-----8<-----
#!/usr/bin/python3.4

import tarfile
import sys

tarobj = tarfile.open(mode='r|', fileobj=sys.stdin)
print(tarobj)
tarobj.close()
-----8<-----


$ tar cvf test.tar.gz tests/
tests/
tests/foo1
tests/foo/
tests/foo/bar
$ tar -O -xvf test.tar | ./tarstream.py
tests/
tests/foo1
tests/foo/
tests/foo/bar
Traceback (most recent call last):
  File "./tarstream.py", line 6, in <module>
    tarobj = tarfile.open(mode='r|', fileobj=sys.stdin)
  File "/usr/lib/python3.4/tarfile.py", line 1578, in open
    t = cls(name, filemode, stream, **kwargs)
  File "/usr/lib/python3.4/tarfile.py", line 1470, in __init__
    self.firstmember = self.next()
  File "/usr/lib/python3.4/tarfile.py", line 2249, in next
    tarinfo = self.tarinfo.fromtarfile(self)
  File "/usr/lib/python3.4/tarfile.py", line 1082, in fromtarfile
    buf = tarfile.fileobj.read(BLOCKSIZE)
  File "/usr/lib/python3.4/tarfile.py", line 535, in read
    buf = self._read(size)
  File "/usr/lib/python3.4/tarfile.py", line 543, in _read
    return self.__read(size)
  File "/usr/lib/python3.4/tarfile.py", line 569, in __read
    self.buf += buf
TypeError: can't concat bytes to str

Regards,
Carl Chenet
msg235810 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-02-12 11:30
Using fileobj=sys.stdin.buffer instead should do the trick. The “tarfile” module would expect a binary stream, not a text stream.

Given the documentation currently says, “Use this variant in combination with e.g. sys.stdin, . . .”, I presume that is why you were using plain stdin. The documentation should be clarified.
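
A minimal sketch of the suggested fix, assuming the only change needed is to hand tarfile a binary stream. The helper names list_member_names and make_demo_archive are hypothetical and the archive is built in memory so the example runs without a pipe; in the reporter's script the stream would be sys.stdin.buffer.

```python
import io
import tarfile


def list_member_names(binary_stream):
    """Return member names from a non-seekable tar stream of bytes."""
    # mode='r|' reads the stream sequentially and never seeks, so
    # binary_stream.read() must return bytes, not str.
    with tarfile.open(mode='r|', fileobj=binary_stream) as tarobj:
        return [member.name for member in tarobj]


def make_demo_archive():
    """Build a tiny uncompressed tar archive in memory (demo data only)."""
    raw = io.BytesIO()
    with tarfile.open(mode='w', fileobj=raw) as tarobj:
        payload = b'hello\n'
        info = tarfile.TarInfo(name='tests/foo1')
        info.size = len(payload)
        tarobj.addfile(info, io.BytesIO(payload))
    raw.seek(0)
    return raw


# In a script fed by a pipe, the call would instead be:
#     list_member_names(sys.stdin.buffer)
print(list_member_names(make_demo_archive()))
```

Note that the input must be an actual tar archive, for example the output of "tar cf - tests/", rather than extracted file contents.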
msg264459 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-04-29 04:28
Looks like the _Stream docstring needs a similar fix regarding stdin and stdout. Also, it wouldn’t hurt to specify that the read() and write() methods should work with bytes, not text.
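
To illustrate the bytes requirement on read() and write(), here is a rough sketch with two hypothetical classes, BytesSink and BytesSource, which are not part of tarfile. It assumes that the stream modes 'w|' and 'r|' need nothing from the file object beyond write() and read() respectively, which matches the behavior discussed here but is not spelled out in the documentation quoted above.

```python
import io
import tarfile


class BytesSink:
    """Hypothetical write-only object for mode='w|'."""

    def __init__(self):
        self.chunks = []

    def write(self, data):
        # tarfile passes bytes here; a text stream would reject them.
        self.chunks.append(bytes(data))
        return len(data)


class BytesSource:
    """Hypothetical read-only object for mode='r|'."""

    def __init__(self, data):
        self._buffer = io.BytesIO(data)

    def read(self, size=-1):
        # Must return bytes; returning str is what triggers the TypeError.
        return self._buffer.read(size)


# Write a one-member archive through the sink.
sink = BytesSink()
with tarfile.open(mode='w|', fileobj=sink) as tarobj:
    payload = b'bar\n'
    info = tarfile.TarInfo(name='tests/foo/bar')
    info.size = len(payload)
    tarobj.addfile(info, io.BytesIO(payload))

# Read the same bytes back through the source.
source = BytesSource(b''.join(sink.chunks))
with tarfile.open(mode='r|', fileobj=source) as tarobj:
    names = [member.name for member in tarobj]
print(names)
```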
History
Date                 User           Action  Args
2021-12-01 10:50:26  iritkatriel    set     keywords: + easy
                                            title: Clarify bytes vs text with non-seeking tarfile stream -> [doc] Clarify bytes vs text with non-seeking tarfile stream
                                            versions: + Python 3.9, Python 3.10, Python 3.11, - Python 3.5, Python 3.6
2016-04-29 04:28:32  martin.panter  set     title: Opening a stream with tarfile.open() triggers a TypeError: can't concat bytes to str error -> Clarify bytes vs text with non-seeking tarfile stream
                                            messages: + msg264459
                                            versions: + Python 3.5, Python 3.6, - Python 3.4
2016-02-09 23:07:22  martin.panter  set     type: crash -> behavior
                                            stage: needs patch
2015-02-12 11:30:06  martin.panter  set     nosy: + docs@python, martin.panter
                                            messages: + msg235810
                                            assignee: docs@python
                                            components: + Documentation
2015-02-12 11:19:58  chaica_        create