classification
Title: tarfile.open fails with UnicodeDecodeError if parent dir is unicode
Type:
Stage:
Components:
Versions: Python 2.6
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: parthm, vstinner
Priority: normal
Keywords:

Created on 2010-04-14 09:50 by parthm, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description
tarfilefail.py parthm, 2010-04-14 09:50
Messages (3)
msg103117 - Author: Parth Malwankar (parthm) Date: 2010-04-14 09:50
If any directory higher up in the hierarchy contains unicode chars, tarfile.open fails with UnicodeDecodeError. Attached script reproduces this error.

[tmp]% python tarfilefail.py 
Traceback (most recent call last):
  File "tarfilefail.py", line 9, in <module>
    tarfile.open(file_name, 'w')
  File "/usr/lib/python2.6/tarfile.py", line 1682, in open
    return cls.taropen(name, mode, fileobj, **kwargs)
  File "/usr/lib/python2.6/tarfile.py", line 1692, in taropen
    return cls(name, mode, fileobj, **kwargs)
  File "/usr/lib/python2.6/tarfile.py", line 1527, in __init__
    self.name = os.path.abspath(name) if name else None
  File "/usr/lib/python2.6/posixpath.py", line 338, in abspath
    path = join(os.getcwd(), path)
  File "/usr/lib/python2.6/posixpath.py", line 70, in join
    path += '/' + b
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 17: ordinal not in range(128)
[tmp]%
msg103118 - Author: STINNER Victor (vstinner) (Python committer) Date: 2010-04-14 09:54
It looks like tarfile doesn't support unicode filenames. You should try to encode your input filename to the file system default encoding (sys.getfilesystemencoding()), or avoid using unicode for tar filenames.
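
A minimal sketch of the workaround suggested above, assuming the tar file name arrives as a unicode string. The helper name to_fs_bytes and the sample file name are hypothetical, not part of tarfile; the only library call used is sys.getfilesystemencoding().

```python
import sys

def to_fs_bytes(name, encoding=None):
    """Return `name` as a byte string in the file system encoding.

    `to_fs_bytes` is a hypothetical helper, not a tarfile API.
    Byte strings are returned unchanged; unicode strings are encoded.
    """
    if isinstance(name, bytes):
        return name
    return name.encode(encoding or sys.getfilesystemencoding())

# The encoding is passed explicitly here only to keep the result predictable;
# omit it to fall back to sys.getfilesystemencoding().
encoded = to_fs_bytes(u'caf\xe9.tar', 'utf-8')
```

On an affected Python 2.6 release, the encoded name would then be passed in place of the unicode one, for example tarfile.open(to_fs_bytes(file_name), 'w'), so that posixpath.join only ever concatenates byte strings.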
msg103119 - Author: STINNER Victor (vstinner) (Python committer) Date: 2010-04-14 10:00
I proposed a workaround, but the real bug is that os.path.abspath() doesn't support unicode. That bug was already fixed 7 weeks ago by r78247 (issue #3426).
http://svn.python.org/view/python/trunk/Lib/posixpath.py?r1=78247&r2=78246&pathrev=78247

Python 2.6.5 and 2.7b1 are fixed. So please upgrade :-)
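
The failure in the traceback can be sketched as follows. On Python 2, concatenating a byte string with a unicode string implicitly decodes the byte string as ASCII, which fails as soon as the current directory contains a non-ASCII byte. The explicit decode('ascii') below mimics that implicit step so the sketch also runs on Python 3; the directory and file names are hypothetical, and join_like_py2 is an illustration, not the posixpath code.

```python
# Hypothetical os.getcwd() result on Python 2: a UTF-8 byte string
# containing a non-ASCII character (first byte 0xe2).
cwd = b'/home/user/caf\xe2\x80\x93dir'
# Unicode file name of the kind passed to tarfile.open in the report.
name = u'archive.tar'

def join_like_py2(cwd_bytes, name_text):
    # Mimics `path += '/' + b` in posixpath.join on Python 2, where the
    # byte string is implicitly decoded as ASCII before concatenation.
    return cwd_bytes.decode('ascii') + u'/' + name_text

try:
    join_like_py2(cwd, name)
    error = None
except UnicodeDecodeError as exc:
    error = exc
```

Here error ends up holding a UnicodeDecodeError for byte 0xe2 with the 'ascii' codec, the same kind of error as in the report.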
History
Date User Action Args
2022-04-11 14:56:59 admin set github: 52643
2010-04-14 10:00:16 vstinner set status: open -> closed, resolution: fixed, messages: + msg103119
2010-04-14 09:54:30 vstinner set nosy: + vstinner, messages: + msg103118
2010-04-14 09:50:02 parthm create