
Author serhiy.storchaka
Recipients benjamin.peterson, ezio.melotti, lars.gustaebel, lemburg, pitrou, serhiy.storchaka, vstinner
Date 2013-12-07.17:37:39
Message-id <1386437860.38.0.0478693655005.issue19920@psf.upfronthosting.co.za>
Content
TarFile.list() fails on some archives, in particular on Lib/test/testtar.tar:

>>> import tarfile
>>> tarfile.open('Lib/test/testtar.tar').list()
?rw-r--r-- tarfile/tarfile       7011 2003-01-06 01:19:43 ustar/conttype 
?rw-r--r-- tarfile/tarfile       7011 2003-01-06 01:19:43 ustar/regtype 
?rwxr-xr-x tarfile/tarfile          0 2003-01-06 01:19:43 ustar/dirtype/ 
?rwxr-xr-x tarfile/tarfile        255 2003-01-06 01:19:43 ustar/dirtype-with-size/ 
?rw-r--r-- tarfile/tarfile          0 2003-01-06 01:19:43 ustar/lnktype link to ustar/regtype 
?rwxrwxrwx tarfile/tarfile          0 2003-01-06 01:19:43 ustar/symtype -> regtype 
?rw-rw---- tarfile/tarfile        3,0 2003-01-06 01:19:43 ustar/blktype 
?rw-rw-rw- tarfile/tarfile        1,3 2003-01-06 01:19:43 ustar/chrtype 
?rw-r--r-- tarfile/tarfile          0 2003-01-06 01:19:43 ustar/fifotype 
?rw-r--r-- tarfile/tarfile      86016 2003-01-06 01:19:43 ustar/sparse 
?rw-r--r-- tarfile/tarfile       7011 2003-01-06 01:19:43 Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/serhiy/py/cpython/Lib/tarfile.py", line 1846, in list
    print(tarinfo.name + ("/" if tarinfo.isdir() else ""), end=' ')
UnicodeEncodeError: 'utf-8' codec can't encode character '\udcc4' in position 14: surrogates not allowed
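
The '\udcc4' in the error message is a lone surrogate. A minimal sketch of how such a character can arise and why printing it fails, assuming the member name was decoded from non-UTF-8 bytes with the "surrogateescape" error handler. The byte string below is a hypothetical stand-in, not read from the archive:

```python
# Hypothetical raw member name: 0xC4 is not valid UTF-8 on its own.
raw_name = b"ustar/umlauts-\xc4"

# With errors="surrogateescape", the undecodable byte 0xC4 becomes the
# lone surrogate U+DCC4 instead of raising UnicodeDecodeError.
name = raw_name.decode("utf-8", "surrogateescape")

# A strict UTF-8 encode (what a UTF-8 stdout does on print) rejects it.
error = None
try:
    name.encode("utf-8")
except UnicodeEncodeError as exc:
    error = exc

# The original bytes are still recoverable with the same handler.
round_trip = name.encode("utf-8", "surrogateescape")
```

The name itself is therefore not corrupt. Only the strict encoding step at output time fails.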

The command-line interface of the tarfile module also fails:

$ ./python -m tarfile -v -l Lib/test/testtar.tar
?rw-r--r-- tarfile/tarfile       7011 2003-01-06 01:19:43 ustar/conttype 
?rw-r--r-- tarfile/tarfile       7011 2003-01-06 01:19:43 ustar/regtype 
?rwxr-xr-x tarfile/tarfile          0 2003-01-06 01:19:43 ustar/dirtype/ 
?rwxr-xr-x tarfile/tarfile        255 2003-01-06 01:19:43 ustar/dirtype-with-size/ 
?rw-r--r-- tarfile/tarfile          0 2003-01-06 01:19:43 ustar/lnktype link to ustar/regtype 
?rwxrwxrwx tarfile/tarfile          0 2003-01-06 01:19:43 ustar/symtype -> regtype 
?rw-rw---- tarfile/tarfile        3,0 2003-01-06 01:19:43 ustar/blktype 
?rw-rw-rw- tarfile/tarfile        1,3 2003-01-06 01:19:43 ustar/chrtype 
?rw-r--r-- tarfile/tarfile          0 2003-01-06 01:19:43 ustar/fifotype 
?rw-r--r-- tarfile/tarfile      86016 2003-01-06 01:19:43 ustar/sparse 
Traceback (most recent call last):
  File "/home/serhiy/py/cpython/Lib/runpy.py", line 160, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/home/serhiy/py/cpython/Lib/runpy.py", line 73, in _run_code
    exec(code, run_globals)
  File "/home/serhiy/py/cpython/Lib/tarfile.py", line 2500, in <module>
    main()
  File "/home/serhiy/py/cpython/Lib/tarfile.py", line 2444, in main
    tf.list(verbose=args.verbose)
  File "/home/serhiy/py/cpython/Lib/tarfile.py", line 1846, in list
    print(tarinfo.name + ("/" if tarinfo.isdir() else ""), end=' ')
UnicodeEncodeError: 'utf-8' codec can't encode character '\udcc4' in position 14: surrogates not allowed
?rw-r--r-- tarfile/tarfile       7011 2003-01-06 01:19:43 serhiy@raxxla:~/py/cpython$
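
Until list() handles such names, a caller can print member names defensively. The following is only a sketch of one possible workaround, not necessarily the fix that will be applied to tarfile: `safe_text` and `list_names` are hypothetical helpers, and falling back to "utf-8" when stdout has no encoding is an assumption.

```python
import sys
import tarfile


def safe_text(text, encoding=None):
    """Replace characters the target encoding cannot represent with
    backslash escapes, so that printing never raises UnicodeEncodeError."""
    # Assumption: fall back to UTF-8 if stdout does not report an encoding.
    encoding = encoding or getattr(sys.stdout, "encoding", None) or "utf-8"
    return text.encode(encoding, "backslashreplace").decode(encoding)


def list_names(path):
    """Print every member name of the archive at `path`, escaped as needed."""
    with tarfile.open(path) as tf:
        for member_name in tf.getnames():
            print(safe_text(member_name))


# A name containing a lone surrogate is printed in escaped form.
print(safe_text("ustar/umlauts-\udcc4", "utf-8"))
```

Calling `list_names('Lib/test/testtar.tar')` should then list the problematic member with the surrogate shown as a `\udcc4` escape instead of aborting with a traceback.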
History
Date                 User              Action  Args
2013-12-07 17:37:40  serhiy.storchaka  set     recipients: + serhiy.storchaka, lemburg, lars.gustaebel, pitrou, vstinner, benjamin.peterson, ezio.melotti
2013-12-07 17:37:40  serhiy.storchaka  set     messageid: <1386437860.38.0.0478693655005.issue19920@psf.upfronthosting.co.za>
2013-12-07 17:37:40  serhiy.storchaka  link    issue19920 messages
2013-12-07 17:37:39  serhiy.storchaka  create