I'm trying to write a script that creates ZIP archives from files whose filenames are encoded in Shift-JIS. However, Python's zipfile module only seems to accept filenames that can be encoded as ASCII or UTF-8, so attempting to archive these files raises an exception. This is very inconvenient.
joel@bliss:~/test$ python3 -m zipfile -c mojibake.zip .
Traceback (most recent call last):
File "/usr/lib/python3.8/zipfile.py", line 457, in _encodeFilenameFlags
return self.filename.encode('ascii'), self.flag_bits
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-7: ordinal not in range(128)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/usr/lib/python3.8/zipfile.py", line 2441, in <module>
main()
File "/usr/lib/python3.8/zipfile.py", line 2437, in main
addToZip(zf, path, zippath)
File "/usr/lib/python3.8/zipfile.py", line 2426, in addToZip
addToZip(zf,
File "/usr/lib/python3.8/zipfile.py", line 2421, in addToZip
zf.write(path, zippath, ZIP_DEFLATED)
File "/usr/lib/python3.8/zipfile.py", line 1775, in write
with open(filename, "rb") as src, self.open(zinfo, 'w') as dest:
File "/usr/lib/python3.8/zipfile.py", line 1517, in open
return self._open_to_write(zinfo, force_zip64=force_zip64)
File "/usr/lib/python3.8/zipfile.py", line 1614, in _open_to_write
self.fp.write(zinfo.FileHeader(zip64))
File "/usr/lib/python3.8/zipfile.py", line 447, in FileHeader
filename, flag_bits = self._encodeFilenameFlags()
File "/usr/lib/python3.8/zipfile.py", line 459, in _encodeFilenameFlags
return self.filename.encode('utf-8'), self.flag_bits | 0x800
UnicodeEncodeError: 'utf-8' codec can't encode characters in position 0-7: surrogates not allowed
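For anyone trying to reproduce the two errors in the traceback without touching the filesystem, here is a minimal sketch. It assumes a UTF-8 filesystem encoding, where Python decodes undecodable filename bytes with the "surrogateescape" error handler; the sample name "テスト" and the helper try_encode are made up for illustration and are not part of zipfile.

```python
# Illustrative filename bytes: "テスト" encoded in Shift-JIS (assumed sample name).
raw = "テスト".encode("shift_jis")

# On a UTF-8 locale, os.fsdecode() and os.listdir() decode filename bytes
# like this, so each invalid byte becomes a lone surrogate (U+DC80..U+DCFF).
name = raw.decode("utf-8", errors="surrogateescape")


def try_encode(text, encoding):
    """Hypothetical helper: return the encoded bytes, or the error reason."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError as exc:
        return exc.reason


# The same two attempts that _encodeFilenameFlags makes in the traceback.
print(try_encode(name, "ascii"))
print(try_encode(name, "utf-8"))

# The original bytes are still recoverable with the same error handler.
print(name.encode("utf-8", errors="surrogateescape") == raw)
```

Both strict encodes fail because the string contains lone surrogates, while the surrogateescape round trip returns the original bytes unchanged.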
The zip command from the Info-ZIP package on Linux creates the same archive without any problems; I've attached that archive to this issue. This screenshot shows the filenames displaying correctly in WinRAR once the right encoding is selected: https://i.imgur.com/TVcI95A.png
The filenames should display the same way on any computer that uses Shift-JIS as its locale encoding.
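In the meantime, one possible workaround is to recover the raw filename bytes, decode them as Shift-JIS, and pass the result as the archive name. This is only a sketch under the assumption that every name in the directory really is Shift-JIS; recover_name and zip_directory are hypothetical helpers, not zipfile APIs. Note that the names end up stored as UTF-8 (with flag bit 0x800 set, as in line 459 of the traceback), so the result is not byte-for-byte what Info-ZIP produces.

```python
import os
import zipfile


def recover_name(name, encoding="shift_jis"):
    """Hypothetical helper: turn a surrogate-escaped name into real text."""
    raw = os.fsencode(name)      # back to the bytes stored on disk
    return raw.decode(encoding)  # assumes the bytes are valid Shift-JIS


def zip_directory(src_dir, zip_path, encoding="shift_jis"):
    """Hypothetical helper: archive src_dir using the decoded filenames."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(src_dir):
            for fname in files:
                full = os.path.join(root, fname)
                rel = os.path.relpath(full, src_dir)
                # arcname is now ordinary text, so zipfile can encode it.
                zf.write(full, arcname=recover_name(rel, encoding))
```

For example, zip_directory(".", "../fixed.zip") would archive the current directory; the output path is kept outside the source directory so the archive does not try to include itself.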