classification
Title: documentation of ZipFile file name encoding
Type: behavior Stage: resolved
Components: Documentation Versions: Python 3.4, Python 3.5
process
Status: closed Resolution: out of date
Dependencies: Superseder:
Assigned To: docs@python Nosy List: Windson Yang, docs@python, gagern, serhiy.storchaka
Priority: normal Keywords: easy

Created on 2016-01-06 00:06 by gagern, last changed 2019-03-10 08:24 by serhiy.storchaka. This issue is now closed.

Messages (4)
msg257567 - Author: Martin von Gagern (gagern) Date: 2016-01-06 00:06
https://docs.python.org/3/library/zipfile.html#zipfile.ZipFile.write writes:

“Note: There is no official file name encoding for ZIP files. If you have unicode file names, you must convert them to byte strings in your desired encoding before passing them to write(). WinZip interprets all file names as encoded in CP437, also known as DOS Latin.”

I think this is wrong in several ways. Firstly, APPNOTE.TXT used to explicitly define CP437 as the standard, and it is still the standard in the absence of general purpose bit 11 and of a more specific description using the 0x0008 Extra Field. Secondly, we do have that general purpose bit these days, so there are now not just one but two well-defined file name encodings. Thirdly, encoding the string to bytes as suggested will in fact lead to a runtime error, since ZipInfo expects to do this conversion itself.

See work towards issue1734346, starting at commit 8e33f316ce14, for details on when this was addressed in the source code.
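The two encodings described above can be observed with a minimal sketch, assuming the behavior of current Python 3 releases, where zipfile stores ASCII-only names without general purpose bit 11 and stores other names as UTF-8 with that bit set. The helper name_flags and the constant UTF8_FLAG are hypothetical names introduced only for this illustration; they are not part of the zipfile API.

```python
import io
import zipfile

# General purpose bit 11: when set, the file name is UTF-8 encoded.
# (UTF8_FLAG is a name chosen for this sketch, not a zipfile constant.)
UTF8_FLAG = 0x800


def name_flags(names):
    """Write empty members with the given str names to an in-memory
    archive, read it back, and report whether bit 11 is set for each."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name in names:
            # Names are passed as str; zipfile encodes them itself.
            zf.writestr(name, b"")
    buf.seek(0)
    with zipfile.ZipFile(buf) as zf:
        return {
            info.filename: bool(info.flag_bits & UTF8_FLAG)
            for info in zf.infolist()
        }


flags = name_flags(["plain.txt", "grüße.txt"])
for name, is_utf8 in flags.items():
    print(name, "-> UTF-8 flag set" if is_utf8 else "-> no UTF-8 flag")
```

In this sketch the names are handed to writestr() as str, in line with the observation that ZipInfo performs the conversion to bytes itself.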
msg337594 - Author: Windson Yang (Windson Yang) * Date: 2019-03-10 03:36
I can't find the note in the current documentation.
msg337595 - Author: Windson Yang (Windson Yang) * Date: 2019-03-10 03:36
Please ignore the last message; the note is in the 3.4 and 3.5 docs.
msg337599 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2019-03-10 08:24
This note has been removed in issue32035. 3.4 and 3.5 currently take only security fixes.
History
Date User Action Args
2019-03-10 08:24:30  serhiy.storchaka  set  status: open -> closed; nosy: + serhiy.storchaka; messages: + msg337599; resolution: out of date; stage: needs patch -> resolved
2019-03-10 03:36:56  Windson Yang  set  messages: + msg337595
2019-03-10 03:36:12  Windson Yang  set  nosy: + Windson Yang; messages: + msg337594; versions: + Python 3.4, Python 3.5, - Python 3.7, Python 3.8
2018-02-22 23:11:18  cheryl.sabella  set  keywords: + easy; versions: + Python 3.7, Python 3.8, - Python 3.5
2016-01-06 01:00:13  serhiy.storchaka  set  stage: needs patch
2016-01-06 00:06:41  gagern  create