
classification
Title: ZipFile.comment expects bytes
Type: enhancement
Stage: resolved
Components: Documentation
Versions: Python 3.2, Python 3.3, Python 3.4

process
Status: closed
Resolution: out of date
Dependencies:
Superseder:
Assigned To: docs@python
Nosy List: docs@python, iritkatriel, serhiy.storchaka, swamiyeswanth, vstinner, xuanji
Priority: normal
Keywords:

Created on 2011-02-09 15:09 by xuanji, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Messages (6)
msg128217 - (view) Author: Xuanji Li (xuanji) * Date: 2011-02-09 15:09
The documentation for zipfile describes ZipFile.comment as "The comment text associated with the ZIP file." From reading this, I expected that setting it to a string would be OK; however, ZipFile.comment must actually be set to bytes (or a bytes-like object; I am not sure).

This may also unexpectedly break older code: I saw a patch on the bug tracker, written just last year, that set ZipFile.comment to a string.

IMO there are two ways to fix this:

1) Change docs to mention that ZipFile.comment only accepts bytes
2) Patch zipfile.py to accept string and try to convert, throwing an error if the conversion fails
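
A minimal sketch of what option 2 could look like from the caller's side, assuming a wrapper rather than a change to zipfile.py itself. The helper name `set_comment` and the default encoding of UTF-8 are assumptions made for illustration; nothing in this thread settles which encoding would be correct.

```python
import io
import zipfile


def set_comment(zf, comment, encoding="utf-8"):
    """Hypothetical helper: accept str or a bytes-like object, store bytes."""
    if isinstance(comment, str):
        # Raises UnicodeEncodeError if the text cannot be represented
        # in the chosen encoding.
        comment = comment.encode(encoding)
    # memoryview() rejects objects that are not bytes-like (for example int),
    # so bad input fails with TypeError instead of being silently converted.
    zf.comment = bytes(memoryview(comment))


buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("hello.txt", "hi")
    set_comment(zf, "Archive créée en 2011")

# Reading the archive back returns the comment as bytes.
with zipfile.ZipFile(buf) as zf:
    stored = zf.comment
```

The conversion is explicit here, so the caller still decides the encoding; a patch inside zipfile.py would have to pick one on the user's behalf.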
msg128478 - (view) Author: yeswanth (swamiyeswanth) Date: 2011-02-13 04:36
IMO ZipFile.comment should accept strings too, instead of accepting only bytes, so I guess patching would help.
msg128479 - (view) Author: yeswanth (swamiyeswanth) Date: 2011-02-13 04:41
Can we use the str.encode() function to convert the string into bytes?
msg128484 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-02-13 09:59
> Can we use the str.encode() function to convert the string into bytes?

Can you try different ZIP archivers to check which encoding is expected? WinZip, WinRAR, 7-zip, "zip" command line program on Linux, etc.

And do you have a reference to any ZIP documentation?
msg173765 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-10-25 16:17
The ZIP specification says:

"""
If general purpose bit 11 is unset, the file name and comment should conform 
to the original ZIP character encoding.  If general purpose bit 11 is set, the 
filename and comment must support The Unicode Standard, Version 4.1.0 or 
greater using the character encoding form defined by the UTF-8 storage 
specification.  The Unicode Standard is published by the The Unicode
Consortium (www.unicode.org).  UTF-8 encoded data stored within ZIP files 
is expected to not include a byte order mark (BOM). 
"""

There is also an extension for UTF-8 encoded file comments.  All this means the file comment should be interpreted as a Unicode string.
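
The rule quoted above applies to per-member file comments, which zipfile exposes as bytes in ZipInfo.comment. The following is a rough sketch of decoding one according to general purpose bit 11. The names `UTF8_FLAG` and `member_comment_text` are hypothetical, and treating "the original ZIP character encoding" as cp437 is an assumption that mirrors how zipfile decodes file names when the bit is unset.

```python
import io
import zipfile

UTF8_FLAG = 0x800  # general purpose bit 11


def member_comment_text(info):
    """Decode a ZipInfo comment as UTF-8 if bit 11 is set, else as cp437."""
    encoding = "utf-8" if info.flag_bits & UTF8_FLAG else "cp437"
    return info.comment.decode(encoding)


buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    # A non-ASCII member name makes zipfile set bit 11 when writing.
    info = zipfile.ZipInfo("naïve.txt")
    info.comment = "résumé".encode("utf-8")
    zf.writestr(info, "data")

with zipfile.ZipFile(buf) as zf:
    text = member_comment_text(zf.infolist()[0])
```

This covers only member comments; the archive-level ZipFile.comment has no such flag to consult.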

However, the specification says nothing about the encoding of the .ZIP file comment, that is, the archive-level comment (it states only that no encryption or data authentication is applied to it).

Since changeset 4186f20d9fa4, assigning a non-bytes value to ZipFile.comment raises TypeError. I think the documentation should be clarified.
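
The behavior described here can be checked with a short script. This is a sketch against an in-memory archive; the variable names are illustrative and the UTF-8 encoding in the second assignment is an arbitrary choice.

```python
import io
import zipfile

with zipfile.ZipFile(io.BytesIO(), "w") as zf:
    try:
        zf.comment = "a text comment"  # str is rejected
    except TypeError as exc:
        error = exc
    else:
        error = None

    # Encoding explicitly first is accepted.
    zf.comment = "a text comment".encode("utf-8")
    accepted = zf.comment
```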
msg415561 - (view) Author: Irit Katriel (iritkatriel) * (Python committer) Date: 2022-03-19 17:01
The documentation has been updated by now [1]:

ZipFile.comment
The comment associated with the ZIP file as a bytes object. If assigning a comment to a ZipFile instance created with mode 'w', 'x' or 'a', it should be no longer than 65535 bytes. Comments longer than this will be truncated.


[1] https://docs.python.org/3/library/zipfile.html#zipfile.ZipFile.comment
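
The length limit in the updated documentation can be illustrated as follows. This is a sketch that assumes current CPython behavior, where the truncation happens at assignment time and is accompanied by a warning; the warning is silenced here only to keep the output quiet.

```python
import io
import warnings
import zipfile

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")  # zipfile warns when it truncates
        zf.comment = b"x" * 70000
    kept = len(zf.comment)  # length after assignment

# The truncated comment is what ends up in the archive.
with zipfile.ZipFile(buf) as zf:
    stored = len(zf.comment)
```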
History
Date                 User              Action  Args
2022-04-11 14:57:12  admin             set     github: 55369
2022-03-19 17:01:56  iritkatriel       set     status: open -> closed
                                               nosy: + iritkatriel
                                               messages: + msg415561
                                               resolution: out of date
                                               stage: needs patch -> resolved
2012-10-25 16:17:30  serhiy.storchaka  set     type: enhancement
                                               components: - Library (Lib)
                                               versions: + Python 3.3, Python 3.4
                                               nosy: + serhiy.storchaka
                                               messages: + msg173765
                                               stage: needs patch
2011-02-13 09:59:45  vstinner          set     nosy: vstinner, docs@python, xuanji, swamiyeswanth
                                               messages: + msg128484
2011-02-13 04:41:51  swamiyeswanth     set     nosy: vstinner, docs@python, xuanji, swamiyeswanth
                                               messages: + msg128479
2011-02-13 04:36:40  swamiyeswanth     set     nosy: + swamiyeswanth
                                               messages: + msg128478
2011-02-09 15:09:00  xuanji            create