
Author serhiy.storchaka
Recipients methane, ned.deily, serhiy.storchaka, sjt
Date 2022-03-20.14:22:14
Message-id <1647786134.43.0.53464721257.issue28080@roundup.psfhosted.org>
Content
I experimented with this a lot. There is a problem with append mode. We can read in append mode, so we need an encoding. But when we close a ZipFile after appending, non-ASCII file names are encoded as UTF-8 in the central directory. The next time we open the archive for reading with a different encoding, we get an error because the file names in the central directory and in the local headers differ. We would need to write non-ASCII file names back with the specified encoding to get self-consistent data.
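
To make the failure concrete, here is a minimal sketch of what a name mismatch between the local header and the central directory looks like to a reader. It is a simplified stand-in, not the exact append scenario: the archive is built with an ASCII placeholder name, and only the local header copy is then overwritten with Shift_JIS bytes. The placeholder name, the sample file name and the try_read helper are made up for illustration.

```python
import io
import zipfile

# Build a small archive whose member has an ASCII placeholder name.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("AAAAAA.txt", b"hello")
raw = buf.getvalue()

# The name is stored twice: in the local header and in the central directory.
placeholder = b"AAAAAA.txt"
assert raw.count(placeholder) == 2

# Overwrite only the first copy (the local header) with legacy-encoded bytes
# of the same length, leaving the central directory untouched.
legacy = "日本語.txt".encode("shift_jis")
assert len(legacy) == len(placeholder)
inconsistent = raw.replace(placeholder, legacy, 1)


def try_read(data):
    """Return the member contents, or the error raised while reading it."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        name = zf.namelist()[0]  # taken from the central directory
        try:
            return zf.read(name)  # compares against the local header name
        except zipfile.BadZipFile as exc:
            return exc


result_ok = try_read(raw)
result_bad = try_read(inconsistent)
print(result_ok)
print(type(result_bad).__name__, result_bad)
```

Listing the archive still works, because only the central directory is consulted. The error appears when a member is actually read and its local header is checked.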

In the end I left this as it was initially. We can return to the problem with append mode later.

The differences between PR 32007 and your patches:

* The parameter was renamed to metadata_encoding to avoid confusion with the existing encoding parameter of ZipFile.open(). In the future I am going to use it for comments as well. The attribute and the CLI option were renamed correspondingly.
* --metadata-encoding can also be used with the -t option.
* "surrogateescape" is no longer used. If the encoding is not suitable, you will get an error. Use the default and decode filenames manually in such cases. We can change this in the future.
* Updated documentation.
* Tests were significantly rewritten. They now test the behavior with a wrong metadata_encoding, with mixed UTF-8 and legacy encodings, and with reading after append.
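
As a rough illustration of the two ways of reading legacy names mentioned in the list, here is a sketch that crafts an archive with a Shift_JIS member name and reads it back. The archive is faked by patching an ASCII placeholder name in the raw bytes, and the sample file name is made up. The metadata_encoding part is guarded because the parameter is only available from Python 3.11 and only in read mode.

```python
import io
import sys
import zipfile

# Craft an archive with a Shift_JIS-encoded member name by patching an
# ASCII placeholder of the same byte length in both places it is stored.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("AAAAAA.txt", b"hello")
placeholder = b"AAAAAA.txt"
legacy = "日本語.txt".encode("shift_jis")
assert len(legacy) == len(placeholder)
raw = buf.getvalue().replace(placeholder, legacy)

# Default behaviour: names without the UTF-8 flag are decoded as cp437.
with zipfile.ZipFile(io.BytesIO(raw)) as zf:
    default_name = zf.namelist()[0]

# Manual decoding: undo the cp437 decoding, then apply the real encoding.
manual_name = default_name.encode("cp437").decode("shift_jis")

# With metadata_encoding the name is decoded directly (Python 3.11+).
decoded_name = None
if sys.version_info >= (3, 11):
    with zipfile.ZipFile(io.BytesIO(raw), metadata_encoding="shift_jis") as zf:
        decoded_name = zf.namelist()[0]
        data = zf.read(decoded_name)

print(repr(default_name), manual_name, decoded_name)
```

The manual route works because cp437 maps every byte value to a character, so the original bytes can always be recovered from the default decoding. On the command line, the same encoding can be passed with --metadata-encoding together with -l, -e or -t.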

I was going to make more changes, but left them for the future.
History
Date                 User              Action  Args
2022-03-20 14:22:14  serhiy.storchaka  set     recipients: + serhiy.storchaka, ned.deily, sjt, methane
2022-03-20 14:22:14  serhiy.storchaka  set     messageid: <1647786134.43.0.53464721257.issue28080@roundup.psfhosted.org>
2022-03-20 14:22:14  serhiy.storchaka  link    issue28080 messages
2022-03-20 14:22:14  serhiy.storchaka  create