Author dhillier
Recipients dhillier, iritkatriel, yudilevi
Date 2021-05-25.04:31:46
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <>
zipfile decodes filenames using cp437 or unicode and encodes using ascii or unicode. It seems like zipfile has a preference for writing filenames in unicode rather than cp437. Is zipfile's preference for writing filenames in unicode rather than cp437 intentional?

Is the bug you're seeing related to using zipfile to open and rewrite old zips and not being able to open the rewritten files in an old program that doesn't support the unicode flag?

We could address this two ways:
- Change ZipInfo._encodeFilenameFlags() to always encode to cp437 if possible
- Add a flag to write filenames in cp437 or unicode, otherwise the current situation of ascii or unicode

I guess the choice will depend on if preferring unicode rather than cp437 is intentional and if writing filenames in cp437 will break anything (it shouldn't break anything according to Appendix D of

Here's a test for your current patch (I'd probably put it alongside OtherTests.test_read_after_write_unicode_filenames as this test was adapted from that one)

class OtherTests(unittest.TestCase):

    def test_read_after_write_cp437_filenames(self):
        fname = 'test_cp437_é'
        with zipfile.ZipFile(TESTFN2, 'w') as zipfp:
            zipfp.writestr(fname, b'sample')

        with zipfile.ZipFile(TESTFN2) as zipfp:
            zinfo = zipfp.infolist()[0]
            # Ensure general purpose bit 11 (Language encoding flag
            # (EFS)) is unset to indicate the filename is not unicode
            self.assertFalse(zinfo.flag_bits & 0x800)
            self.assertEqual(, b'sample')
Date User Action Args
2021-05-25 04:31:48dhilliersetrecipients: + dhillier, yudilevi, iritkatriel
2021-05-25 04:31:47dhilliersetmessageid: <>
2021-05-25 04:31:47dhillierlinkissue40172 messages
2021-05-25 04:31:46dhilliercreate