classification
Title: ZipFile from 'a'ppend-mode file generates invalid zip
Type: behavior
Stage:
Components: Library (Lib), Windows
Versions: Python 3.6
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: BoppreH, paul.moore, serhiy.storchaka, steve.dower, tim.golden, zach.ware
Priority: normal
Keywords:

Created on 2017-04-30 01:33 by BoppreH, last changed 2017-12-04 07:33 by serhiy.storchaka.

Messages (5)
msg292616 - Author: (BoppreH) Date: 2017-04-30 01:33
I may be misunderstanding file modes or the `zipfile` library, but

    from zipfile import ZipFile
    ZipFile(open('a.zip', 'ab'), 'a').writestr('f.txt', 'z')

unexpectedly creates an invalid zip file. 7-Zip can open the archive and show the file list, but the files inside look empty, and Windows simply says the archive is invalid.

Changing the file mode from `ab` to `wb+` fixes the problem, but truncates the file, and `rb+` doesn't create the file. Calling `close` on both the file object and the `ZipFile` doesn't help either. Using `ZipFile(...).open` instead of `writestr` has the same problem.
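One way to get "create if missing, never truncate, still seekable for writes" is to open the file descriptor with `os.open` and wrap it in `r+b` mode. The sketch below is only an illustration of that idea, not something proposed in this issue; `open_for_update` and `add_member` are hypothetical helper names, and it assumes `ZipFile` behaves correctly with a file object opened in `r+b` mode.

```python
import os
from zipfile import ZipFile


def open_for_update(path):
    """Hypothetical helper: open `path` for binary read/write.

    Creates the file if it is missing and keeps existing contents,
    without the append-only write behaviour of 'ab'.
    """
    # O_BINARY exists only on Windows; use 0 where it is not defined.
    flags = os.O_RDWR | os.O_CREAT | getattr(os, "O_BINARY", 0)
    fd = os.open(path, flags, 0o666)
    return os.fdopen(fd, "r+b")


def add_member(path, name, data):
    """Hypothetical helper: add one member to the archive at `path`."""
    with open_for_update(path) as f:
        with ZipFile(f, "a") as zf:
            zf.writestr(name, data)
```

Calling `add_member('a.zip', 'f.txt', 'z')` twice with different member names should leave both members readable, whether or not `a.zip` existed beforehand.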

I could only reproduce this on [Windows 10, Python 3.6.1, 64 bit]. The zip file was proper on [Windows 10, Python 3.3.5, 32 bit], [Windows 10 Bash, Python 3.4.3, 64 bit], and [FreeBSD, Python 3.5.3, 64 bit].

This is my first bug report, so forgive me if I made any mistakes.
msg296123 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-06-15 18:30
This looks like a duplicate of issue29094.
msg307529 - Author: (BoppreH) Date: 2017-12-04 00:25
I'm not sure this is a duplicate of issue29094. That issue involves random data at the start of the file, while this issue uses the 'ab' mode solely for creating the file if it doesn't exist (so the file is either empty or already a valid zip file). It's not clear to me why 'wb' should work but not 'ab' if the file was empty or missing to begin with.

[Windows 10, Python 3.6.3, 64 bit] still has the same problem.

Here's a more complete test case, starting with no existing files:

    from zipfile import ZipFile

    # Append mode:         v
    with open('file.zip', 'ab') as f:
        with ZipFile(f, 'a') as zip:
            zip.writestr('file.txt', 'contents')
    with open('file.zip', 'rb') as f:
        with ZipFile(f, 'r') as zip:
            print(zip.read('file.txt'))
            # Fails with "zipfile.BadZipFile: Bad magic number for file header"

    # Write mode:          v
    with open('file.zip', 'wb') as f:
        with ZipFile(f, 'a') as zip:
            zip.writestr('file.txt', 'contents')
    with open('file.zip', 'rb') as f:
        with ZipFile(f, 'r') as zip:
            print(zip.read('file.txt'))
            # Works.
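For the use case described here (create the archive if it is missing, otherwise add to it), a possible workaround is to pass the path to `ZipFile` instead of a file object, so that `zipfile` chooses the underlying file mode itself. This is a minimal sketch under that assumption; `append_member` is a hypothetical name, and the sketch has not been checked against the affected Windows builds.

```python
from zipfile import ZipFile


def append_member(path, name, data):
    """Hypothetical helper: add a member, creating the archive if needed."""
    # With a path and mode 'a', ZipFile opens the file itself rather than
    # relying on a caller-supplied file object in append mode.
    with ZipFile(path, "a") as zf:
        zf.writestr(name, data)
```

For example, `append_member('file.zip', 'file.txt', 'contents')` can be called on a missing file and then again with another member name on the resulting archive.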
msg307542 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-12-04 07:29
The problem is that seek() doesn't work properly with files opened in append mode.

    with open('file', 'ab') as f:
        f.write(b'abcd')
        f.seek(0)
        f.write(b'efgh')
        f.flush()

    with open('file', 'rb') as f:
        print(f.read())

The result is b'abcdefgh' instead of the expected b'efgh'.
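To make the contrast explicit, the same write/seek/write sequence can be run under different modes. The sketch below assumes a platform where append mode sends every write to the end of the file regardless of the seek position (the `open()` documentation notes this for some Unix systems); `write_seek_write` is a hypothetical helper name.

```python
def write_seek_write(path, mode):
    """Hypothetical helper: write, seek to the start, write again.

    Returns the final contents of the file at `path`.
    """
    with open(path, mode) as f:
        f.write(b'abcd')
        f.seek(0)          # honoured for reads, but not for writes in 'ab'
        f.write(b'efgh')
    with open(path, 'rb') as f:
        return f.read()
```

Running it on a fresh file with `'wb'` overwrites the first four bytes, while `'ab'` on such a platform keeps `b'abcd'` and appends `b'efgh'` after it, which is why a writer that seeks back to patch a header ends up with a corrupt result.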
msg307543 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-12-04 07:33
This may be related to issue18876 or issue20082.
History
Date                 User              Action  Args
2017-12-04 07:33:52  serhiy.storchaka  set     messages: + msg307543
2017-12-04 07:29:43  serhiy.storchaka  set     status: closed -> open
                                               superseder: Regression in zipfile writing in 2.7.13 ->
                                               messages: + msg307542
                                               resolution: duplicate ->
                                               stage: resolved ->
2017-12-04 00:25:16  BoppreH           set     messages: + msg307529
2017-11-09 18:00:35  serhiy.storchaka  set     status: pending -> closed
                                               resolution: duplicate
                                               stage: resolved
2017-06-15 18:30:21  serhiy.storchaka  set     status: open -> pending
                                               superseder: Regression in zipfile writing in 2.7.13
                                               messages: + msg296123
2017-04-30 11:46:48  xiang.zhang       set     nosy: + serhiy.storchaka
2017-04-30 01:33:15  BoppreH           create