classification
Title: Use of zipfile.Path causes attempt to write after ZipFile is closed
Type: resource usage Stage: patch review
Components: Demos and Tools, email, XML Versions: Python 3.10
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Jeffrey.Kintscher, Nick Henderson, angelsb, barry, jaraco, r.david.murray, rishi93, xtreak
Priority: normal Keywords: patch

Created on 2020-07-20 16:09 by Nick Henderson, last changed 2020-07-25 04:42 by xtreak.

Files
File name Uploaded Description Edit
zippath_bug_39.py Nick Henderson, 2020-07-20 16:09 Demonstration of issue in Python 3.9
zipfile_close_bug.py Jeffrey.Kintscher, 2020-07-24 10:05 simpler demonstration of issue
Pull Requests
URL Status Linked Edit
PR 21604 open Jeffrey.Kintscher, 2020-07-24 11:27
Messages (7)
msg374015 - (view) Author: Nick Henderson (Nick Henderson) Date: 2020-07-20 16:09
In both Python 3.8.3 and 3.9.0b3, using zipfile.Path to write a file in a context manager results in an attempt to write to the zip file after it is closed.

In Python 3.9.0b3:

import io
from zipfile import ZipFile, Path

def make_zip():
    """Make zip file and return bytes."""
    bytes_io = io.BytesIO()
    zip_file = ZipFile(bytes_io, mode="w")
    zip_path = Path(zip_file, "file-a")
    
    # use zipp.Path.open
    with zip_path.open(mode="wb") as fp:
        fp.write(b"contents of file-a")

    zip_file.close()

    data = bytes_io.getvalue()

    bytes_io.close()
    
    return data

zip_data = make_zip()
# Exception ignored in: <function ZipFile.__del__ at 0x10aceff70>
# Traceback (most recent call last):
#   File "/Users/nick/.pyenv/versions/3.9.0b3/lib/python3.9/zipfile.py", line 1807, in __del__
#     self.close()
#   File "/Users/nick/.pyenv/versions/3.9.0b3/lib/python3.9/zipfile.py", line 1824, in close
#     self.fp.seek(self.start_dir)
# ValueError: I/O operation on closed file.


In Python 3.8.3:

import io
from zipfile import ZipFile, Path

def make_zip():
    """Make zip file and return bytes."""
    bytes_io = io.BytesIO()
    zip_file = ZipFile(bytes_io, mode="w")
    zip_path = Path(zip_file, "file-a")
    
    # use zipp.Path.open
    with zip_path.open(mode="w") as fp:
        fp.write(b"contents of file-a")

    zip_file.close()

    data = bytes_io.getvalue()

    bytes_io.close()
    
    return data

zip_data = make_zip()
# Exception ignored in: <function ZipFile.__del__ at 0x10922e670>
# Traceback (most recent call last):
#   File "/Users/nick/.pyenv/versions/3.8.3/lib/python3.8/zipfile.py", line 1820, in __del__
#     self.close()
#   File "/Users/nick/.pyenv/versions/3.8.3/lib/python3.8/zipfile.py", line 1837, in close
#     self.fp.seek(self.start_dir)
# ValueError: I/O operation on closed file.

In the Python 3.8 example, mode="w" is used in the open method on zip_path.
msg374151 - (view) Author: Jeffrey Kintscher (Jeffrey.Kintscher) * Date: 2020-07-23 22:18
It looks like a copy of the zip_file object is getting created, probably by the Path constructor:

    zip_path = Path(zip_file, "file-a")

When return is invoked, the copy still has a reference to the now closed bytes_io object which causes the ValueError exception when ZipFile.__del__() calls ZipFile.close().  This is definitely a bug.  I'll take a crack at fixing it.
msg374170 - (view) Author: Jeffrey Kintscher (Jeffrey.Kintscher) * Date: 2020-07-24 10:05
I created a simpler test case that exercises the buggy code.  The problem can be reproduced by passing ZipFile.__init__() a file-like object with the write flag (mode "w") and then closing the file-like object before calling ZipFile.close().

The exception being triggered by the call to ZipFile.__del__() is a side-effect because all ZipFile.__del__() does is call ZipFile.close().



import io
from zipfile import ZipFile

bytes_io = io.BytesIO()
zip_file = ZipFile(bytes_io, mode="w")
bytes_io.close()
zip_file.close()  # throws ValueError
msg374171 - (view) Author: Rajarishi Devarajan (rishi93) * Date: 2020-07-24 10:34
Hi all, I have a working patch for this. Could I submit it for review ?
msg374177 - (view) Author: Jason R. Coombs (jaraco) * (Python committer) Date: 2020-07-24 14:00
I'm a little surprised from Nick's original post that the behavior is triggered. After all, the code [already has a protection for a previously-closed zipfile](https://github.com/python/cpython/blob/0dd98c2d00a75efbec19c2ed942923981bc06683/Lib/zipfile.py#L1812-L1813) and that value is [unconditionally set during close](https://github.com/python/cpython/blob/0dd98c2d00a75efbec19c2ed942923981bc06683/Lib/zipfile.py#L1828). So how is it that Zipfile.__del__ is being called?

Jeffrey suggests that

> a copy of the zip_file object is getting created, probably by the Path constructor

And indeed, I can confirm the ZipFile [state is copied into a new object here](https://github.com/python/cpython/blob/0dd98c2d00a75efbec19c2ed942923981bc06683/Lib/zipfile.py#L2202). The FastLookup and CompleteDirs functionality is for performance optimizations.

> I created a simpler test case that exercises the buggy code.

Although the simpler test does trigger the error, I'd argue that the error is _expected_ in this case and that the zip file will be invalid because `self._write_end_record` will never get called.

I expected this example to more accurately capture the failure:

```
import io
import zipfile
buf = io.BytesIO()
zf = zipfile.ZipFile(buf, mode='w')
zp = zipfile.Path(zf)
with zp.joinpath('zile-a').open('w') as fp:
    fp.write(b'contents of file-a')
zf.close()
zp.root.close()
```

But it does not. I'll need to do more investigation.

One option here is for `Path` to document that any zipfile passed to it is no longer valid and enforce that fact by disabling the original object passed to it.

Another approach could be to try to use an adapter pattern to adapt the original ZipFile rather than clone it into a subclass.
msg374196 - (view) Author: Jason R. Coombs (jaraco) * (Python committer) Date: 2020-07-24 16:30
This routine will repro the issue without relying on garbage collection to trigger the error:

```
import io
import zipfile
buf = io.BytesIO()
zf = zipfile.ZipFile(buf, mode='w')
zp = zipfile.Path(zf)
with zp.joinpath('zile-a').open('w') as fp:
    fp.write('contents of file-a')
zf.close()
buf.close()
zp.root.close()
```
msg374237 - (view) Author: Karthikeyan Singaravelan (xtreak) * (Python committer) Date: 2020-07-25 04:42
Not sure if it's related to this issue there is an existing issue with ZipFile.__del__ : issue37773
History
Date User Action Args
2020-07-25 04:42:30xtreaksetnosy: + xtreak
messages: + msg374237
2020-07-24 16:48:19angelsbsetnosy: + r.david.murray, angelsb, barry

type: behavior -> resource usage
components: + Demos and Tools, XML, email, - Library (Lib)
versions: - Python 3.8, Python 3.9
2020-07-24 16:30:21jaracosetmessages: + msg374196
2020-07-24 14:00:37jaracosetmessages: + msg374177
2020-07-24 11:27:35Jeffrey.Kintschersetkeywords: + patch
stage: patch review
pull_requests: + pull_request20745
2020-07-24 10:34:34rishi93setnosy: + rishi93
messages: + msg374171
2020-07-24 10:08:28Jeffrey.Kintschersetversions: + Python 3.10
2020-07-24 10:05:45Jeffrey.Kintschersetfiles: + zipfile_close_bug.py

messages: + msg374170
2020-07-23 22:18:29Jeffrey.Kintschersetmessages: + msg374151
2020-07-23 07:02:28Jeffrey.Kintschersetnosy: + Jeffrey.Kintscher
2020-07-20 16:35:16xtreaksetnosy: + jaraco
2020-07-20 16:09:49Nick Hendersoncreate