Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use of zipfile.Path causes attempt to write after ZipFile is closed #85522

Closed
NickHenderson mannequin opened this issue Jul 20, 2020 · 9 comments
Closed

Use of zipfile.Path causes attempt to write after ZipFile is closed #85522

NickHenderson mannequin opened this issue Jul 20, 2020 · 9 comments
Labels
3.10 only security fixes performance Performance or resource usage topic-email topic-XML

Comments

@NickHenderson
Copy link
Mannequin

NickHenderson mannequin commented Jul 20, 2020

BPO 41350
Nosy @warsaw, @jaraco, @bitdancer, @tirkarthi, @websurfer5, @rishi93, @mediagatewizard, @akulakov
PRs
  • bpo-41350: handle an already closed file-like object in zipfile.ZipFi… #21604
  • Superseder
  • bpo-40564: Using zipfile.Path with several files prematurely closes zip
  • Files
  • zippath_bug_39.py: Demonstration of issue in Python 3.9
  • zipfile_close_bug.py: simpler demonstration of issue
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2021-04-24.19:09:29.893>
    created_at = <Date 2020-07-20.16:09:49.329>
    labels = ['expert-XML', 'expert-email', '3.10', 'performance']
    title = 'Use of zipfile.Path causes attempt to write after ZipFile is closed'
    updated_at = <Date 2021-04-24.19:09:45.002>
    user = 'https://bugs.python.org/NickHenderson'

    bugs.python.org fields:

    activity = <Date 2021-04-24.19:09:45.002>
    actor = 'jaraco'
    assignee = 'none'
    closed = True
    closed_date = <Date 2021-04-24.19:09:29.893>
    closer = 'jaraco'
    components = ['Demos and Tools', 'XML', 'email']
    creation = <Date 2020-07-20.16:09:49.329>
    creator = 'Nick Henderson'
    dependencies = []
    files = ['49329', '49336']
    hgrepos = []
    issue_num = 41350
    keywords = ['patch']
    message_count = 9.0
    messages = ['374015', '374151', '374170', '374171', '374177', '374196', '374237', '391786', '391792']
    nosy_count = 9.0
    nosy_names = ['barry', 'jaraco', 'r.david.murray', 'xtreak', 'Jeffrey.Kintscher', 'rishi93', 'Nick Henderson', 'angelsb', 'andrei.avk']
    pr_nums = ['21604']
    priority = 'normal'
    resolution = 'duplicate'
    stage = 'resolved'
    status = 'closed'
    superseder = '40564'
    type = 'resource usage'
    url = 'https://bugs.python.org/issue41350'
    versions = ['Python 3.10']

    @NickHenderson
    Copy link
    Mannequin Author

    NickHenderson mannequin commented Jul 20, 2020

    In both Python 3.8.3 and 3.9.0b3, using zipfile.Path to write a file in a context manager results in an attempt to write to the zip file after it is closed.

    In Python 3.9.0b3:

    import io
    from zipfile import ZipFile, Path
    
    def make_zip():
        """Make zip file and return bytes."""
        bytes_io = io.BytesIO()
        zip_file = ZipFile(bytes_io, mode="w")
        zip_path = Path(zip_file, "file-a")
        
        # use zipp.Path.open
        with zip_path.open(mode="wb") as fp:
            fp.write(b"contents of file-a")
    
        zip_file.close()
    
        data = bytes_io.getvalue()
    
        bytes_io.close()
        
        return data
    
    zip_data = make_zip()
    # Exception ignored in: <function ZipFile.__del__ at 0x10aceff70>
    # Traceback (most recent call last):
    #   File "/Users/nick/.pyenv/versions/3.9.0b3/lib/python3.9/zipfile.py", line 1807, in __del__
    #     self.close()
    #   File "/Users/nick/.pyenv/versions/3.9.0b3/lib/python3.9/zipfile.py", line 1824, in close
    #     self.fp.seek(self.start_dir)
    # ValueError: I/O operation on closed file.

    In Python 3.8.3:

    import io
    from zipfile import ZipFile, Path
    
    def make_zip():
        """Make zip file and return bytes."""
        bytes_io = io.BytesIO()
        zip_file = ZipFile(bytes_io, mode="w")
        zip_path = Path(zip_file, "file-a")
        
        # use zipp.Path.open
        with zip_path.open(mode="w") as fp:
            fp.write(b"contents of file-a")
    
        zip_file.close()
    
        data = bytes_io.getvalue()
    
        bytes_io.close()
        
        return data
    
    zip_data = make_zip()
    # Exception ignored in: <function ZipFile.__del__ at 0x10922e670>
    # Traceback (most recent call last):
    #   File "/Users/nick/.pyenv/versions/3.8.3/lib/python3.8/zipfile.py", line 1820, in __del__
    #     self.close()
    #   File "/Users/nick/.pyenv/versions/3.8.3/lib/python3.8/zipfile.py", line 1837, in close
    #     self.fp.seek(self.start_dir)
    # ValueError: I/O operation on closed file.

    In the Python 3.8 example, mode="w" is used in the open method on zip_path.

    @NickHenderson NickHenderson mannequin added type-bug An unexpected behavior, bug, or error stdlib Python modules in the Lib dir 3.8 only security fixes 3.9 only security fixes labels Jul 20, 2020
    @websurfer5
    Copy link
    Mannequin

    websurfer5 mannequin commented Jul 23, 2020

    It looks like a copy of the zip_file object is getting created, probably by the Path constructor:

        zip_path = Path(zip_file, "file-a")

    When return is invoked, the copy still has a reference to the now closed bytes_io object which causes the ValueError exception when ZipFile.__del__() calls ZipFile.close(). This is definitely a bug. I'll take a crack at fixing it.

    @websurfer5
    Copy link
    Mannequin

    websurfer5 mannequin commented Jul 24, 2020

    I created a simpler test case that exercises the buggy code. The problem can be reproduced by passing ZipFile.__init__() a file-like object with the write flag (mode "w") and then closing the file-like object before calling ZipFile.close().

    The exception being triggered by the call to ZipFile.__del__() is a side-effect because all ZipFile.__del__() does is call ZipFile.close().

    import io
    from zipfile import ZipFile
    
    bytes_io = io.BytesIO()
    zip_file = ZipFile(bytes_io, mode="w")
    bytes_io.close()
    zip_file.close()  # throws ValueError

    @websurfer5 websurfer5 mannequin added 3.10 only security fixes labels Jul 24, 2020
    @rishi93
    Copy link
    Mannequin

    rishi93 mannequin commented Jul 24, 2020

    Hi all, I have a working patch for this. Could I submit it for review ?

    @jaraco
    Copy link
    Member

    jaraco commented Jul 24, 2020

    I'm a little surprised from Nick's original post that the behavior is triggered. After all, the code already has a protection for a previously-closed zipfile and that value is unconditionally set during close. So how is it that Zipfile.__del__ is being called?

    Jeffrey suggests that

    a copy of the zip_file object is getting created, probably by the Path constructor

    And indeed, I can confirm the ZipFile state is copied into a new object here. The FastLookup and CompleteDirs functionality is for performance optimizations.

    I created a simpler test case that exercises the buggy code.

    Although the simpler test does trigger the error, I'd argue that the error is expected in this case and that the zip file will be invalid because self._write_end_record will never get called.

    I expected this example to more accurately capture the failure:

    import io
    import zipfile
    buf = io.BytesIO()
    zf = zipfile.ZipFile(buf, mode='w')
    zp = zipfile.Path(zf)
    with zp.joinpath('zile-a').open('w') as fp:
        fp.write(b'contents of file-a')
    zf.close()
    zp.root.close()
    

    But it does not. I'll need to do more investigation.

    One option here is for Path to document that any zipfile passed to it is no longer valid and enforce that fact by disabling the original object passed to it.

    Another approach could be to try to use an adapter pattern to adapt the original ZipFile rather than clone it into a subclass.

    @jaraco
    Copy link
    Member

    jaraco commented Jul 24, 2020

    This routine will repro the issue without relying on garbage collection to trigger the error:

    import io
    import zipfile
    buf = io.BytesIO()
    zf = zipfile.ZipFile(buf, mode='w')
    zp = zipfile.Path(zf)
    with zp.joinpath('zile-a').open('w') as fp:
        fp.write('contents of file-a')
    zf.close()
    buf.close()
    zp.root.close()
    

    @mediagatewizard mediagatewizard mannequin added topic-XML topic-email performance Performance or resource usage and removed stdlib Python modules in the Lib dir 3.8 only security fixes 3.9 only security fixes type-bug An unexpected behavior, bug, or error labels Jul 24, 2020
    @tirkarthi
    Copy link
    Member

    Not sure if it's related to this issue there is an existing issue with ZipFile.__del__ : bpo-37773

    @akulakov
    Copy link
    Contributor

    Looks like a duplicate of https://bugs.python.org/issue40564 , which was fixed and closed so this can also be closed.

    @jaraco jaraco closed this as completed Apr 24, 2021
    @jaraco jaraco closed this as completed Apr 24, 2021
    @jaraco
    Copy link
    Member

    jaraco commented Apr 24, 2021

    Thanks Andrei

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.10 only security fixes performance Performance or resource usage topic-email topic-XML
    Projects
    None yet
    Development

    No branches or pull requests

    3 participants