classification
Title: ValueError: I/O operation on closed file. in ZipFile destructor
Type: crash Stage:
Components: Versions: Python 3.10, Python 3.9, Python 3.8
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: joernheissler, pitrou, serhiy.storchaka, vstinner, xtreak
Priority: normal Keywords: 3.8regression

Created on 2019-08-06 09:35 by joernheissler, last changed 2021-03-29 12:22 by vstinner.

Messages (8)
msg349104 - (view) Author: Jörn Heissler (joernheissler) * Date: 2019-08-06 09:35
When running this code:

from zipfile import ZipFile
import io

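# Removing this unused function makes the error disappear.
def foo():
    pass

data = io.BytesIO()
# The archive is never closed explicitly; ZipFile.__del__ runs at shutdown.
zf = ZipFile(data, "w")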


I get this message:

Exception ignored in: <function ZipFile.__del__ at 0x7f9005caa160>
Traceback (most recent call last):
  File "/home/user/git/oss/cpython/Lib/zipfile.py", line 1800, in __del__
  File "/home/user/git/oss/cpython/Lib/zipfile.py", line 1817, in close
ValueError: I/O operation on closed file.

Comment out def foo: pass, and there is no error.

It looks like the bug was introduced with commit ada319bb6d0ebcc68d3e0ef2b4279ea061877ac8 (bpo-32388).
msg349105 - (view) Author: Karthikeyan Singaravelan (xtreak) * (Python committer) Date: 2019-08-06 09:41
This looks like a regression in 3.8 so I have added 3.8 regression tag.
msg389594 - (view) Author: Jörn Heissler (joernheissler) * Date: 2021-03-27 10:11
Still reproducible in cpython 3.10.0a3 (debian unstable) and 3.10.0a6 (pyenv).
msg389605 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2021-03-27 15:28
Here is what happens. The function foo creates a reference loop: it holds a reference to the module dict, and the dict holds a reference to the function. The dict also holds references to the BytesIO and ZipFile objects. At the shutdown stage the garbage collector is called. It finds the reference loop, which contains the module dict, the function foo, and the BytesIO and ZipFile objects. It tries to break the loop by calling the destructor of an arbitrary object in the loop. In this case that is the BytesIO object. This does not break the loop, but eventually the destructor of the ZipFile object is called (either directly or after breaking the loop and destroying the module dict). The ZipFile destructor tries to close the BytesIO object, but it has already been closed by its own destructor.

It is a dangerous situation which can potentially cause data corruption (if the underlying stream is closed before its buffers are flushed). Clearing the module dict before invoking garbage collection would solve this issue. Making the garbage collector smarter, so that it avoids destroying objects when doing so does not break the loop, would help too, but that is a more complex problem and such a change could not be backported to 3.8.
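
The last step of that sequence can be reproduced without depending on shutdown ordering. The following is a minimal sketch, not the code path the interpreter actually takes: it closes the BytesIO by hand before the ZipFile, which mimics the order the garbage collector happened to choose. The helper name close_in_wrong_order is hypothetical.

```python
import io
from zipfile import ZipFile


def close_in_wrong_order():
    """Close the underlying stream first, then the archive.

    Returns the error message raised by ZipFile.close(), or None if
    no error was raised.
    """
    data = io.BytesIO()
    zf = ZipFile(data, "w")

    # Simulate the BytesIO destructor running before the ZipFile one.
    data.close()

    try:
        # close() has to write the end-of-archive records to the stream,
        # which is no longer possible.
        zf.close()
    except ValueError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    print(close_in_wrong_order())
```

Because close() clears its reference to the stream even when it fails, the destructor has nothing left to do afterwards and stays silent in this sketch.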

Victor, were there any changes in the garbage collector or interpreter shutdown code in 3.8?
msg389611 - (view) Author: Jörn Heissler (joernheissler) * Date: 2021-03-27 17:16
Thanks Serhiy for the explanation, it makes sense now.

Guess whatever I did back then (no idea what I was working on) was basically a mistake; I should have closed my ZipFile properly, e.g. by using context managers. So maybe it's not really a cpython bug.
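
For reference, a minimal sketch of the context-manager approach mentioned above. The helper name build_archive and the member name hello.txt are made up for illustration.

```python
import io
from zipfile import ZipFile


def build_archive():
    """Write a one-member archive into memory and return its bytes."""
    data = io.BytesIO()

    # Leaving the with block calls zf.close(), which writes the
    # end-of-archive records while the BytesIO is still open.
    with ZipFile(data, "w") as zf:
        zf.writestr("hello.txt", "hello")

    # ZipFile does not close a file object it was handed, so the
    # buffer can still be read here.
    return data.getvalue()


if __name__ == "__main__":
    payload = build_archive()
    with ZipFile(io.BytesIO(payload)) as zf:
        print(zf.namelist())
```

With the archive closed this way, nothing is left for the destructor to do at shutdown, so the order in which objects are finalized no longer matters.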

I ran a git-bisect back then and came up with https://hg.python.org/lookup/ada319bb6d0ebcc68d3e0ef2b4279ea061877ac8 as cause.
msg389689 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-03-29 12:18
> Victor, were there any changes in the garbage collector or interpreter shutdown code in 3.8?

I took some notes there: https://pythondev.readthedocs.io/finalization.html

I proposed bpo-42671 "Make the Python finalization more deterministic" but it will break some use cases, so I'm not comfortable with this idea yet :-(
msg389690 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-03-29 12:19
IMO ZipFile destructor should emit a ResourceWarning if it's not closed explicitly.
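
A rough sketch of what such a warning could look like, written as a subclass so it can be tried without patching zipfile. WarningZipFile is a hypothetical name and this is not how the standard library behaves today; it only illustrates the idea.

```python
import io
import warnings
from zipfile import ZipFile


class WarningZipFile(ZipFile):
    """Hypothetical ZipFile subclass that warns when it was never closed."""

    def __del__(self):
        # ZipFile.close() sets self.fp to None, so a non-None fp here
        # means the archive is being finalized without an explicit close().
        if getattr(self, "fp", None) is not None:
            warnings.warn(
                f"unclosed {type(self).__name__}",
                ResourceWarning,
                source=self,
            )
        # Keep the existing behaviour: the base destructor still calls close().
        super().__del__()


if __name__ == "__main__":
    warnings.simplefilter("always", ResourceWarning)
    zf = WarningZipFile(io.BytesIO(), "w")
    del zf  # on CPython the destructor runs here and emits the warning
```

ResourceWarning is ignored by default, so it would only show up with -W default, -X dev, or an explicit filter as in the usage lines above.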

Nothing in the Python specification gives you any guarantee that destructors will ever be called... Depending on the Python implementation and on many other things, you never know when a destructor is called.

Just call the close() method explicitly.
msg389691 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-03-29 12:22
FYI TextIOWrapper has a similar issue: bpo-17852. There was an attempt to fix the issue in 2017, but it had to be reverted.
History
Date                 User              Action  Args
2021-03-29 12:22:42  vstinner          set     messages: + msg389691
2021-03-29 12:19:54  vstinner          set     messages: + msg389690
2021-03-29 12:18:11  vstinner          set     messages: + msg389689
2021-03-27 17:16:45  joernheissler     set     messages: + msg389611
2021-03-27 15:28:19  serhiy.storchaka  set     nosy: + vstinner; messages: + msg389605
2021-03-27 10:11:53  joernheissler     set     messages: + msg389594; versions: + Python 3.10
2019-08-06 10:51:16  serhiy.storchaka  set     nosy: + serhiy.storchaka
2019-08-06 09:41:22  xtreak            set     keywords: + 3.8regression; nosy: + xtreak; messages: + msg349105
2019-08-06 09:35:50  joernheissler     create