msg397483 - (view) |
Author: Christian Steinmeyer (christian.steinmeyer) |
Date: 2021-07-14 14:48 |
When executing the code below with the attached zip file (or any other that has one or more files directly at root level), I get a "ValueError: seek of closed file". It seems that the zipfile handle held by the `TestClass` instance is closed when the `zipfile.Path` is garbage collected after it is no longer referenced. Since `zipfile.Path` even takes a `zipfile.ZipFile` as an argument, I don't think this is intended. It surprised me at least.
```
import zipfile


class TestClass:
    def __init__(self, path):
        self.zip_file = zipfile.ZipFile(path)

    def iter_dir(self):
        return [each.name for each in zipfile.Path(self.zip_file).iterdir()]

    def read(self, filename):
        with self.zip_file.open(filename) as file:
            print(file.read())

root = "zipfile.zip"
test = TestClass(root)
files = test.iter_dir()
test.read(files[0])
```
|
msg397485 - (view) |
Author: Karthikeyan Singaravelan (xtreak) * |
Date: 2021-07-14 15:26 |
This seems similar to https://bugs.python.org/issue40564
|
msg397520 - (view) |
Author: Jack DeVries (jack__d) * |
Date: 2021-07-15 00:43 |
I'm not able to reproduce this on my machine; the script runs without any issue.
> the `TestClass` instance is being closed
What do you mean by this statement? You aren't doing anything to TestClass or its instance ("test") in this script. They remain in scope, so they will always be referenced.
|
msg397524 - (view) |
Author: Christian Steinmeyer (christian.steinmeyer) |
Date: 2021-07-15 07:11 |
I work on macOS 11.4 (20F71) (Kernel Version: Darwin 20.5.0).
My python version is 3.8.9 and zipp is at 3.5.0 (but 3.4.1 behaves the same for me).
For me, this behavior is reproducible.
Let me try to clarify what I mean.
test = TestClass(root) # this creates a zipfile handle (an instance of zipfile.ZipFile) at test.zip_file
files = test.iter_dir() # this creates multiple instances of zipfile.Path() as part of the list comprehension, and these are dereferenced afterwards. I found that test.zip_file.fp is closed after this line executes, which suggests to me that discarding the zipfile.Path also closes the zipfile.ZipFile that was used to create it.
test.read(files[0]) # this should in theory try to read from the test.zip_file for the first time, but fails because it is closed as per the above.
Here is the full stack trace:
Traceback (most recent call last):
  File "test.py", line 20, in <module>
    test.read(files[0])
  File "test.py", line 12, in read
    with self.zip_file.open(filename) as file:
  File "/usr/local/opt/python@3.8/Frameworks/Python.framework/Versions/3.8/lib/python3.8/zipfile.py", line 1530, in open
    fheader = zef_file.read(sizeFileHeader)
  File "/usr/local/opt/python@3.8/Frameworks/Python.framework/Versions/3.8/lib/python3.8/zipfile.py", line 763, in read
    self._file.seek(self._pos)
ValueError: seek of closed file
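The observation about `fp` described above can also be checked directly, without waiting for the traceback. This is a minimal sketch and not part of the original report: the archive name `example.zip` and the member `hello.txt` are made up for illustration, and `ZipFile.fp` is an attribute that is not part of the documented API.

```python
import os
import tempfile
import zipfile

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical archive with a single file at root level.
    archive = os.path.join(tmp, "example.zip")
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr("hello.txt", "hi")

    zip_file = zipfile.ZipFile(archive)
    names = [each.name for each in zipfile.Path(zip_file).iterdir()]

    # On affected versions the temporary Path object is gone by now and
    # the underlying file object has already been closed.
    handle_closed = zip_file.fp is None or zip_file.fp.closed
    print("handle closed after iterdir():", handle_closed)

    # Close explicitly if the handle survived, so the temp dir can be removed.
    if not handle_closed:
        zip_file.close()
```

If this prints `True`, any later `zip_file.open(...)` call can be expected to fail the same way as in the traceback.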
|
msg397590 - (view) |
Author: Jason R. Coombs (jaraco) * |
Date: 2021-07-16 00:26 |
I was able to replicate the error using the script as posted:
```
draft $ cat > issue44638.py
import zipfile


class TestClass:
    def __init__(self, path):
        self.zip_file = zipfile.ZipFile(path)

    def iter_dir(self):
        return [each.name for each in zipfile.Path(self.zip_file).iterdir()]

    def read(self, filename):
        with self.zip_file.open(filename) as file:
            print(file.read())

root = "zipfile.zip"
test = TestClass(root)
files = test.iter_dir()
test.read(files[0])
draft $ python -m zipfile -c zipfile.zip issue44638.py
draft $ python issue44638.py
Traceback (most recent call last):
  File "/Users/jaraco/draft/issue44638.py", line 18, in <module>
    test.read(files[0])
  File "/Users/jaraco/draft/issue44638.py", line 12, in read
    with self.zip_file.open(filename) as file:
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/zipfile.py", line 1518, in open
    fheader = zef_file.read(sizeFileHeader)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/zipfile.py", line 741, in read
    self._file.seek(self._pos)
ValueError: seek of closed file
```
|
msg397591 - (view) |
Author: Jason R. Coombs (jaraco) * |
Date: 2021-07-16 00:39 |
Here's a much simpler repro that avoids the class construction but triggers the same error:
```
import zipfile
zip_file = zipfile.ZipFile('zipfile.zip')
names = [each.name for each in zipfile.Path(zip_file).iterdir()]
with zip_file.open(names[0]) as file:
    print(file.read())
```
|
msg397592 - (view) |
Author: Jason R. Coombs (jaraco) * |
Date: 2021-07-16 00:40 |
Even simpler:
```
import zipfile
zip_file = zipfile.ZipFile('zipfile.zip')
names = [each.name for each in zipfile.Path(zip_file).iterdir()]
zip_file.open(names[0])
```
|
msg397593 - (view) |
Author: Jason R. Coombs (jaraco) * |
Date: 2021-07-16 00:55 |
This also reproduces the failure:
```
zip_file = zipfile.ZipFile('zipfile.zip')
path = zipfile.Path(zip_file)
name = zip_file.namelist()[0]
del path
zip_file.open(name)
```
Removing `del path` bypasses the issue. Something about the destructor for zipfile.Path is causing the closing of the handle for zip_file.
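One plausible mechanism, consistent with the linked bpo-40564, is that the wrapper holds a second object sharing the same file handle, and that object's cleanup closes the handle for everyone. The sketch below illustrates only that general pattern; the `Handle` class and the `copy_state` helper are hypothetical and are not the zipfile implementation.

```python
import io


class Handle:
    """Hypothetical stand-in for an object that owns a file and closes it on cleanup."""

    def __init__(self, fp):
        self.fp = fp

    def close(self):
        self.fp.close()

    def __del__(self):
        # Cleanup closes the file, even if another object still uses it.
        self.close()


def copy_state(source):
    """Build a second Handle that shares the source's attributes, including fp."""
    clone = Handle.__new__(Handle)
    vars(clone).update(vars(source))
    return clone


original = Handle(io.BytesIO(b"data"))
clone = copy_state(original)

# On CPython, dropping the last reference runs the clone's __del__ right away,
# which closes the file object that `original` still refers to.
del clone
print(original.fp.closed)
```

Keeping the second object alive, as in the variant without `del path`, postpones the cleanup and so hides the problem.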
|
msg397594 - (view) |
Author: Jason R. Coombs (jaraco) * |
Date: 2021-07-16 01:00 |
Even simpler:
```
zip_file = zipfile.ZipFile('zipfile.zip')
name = zip_file.namelist()[0]
zipfile.Path(zip_file)
zip_file.open(name)
```
|
msg397595 - (view) |
Author: Jason R. Coombs (jaraco) * |
Date: 2021-07-16 01:11 |
Changing the repro to:
```
import zipfile
try:
    import zipp
except ImportError:
    import zipfile as zipp
zip_file = zipfile.ZipFile('zipfile.zip')
name = zip_file.namelist()[0]
zipp.Path(zip_file)
zip_file.open(name)
```
With this, I can test against either zipfile or zipp, and I notice that the issue occurs only on zipp<3.2 or Python<3.10.
```
draft $ pip-run -q 'zipp<3.3' -- issue44638.py
draft $ pip-run -q 'zipp<3.2' -- issue44638.py
Traceback (most recent call last):
  File "/Users/jaraco/draft/issue44638.py", line 11, in <module>
    zip_file.open(name)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/zipfile.py", line 1518, in open
    fheader = zef_file.read(sizeFileHeader)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/zipfile.py", line 741, in read
    self._file.seek(self._pos)
ValueError: seek of closed file
```
```
draft $ python3.10 issue44638.py
draft $ python3.9 issue44638.py
Traceback (most recent call last):
  File "/Users/jaraco/draft/issue44638.py", line 11, in <module>
    zip_file.open(name)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/zipfile.py", line 1518, in open
    fheader = zef_file.read(sizeFileHeader)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/zipfile.py", line 741, in read
    self._file.seek(self._pos)
ValueError: seek of closed file
```
Looking at the changelog (https://zipp.readthedocs.io/en/latest/history.html#v3-2-0), it's clear now that this issue is a duplicate of bpo-40564 and the problem goes away using the original repro and Python 3.10:
```
draft $ cat > issue44638.py
import zipfile


class TestClass:
    def __init__(self, path):
        self.zip_file = zipfile.ZipFile(path)

    def iter_dir(self):
        return [each.name for each in zipfile.Path(self.zip_file).iterdir()]

    def read(self, filename):
        with self.zip_file.open(filename) as file:
            print(file.read())

root = "zipfile.zip"
test = TestClass(root)
files = test.iter_dir()
test.read(files[0])
draft $ python3.10 issue44638.py
b'import zipfile\n\n\nclass TestClass:\n def __init__(self, path):\n self.zip_file = zipfile.ZipFile(path)\n\n def iter_dir(self):\n return [each.name for each in zipfile.Path(self.zip_file).iterdir()]\n\n def read(self, filename):\n with self.zip_file.open(filename) as file:\n print(file.read())\n\nroot = "zipfile.zip"\ntest = TestClass(root)\nfiles = test.iter_dir()\ntest.read(files[0])\n'
```
The solution is to use zipp>=3.2 or Python 3.10.
|
msg397596 - (view) |
Author: Jason R. Coombs (jaraco) * |
Date: 2021-07-16 01:16 |
> My python version is 3.8.9 and zipp is at 3.5.0 (but 3.4.1 behaves the same for me).
It's not enough to have `zipp` 3.5.0. You need to use `zipp.Path` over `zipfile.Path`.
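For code that has to run on several Python versions, one possible way to pick the implementation is sketched below. The version check mirrors the finding above (fixed in Python 3.10 and in zipp 3.2). The fallback to `zipfile.Path` when zipp is not installed, and the in-memory archive containing `hello.txt`, are assumptions made for the example.

```python
import io
import sys
import zipfile

if sys.version_info >= (3, 10):
    from zipfile import Path
else:
    try:
        from zipp import Path  # backport; needs zipp>=3.2 for this fix
    except ImportError:
        from zipfile import Path  # last resort; may still show the bug

# Build a small archive in memory (hypothetical content).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("hello.txt", "hi")

zip_file = zipfile.ZipFile(buffer)
root = Path(zip_file)  # keeping a reference is harmless on any version
names = [each.name for each in root.iterdir()]
data = zip_file.read(names[0])
print(names, data)
zip_file.close()
```

Binding the `Path` to a name (`root` here) also sidesteps the premature close on older versions, since the object is not collected while the handle is still in use.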
|
msg397601 - (view) |
Author: Christian Steinmeyer (christian.steinmeyer) |
Date: 2021-07-16 07:01 |
Thank you for the in-depth look, Jason!
That last comment was especially useful to me. Perhaps it would make sense to add something like this to the zipfile documentation.
I'm not sure what the best hint would be, but a note in the zipfile.Path documentation that zipp.Path can be used to access newer functionality on older Python versions (if I understand correctly) might be useful to others as well. As of now, I cannot find an equivalent hint.
|
msg397620 - (view) |
Author: miss-islington (miss-islington) |
Date: 2021-07-16 13:15 |
New changeset 29358e93f2bb60983271c14ce4c2f3eab35a60ca by Jason R. Coombs in branch 'main':
bpo-44638: Add a reference to the zipp project and hint as to how to use it. (GH-27188)
https://github.com/python/cpython/commit/29358e93f2bb60983271c14ce4c2f3eab35a60ca
|
Date | User | Action | Args |
2022-04-11 14:59:47 | admin | set | github: 88804 |
2021-07-16 13:15:04 | miss-islington | set | nosy: + miss-islington; messages: + msg397620 |
2021-07-16 12:57:57 | jaraco | set | pull_requests: + pull_request25724 |
2021-07-16 07:01:22 | christian.steinmeyer | set | messages: + msg397601 |
2021-07-16 01:16:45 | jaraco | set | messages: + msg397596 |
2021-07-16 01:12:12 | jaraco | set | status: open -> closed; superseder: Using zipfile.Path with several files prematurely closes zip; resolution: duplicate; stage: resolved |
2021-07-16 01:11:49 | jaraco | set | messages: + msg397595 |
2021-07-16 01:00:15 | jaraco | set | messages: + msg397594 |
2021-07-16 00:55:55 | jaraco | set | messages: + msg397593 |
2021-07-16 00:40:57 | jaraco | set | messages: + msg397592 |
2021-07-16 00:39:26 | jaraco | set | messages: + msg397591 |
2021-07-16 00:26:56 | jaraco | set | messages: + msg397590 |
2021-07-15 07:11:29 | christian.steinmeyer | set | messages: + msg397524 |
2021-07-15 00:43:44 | jack__d | set | nosy: + jack__d; messages: + msg397520 |
2021-07-14 15:26:08 | xtreak | set | nosy: + jaraco, xtreak; messages: + msg397485 |
2021-07-14 14:48:27 | christian.steinmeyer | create | |