zipfile increase in size #72905
Comments
I am the current maintainer of WebOb, and I noticed that a test started failing on Python 3.6 and 3.7. Granted, the test checks the size of the file created, which is not the brightest idea in a test, but it has been stable since Python 2.5... https://travis-ci.org/Pylons/webob/jobs/176505096#L224 shows the failure.
_________________________ test_response_file_body_tell _________________________
    def test_response_file_body_tell():
        import zipfile
        from webob.response import ResponseBodyFile
        rbo = ResponseBodyFile(Response())
        assert rbo.tell() == 0
        writer = zipfile.ZipFile(rbo, 'w')
        writer.writestr('zinfo_or_arcname', b'foo')
        writer.close()
>       assert rbo.tell() == 133
E       assert 145 == 133
E + where 145 = <bound method ResponseBodyFile.tell of <body_file for <Response at 0x7fa6291f9eb8 200 OK>>>()
E + where <bound method ResponseBodyFile.tell of <body_file for <Response at 0x7fa6291f9eb8 200 OK>>> = <body_file for <Response at 0x7fa6291f9eb8 200 OK>>.tell
tests/test_response.py:608: AssertionError
I am not sure that this is necessarily a bug, but it would be good to know why files are no longer generated the same way. |
Could you get a dump of rbo data? |
It's literally the string written: writer.writestr('zinfo_or_arcname', b'foo'). rbo in this case is a simple file-like object. I can get dumps from Python 3.5 and Python 3.6 if necessary. |
Please make a dump. It should include not just the string written, but also headers and other special fields. I tried with rbo = io.BytesIO() and got rbo.tell() == 133. There must be a difference between io.BytesIO and ResponseBodyFile. Maybe ResponseBodyFile is not seekable. |
Here's a dump from Python 3.6: b'PK\x03\x04\x14\x00\x08\x00\x00\x00\xc0~pI\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x10\x00\x00\x00zinfo_or_arcnamefoo!es\x8c\x03\x00\x00\x00\x03\x00\x00\x00PK\x01\x02\x14\x03\x14\x00\x08\x00\x00\x00\xc0~pI!es\x8c\x03\x00\x00\x00\x03\x00\x00\x00\x10\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x80\x01\x00\x00\x00\x00zinfo_or_arcnamePK\x05\x06\x00\x00\x00\x00\x01\x00\x01\x00>\x00\x00\x00=\x00\x00\x00\x00\x00' You are correct that ResponseBodyFile does not have a seek() method and is not seekable. Adding seek() to ResponseBodyFile might be a little more complicated... |
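For anyone following along, the first entry of the dump above can be decoded with struct to see what the local file header actually contains. This is only a sketch: the field layout is the standard 30-byte ZIP local file header, the variable names are made up for illustration, and the payload length of 3 is taken from the b'foo' written in the test.

```python
import struct
import zlib

# First 61 bytes of the Python 3.6 dump above:
# local file header, file name, file data, and the 12 bytes that follow the data.
dump = (b'PK\x03\x04\x14\x00\x08\x00\x00\x00\xc0~pI'
        b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
        b'\x10\x00\x00\x00zinfo_or_arcnamefoo'
        b'!es\x8c\x03\x00\x00\x00\x03\x00\x00\x00')

# Standard local file header layout: signature, version needed, flags,
# compression method, mod time, mod date, CRC-32, compressed size,
# uncompressed size, file name length, extra field length.
LOCAL_HEADER = struct.Struct('<4sHHHHHIIIHH')
(sig, version, flags, method, mtime, mdate,
 crc, csize, usize, name_len, extra_len) = LOCAL_HEADER.unpack_from(dump, 0)

name_start = LOCAL_HEADER.size
data_start = name_start + name_len + extra_len
name = dump[name_start:name_start + name_len]

payload_len = 3  # length of b'foo', known from the test rather than from the header
payload = dump[data_start:data_start + payload_len]

# The 12 bytes after the data: CRC-32, compressed size, uncompressed size.
trailer = struct.unpack_from('<III', dump, data_start + payload_len)

print('flags: %#06x' % flags)
print('header crc/sizes:', (crc, csize, usize))
print('trailer crc/sizes:', trailer)
```

In this dump the flags field has bit 3 (0x08) set, the CRC and size fields in the header are zero, and the real values follow the file data instead. Those 12 trailing bytes account for the difference between 133 and 145.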
If the output file is not seekable, zipfile sets bit 3 in the file header flags and writes 12 or 20 (if the ZIP64 extension is used) additional bytes after the compressed data. These bytes contain the CRC and the compressed and uncompressed sizes. The corresponding fields in the local file header are set to zero. In the case of writestr() this can be considered a regression, since the CRC and sizes can be calculated before writing the compressed data and saved in the local file header. But it would not be easy to fix this. |
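The behaviour described above can be reproduced without WebOb. The sketch below is an assumption-laden stand-in: NonSeekableWriter is a hypothetical write-only stream invented here to play the role of ResponseBodyFile (it is not WebOb code), and make_zip and local_flags are helper names made up for the example.

```python
import io
import struct
import zipfile


class NonSeekableWriter(io.RawIOBase):
    """Hypothetical write-only stream without seek()/tell() support."""

    def __init__(self):
        super().__init__()
        self._chunks = []

    def writable(self):
        return True

    def seekable(self):
        return False

    def write(self, data):
        chunk = bytes(data)
        self._chunks.append(chunk)
        return len(chunk)

    def getvalue(self):
        return b''.join(self._chunks)


def make_zip(fileobj):
    # Same calls as the failing test: one stored member containing b'foo'.
    with zipfile.ZipFile(fileobj, 'w') as writer:
        writer.writestr('zinfo_or_arcname', b'foo')
    return fileobj.getvalue()


def local_flags(data):
    # General purpose flag bits sit at offset 6 of the local file header.
    return struct.unpack_from('<H', data, 6)[0]


seekable_bytes = make_zip(io.BytesIO())
streamed_bytes = make_zip(NonSeekableWriter())

print('seekable:     %d bytes, flags %#06x'
      % (len(seekable_bytes), local_flags(seekable_bytes)))
print('non-seekable: %d bytes, flags %#06x'
      % (len(streamed_bytes), local_flags(streamed_bytes)))
```

The non-seekable output should come out larger and have bit 3 set, while the io.BytesIO output should not. The exact size difference may depend on the Python version: the dump in this issue shows 12 extra bytes, and newer versions appear to also write an optional 4-byte signature in front of them. Either archive can still be read back normally, which suggests that a test comparing contents rather than rbo.tell() would be more robust.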