classification
Title: zipfile module doesn't properly compress odt documents
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.2
process
Status: closed Resolution: third party
Dependencies: Superseder:
Assigned To: Nosy List: SilentGhost, alanmcintyre, r.david.murray, rai
Priority: normal Keywords:

Created on 2014-06-07 13:02 by rai, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Files
File name Uploaded Description
example.odt rai, 2014-06-07 13:02
Messages (6)
msg219933 - Author: Raimondo Giammanco (rai) Date: 2014-06-07 13:02
Steps to reproduce
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
-1- Create a document.odt containing an input (text) field and a conditional text field; the latter will show a different text based upon the content of the input text field. [use attached example.odt]
-2- Edit the file with the following code:

from zipfile import ZipFile, ZIP_DEFLATED
document = '/tmp/example.odt'                     # SET ME PLEASE
S2b, R2b = 'SUBST'.encode(), 'REPLACEMENT'.encode()
with ZipFile(document,'a', ZIP_DEFLATED) as z:
	xmlString = z.read('content.xml')
	xmlString = xmlString.replace(S2b, R2b)
	z.writestr('content.xml', xmlString)

-3- Open example.odt with *office

As `REPLACEMENT' is the requested string, one expects to see the relevant conditional text.
What happens: the LO function doesn't recognize the string unless one retypes it manually.

Omitting the ZIP_DEFLATED parameter (so letting zipfile use the default no-compression method) prevents this behaviour.


tested on
Python 2.7.3 and Python 3.2.3
Ubuntu 12.04 amd64
LibreOffice Version 4.0.4.2
msg219935 - Author: SilentGhost (SilentGhost) * (Python triager) Date: 2014-06-07 13:18
Raimondo, the documentation clearly states that the compression method is either inherited from the ZipInfo instance (when one is passed) or set to ZIP_STORED otherwise. Since you're not passing a ZipInfo instance but a string (as the first argument to .writestr), the compression method is set to ZIP_STORED. If you set it to ZIP_DEFLATED explicitly, it would work as you expect. In either case, this behaviour is in accordance with the documentation.
msg220004 - Author: Raimondo Giammanco (rai) Date: 2014-06-07 23:57
SilentGhost, thank you for your reply, but I am probably missing something. Maybe there is some misunderstanding because of my unclear report. Please let me sum up my point, and excuse some repetitiveness.

From the documentation of .writestr:
``If given, compress_type overrides the value given for the compression parameter to the constructor for the new entry``
I understood this to mean that .writestr would use the same compression type that was passed when creating the `z' instance. So, having already passed ZIP_DEFLATED to the constructor, passing it again would, in my opinion, have been a useless repetition.
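
A minimal sketch for checking this reading of the documentation, assuming a current CPython 3 with zlib available. The helper name compress_type_of_new_entry and the in-memory io.BytesIO archive are illustrative assumptions, not part of the report.

```python
import io
from zipfile import ZipFile, ZIP_DEFLATED, ZIP_STORED


def compress_type_of_new_entry(archive_compression, compress_type=None):
    """Write one entry with writestr() and report how it was stored.

    Hypothetical helper: the archive lives in memory, so no real
    .odt file is touched.
    """
    buffer = io.BytesIO()
    with ZipFile(buffer, 'w', archive_compression) as z:
        # The first argument is a plain archive name, not a ZipInfo instance.
        z.writestr('content.xml', b'<text>SUBST</text>' * 50, compress_type)
    with ZipFile(buffer) as z:
        return z.getinfo('content.xml').compress_type


if __name__ == '__main__':
    print(compress_type_of_new_entry(ZIP_DEFLATED) == ZIP_DEFLATED)
    print(compress_type_of_new_entry(ZIP_STORED, ZIP_DEFLATED) == ZIP_DEFLATED)
```

In this sketch, an entry written with only an archive name takes the compression passed to the constructor, and an explicit compress_type overrides it.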

However, as per your suggestion, I tried to explicitly pass ZIP_DEFLATED to .writestr too:

from zipfile import ZipFile, ZIP_DEFLATED
document = '/tmp/example.odt'
S2b, R2b = 'SUBST'.encode(), 'REPLACEMENT'.encode()
with ZipFile(document, 'a', ZIP_DEFLATED) as z:
    xmlString = z.read('content.xml')
    xmlString = xmlString.replace(S2b, R2b)
    z.writestr('content.xml', xmlString, ZIP_DEFLATED)

but to no avail: with or without passing ZIP_DEFLATED to .writestr, the odt documents lose the feature explained in my first post.

AFAICT, the only way to keep a fully functional odt document is not to compress it (no ZIP_DEFLATED at all), as mentioned in my previous post:

with ZipFile(document, 'a') as z:
    xmlString = z.read('content.xml')
    xmlString = xmlString.replace(S2b, R2b)
    z.writestr('content.xml', xmlString)
msg220026 - Author: SilentGhost (SilentGhost) * (Python triager) Date: 2014-06-08 08:16
Whether because of a slightly different setup or something else, I'm not able to reproduce the issue. What I do see is that the field is not automatically updated, so on opening the document I have to hit F9 to get the "answer" field updated. That doesn't, however, seem at all related to the compression method (that is, I get the same behaviour for any combination of compression values). Perhaps someone else would have a better idea.
msg220102 - Author: Raimondo Giammanco (rai) Date: 2014-06-09 17:11
hit F9 ?!?
I feel ashamed. The need to recalculate the fields simply slipped my mind.
Of course, Writer has to be told in some way about the string replacement. Maybe my mistake is partially justifiable if one considers the autoupdate behaviour with the uncompressed documents?

Anyway, the workaround of zipping without compression is OK for me, as LibreOffice will actually compress the document on first saving.
msg220110 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2014-06-09 19:08
So if I'm understanding correctly, the Python update to the file happens correctly in both cases, and the update not being immediately visible is an issue on the OpenOffice side of things. So I'm closing this as a 3rd party bug (though it sounds like it may not really be a bug).
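
A related point can be checked from the Python side with a small sketch: opening the archive in mode 'a' and calling writestr() with an existing name adds a second entry with that name rather than replacing the first. The sketch below, which uses in-memory archives and the hypothetical helper names make_package, append_replacement and rewrite_replacement, shows that behaviour next to one possible alternative that copies every entry into a fresh archive. It assumes a current CPython 3 with zlib available and does not model a complete ODF package.

```python
import io
import warnings
from zipfile import ZipFile, ZIP_DEFLATED, ZIP_STORED


def make_package():
    """Build a tiny stand-in for an .odt package in memory (illustrative only)."""
    buffer = io.BytesIO()
    with ZipFile(buffer, 'w') as z:
        z.writestr('mimetype', b'application/vnd.oasis.opendocument.text', ZIP_STORED)
        z.writestr('content.xml', b'<text>SUBST</text>', ZIP_DEFLATED)
    return buffer


def append_replacement(archive, name, old, new):
    """Mimic the report: mode 'a' appends a second entry with the same name."""
    with warnings.catch_warnings():
        # zipfile emits a "Duplicate name" UserWarning here; silence it for the demo.
        warnings.simplefilter('ignore', UserWarning)
        with ZipFile(archive, 'a', ZIP_DEFLATED) as z:
            z.writestr(name, z.read(name).replace(old, new))


def rewrite_replacement(source, target, name, old, new):
    """Copy every entry into a fresh archive, substituting text in one member."""
    with ZipFile(source) as zin, ZipFile(target, 'w', ZIP_DEFLATED) as zout:
        for info in zin.infolist():
            data = zin.read(info)
            if info.filename == name:
                data = data.replace(old, new)
            # Passing the original ZipInfo keeps the entry order and each
            # entry's own compress_type (e.g. a stored 'mimetype' entry).
            zout.writestr(info, data)


if __name__ == '__main__':
    appended = make_package()
    append_replacement(appended, 'content.xml', b'SUBST', b'REPLACEMENT')
    with ZipFile(appended) as z:
        print(z.namelist())

    rewritten = io.BytesIO()
    rewrite_replacement(make_package(), rewritten, 'content.xml', b'SUBST', b'REPLACEMENT')
    with ZipFile(rewritten) as z:
        print(z.namelist())
```

Reading 'content.xml' back with zipfile returns the replaced text in both cases; whether a given office suite handles the duplicate entry the same way is not established in this thread.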
History
Date                 User            Action  Args
2022-04-11 14:58:04  admin           set     github: 65884
2014-06-09 19:08:47  r.david.murray  set     status: open -> closed; nosy: + r.david.murray; messages: + msg220110; resolution: third party; stage: resolved
2014-06-09 17:11:06  rai             set     messages: + msg220102
2014-06-08 08:16:32  SilentGhost     set     nosy: + alanmcintyre; messages: + msg220026
2014-06-07 23:57:29  rai             set     messages: + msg220004
2014-06-07 13:18:52  SilentGhost     set     nosy: + SilentGhost; messages: + msg219935
2014-06-07 13:02:05  rai             create