classification
Title: zipfile: Seeking encrypted file breaks after seeking backwards
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.9, Python 3.8, Python 3.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: dhillier, miss-islington, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2019-10-01 08:17 by dhillier, last changed 2019-10-29 00:13 by dhillier. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 16529 open dhillier, 2019-10-02 00:23
PR 16937 merged serhiy.storchaka, 2019-10-26 21:12
PR 16951 merged miss-islington, 2019-10-27 08:22
PR 16952 merged miss-islington, 2019-10-27 08:22
Messages (10)
msg353646 - (view) Author: Daniel Hillier (dhillier) * Date: 2019-10-01 08:17
Seeking back beyond the decrypted / unzipped buffer doesn't reset the decrypter's crc key values. All data read after seeking back beyond the buffer is garbled.
msg355393 - (view) Author: Daniel Hillier (dhillier) * Date: 2019-10-25 23:35
Hi,

I have another patch I would like to contribute to the zipfile module but would like to request a review of this one to minimise conflicts with later patches. If anyone is able to review the patch, I would really appreciate it :)

Also, with regards to selecting Python versions for this issue, is there any rule of thumb I should be following?

Thanks,
Dan
msg355433 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2019-10-26 21:13
PR 16937 is simpler. It does not change the decrypter.
msg355436 - (view) Author: Daniel Hillier (dhillier) * Date: 2019-10-26 22:41
Thanks for looking at the PR.


I got carried away refactoring the decrypter for a future scenario where there could be different decrypters (possibly using certificates too) :) Your PR is much simpler.

Would you also be able to take a look at some other PRs I've submitted for zipfile. They are both pretty small changes:

https://bugs.python.org/issue36993
https://bugs.python.org/issue37523

Thanks again,
Dan
msg355446 - (view) Author: Daniel Hillier (dhillier) * Date: 2019-10-27 05:53
I also think that the `read_init` method in my PR is a useful refactor as it locates all the state that needs to be (re)set when starting a read into the same location.

At the moment this state is set in 1) __init__ and 2) the seek method when seeking back beyond what is in the buffer.
msg355453 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2019-10-27 08:22
New changeset 5c32af7522d908e8c7da0243af37618433289cc5 by Serhiy Storchaka in branch 'master':
bpo-38334: Fix seeking backward on an encrypted zipfile.ZipExtFile. (GH-16937)
https://github.com/python/cpython/commit/5c32af7522d908e8c7da0243af37618433289cc5
msg355455 - (view) Author: miss-islington (miss-islington) Date: 2019-10-27 08:40
New changeset 76fbdaa2a693caaa1b8eb34104720fc774ff80df by Miss Skeleton (bot) in branch '3.8':
bpo-38334: Fix seeking backward on an encrypted zipfile.ZipExtFile. (GH-16937)
https://github.com/python/cpython/commit/76fbdaa2a693caaa1b8eb34104720fc774ff80df
msg355456 - (view) Author: miss-islington (miss-islington) Date: 2019-10-27 08:41
New changeset ed2db3113d54a5f8af1b419a9f5fdf3642285c82 by Miss Skeleton (bot) in branch '3.7':
bpo-38334: Fix seeking backward on an encrypted zipfile.ZipExtFile. (GH-16937)
https://github.com/python/cpython/commit/ed2db3113d54a5f8af1b419a9f5fdf3642285c82
msg355461 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2019-10-27 10:19
> I got carried away refactoring the decrypter for a future scenario where there could be different decrypters (possibly using certificates too) :)

The decrypter was implemented with a generator for performance. The performance of decrypting is not critical, but this got us 2x speed up without complicating the code. See issue10030. If the support of other decrypters be added, they will likely be implemented in C in any case.

> I also think that the `read_init` method in my PR is a useful refactor as it locates all the state that needs to be (re)set when starting a read into the same location.

This is a different issue.

> Would you also be able to take a look at some other PRs I've submitted for zipfile.

Thank you, they look good to me. I left just few comments.

As for the original issue, I have doubts that backward seeking in a compressed file is a good idea. But this feature was added, and it worked with encrypted files if seek in the range of the buffer, so not working out of the range of the buffer is a bug which we should fix.
msg355604 - (view) Author: Daniel Hillier (dhillier) * Date: 2019-10-29 00:13
Thanks for your help! Good point, I'll create a new change for the refactoring.
History
Date User Action Args
2019-10-29 00:13:04dhilliersetmessages: + msg355604
2019-10-27 10:19:02serhiy.storchakasetstatus: open -> closed
versions: + Python 3.7, Python 3.8
type: enhancement -> behavior
messages: + msg355461

resolution: fixed
stage: patch review -> resolved
2019-10-27 08:41:30miss-islingtonsetmessages: + msg355456
2019-10-27 08:40:47miss-islingtonsetnosy: + miss-islington
messages: + msg355455
2019-10-27 08:22:51miss-islingtonsetpull_requests: + pull_request16480
2019-10-27 08:22:44miss-islingtonsetpull_requests: + pull_request16479
2019-10-27 08:22:19serhiy.storchakasetmessages: + msg355453
2019-10-27 05:53:24dhilliersetmessages: + msg355446
2019-10-26 22:41:11dhilliersetmessages: + msg355436
2019-10-26 21:13:50serhiy.storchakasetmessages: + msg355433
2019-10-26 21:12:30serhiy.storchakasetpull_requests: + pull_request16466
2019-10-25 23:35:44dhilliersetmessages: + msg355393
2019-10-14 00:16:45dhilliersetnosy: + serhiy.storchaka
2019-10-02 00:23:55dhilliersetkeywords: + patch
stage: patch review
pull_requests: + pull_request16120
2019-10-01 08:17:22dhilliercreate