classification
Title: MemoryError on zip.read in shutil._unpack_zipfile
Type: resource usage
Stage: commit review
Components: Library (Lib)
Versions: Python 3.11, Python 3.10, Python 3.9
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: gregory.p.smith
Nosy List: gregory.p.smith, igorvoltaic, malin, miss-islington, python-dev
Priority: normal
Keywords: patch

Created on 2021-03-28 21:21 by igorvoltaic, last changed 2021-05-17 17:36 by gregory.p.smith. This issue is now closed.

Pull Requests
URL Status Linked
PR 25058 merged python-dev, 2021-03-28 22:13
PR 26190 merged miss-islington, 2021-05-17 08:28
PR 26191 merged miss-islington, 2021-05-17 08:28
Messages (5)
msg389652 - Author: Igor Bolshakov (igorvoltaic) * Date: 2021-03-28 21:21
MemoryError: null
  ...
  File "....", line 13, in repack__file
    shutil.unpack_archive(local_file_path, local_dir)
  File "python3.6/shutil.py", line 983, in unpack_archive
    func(filename, extract_dir, **kwargs)
  File "python3.6/shutil.py", line 901, in _unpack_zipfile
    data = zip.read(info.filename)
  File "python3.6/zipfile.py", line 1338, in read
    return fp.read()
  File "python3.6/zipfile.py", line 858, in read
    buf += self._read1(self.MAX_N)
  File "python3.6/zipfile.py", line 948, in _read1
    data = self._decompressor.decompress(data, n)

shutil.unpack_archive tries to read each whole file into memory, without making use of any buffer at all. Python crashes with a MemoryError for really large files. In my case the archive is ~1.7G and the unpacked data is ~10G. Interestingly, zipfile.ZipFile.extractall handles this case more effectively.
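
To illustrate the difference described above, here is a minimal sketch of a buffered extraction that copies each archive member in fixed-size chunks instead of calling zip.read() on the whole member. The function name unpack_zip_streaming and the chunk_size parameter are hypothetical and chosen only for this example. The sketch is not the merged patch, and its path check is deliberately simplistic.

```python
import os
import shutil
import zipfile


def unpack_zip_streaming(zip_path, extract_dir, chunk_size=1024 * 1024):
    """Extract zip_path into extract_dir, copying each member in chunks.

    Hypothetical helper for illustration; not part of shutil.
    """
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            name = info.filename
            # Simplistic safety check (an assumption of this sketch):
            # skip absolute paths and anything containing "..".
            if name.startswith("/") or ".." in name:
                continue
            target = os.path.join(extract_dir, *name.split("/"))
            if info.is_dir():
                os.makedirs(target, exist_ok=True)
                continue
            os.makedirs(os.path.dirname(target) or ".", exist_ok=True)
            # archive.open() returns a file-like object that decompresses
            # lazily, so only about chunk_size bytes are held at a time.
            with archive.open(info, "r") as source, open(target, "wb") as dest:
                shutil.copyfileobj(source, dest, chunk_size)
```

With this approach, peak memory depends on the chunk size rather than on the uncompressed size of the largest member, which is why a ~10G payload no longer has to fit in RAM.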
msg393692 - Author: Igor Bolshakov (igorvoltaic) * Date: 2021-05-14 20:42
Please review.
msg393819 - Author: miss-islington (miss-islington) Date: 2021-05-17 17:35
New changeset 049c4125f8a2b482c6129db68463f58e20c31526 by Miss Islington (bot) in branch '3.9':
bpo-43650: Fix MemoryError on zip.read in shutil._unpack_zipfile for large files (GH-25058)
https://github.com/python/cpython/commit/049c4125f8a2b482c6129db68463f58e20c31526
msg393820 - Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2021-05-17 17:35
New changeset 7a588621c2854bcef6ce9eeb54349b84ac078c45 by Miss Islington (bot) in branch '3.10':
bpo-43650: Fix MemoryError on zip.read in shutil._unpack_zipfile for large files (GH-25058) (GH-26190)
https://github.com/python/cpython/commit/7a588621c2854bcef6ce9eeb54349b84ac078c45
msg393821 - Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2021-05-17 17:36
Thanks for the patch!
History
Date User Action Args
2021-05-17 17:36:05 gregory.p.smith set status: open -> closed; resolution: remind -> fixed; messages: + msg393821; stage: patch review -> commit review
2021-05-17 17:35:34 gregory.p.smith set messages: + msg393820
2021-05-17 17:35:10 miss-islington set messages: + msg393819
2021-05-17 08:28:35 miss-islington set pull_requests: + pull_request24808
2021-05-17 08:28:29 miss-islington set nosy: + miss-islington; pull_requests: + pull_request24807
2021-05-17 08:08:42 gregory.p.smith set assignee: gregory.p.smith; type: crash -> resource usage; nosy: + gregory.p.smith; versions: + Python 3.10, Python 3.11, - Python 3.6, Python 3.7, Python 3.8
2021-05-15 14:44:52 malin set nosy: + malin
2021-05-14 20:42:10 igorvoltaic set resolution: remind; messages: + msg393692
2021-03-28 22:13:55 python-dev set keywords: + patch; nosy: + python-dev; pull_requests: + pull_request23805; stage: patch review
2021-03-28 21:21:23 igorvoltaic create