
classification
Title: reduce tarfile memory footprint
Type: resource usage
Stage:
Components: Library (Lib)
Versions: Python 3.0
process
Status: closed
Resolution: accepted
Dependencies:
Superseder:
Assigned To: lars.gustaebel
Nosy List: lars.gustaebel
Priority: normal
Keywords: patch

Created on 2008-02-10 11:44 by lars.gustaebel, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name: tarfile-memory.diff
Uploaded: lars.gustaebel, 2008-02-10 11:43
Messages (2)
msg62248 - Author: Lars Gustäbel (lars.gustaebel) (Python committer) Date: 2008-02-10 11:43
tarfile.py uses far more memory than it needs to. The memory consumption
does not depend on the size of an archive but on the number of members in it.
The attached patch reduces memory usage by about 60% and consists of two
independent strategies (each giving about a 30% reduction):

1. Add __slots__ to the TarInfo class. This was proposed in issue1540385
a while ago but rejected due to backward-compatibility issues.

2. Remove the undocumented buf attribute of the TarInfo class. buf
stores the original 512-byte header block read from the archive. This
was introduced in r45954 and is rather useless except for GNUTYPE_SPARSE
processing. This might as well be a candidate for backporting to 2.6.
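
The two strategies above can be illustrated with a minimal sketch. The classes PlainInfo and SlottedInfo and the helper footprint() are hypothetical stand-ins, not code from tarfile.py or from the attached patch, and sys.getsizeof gives only a rough per-object estimate on CPython:

```python
import sys


class PlainInfo:
    """Hypothetical per-member record: no __slots__, keeps a raw header in buf."""

    def __init__(self, name, size, buf=None):
        self.name = name
        self.size = size
        self.buf = buf  # 512-byte header block kept alive for every member


class SlottedInfo:
    """Same record with __slots__ and without the buf attribute."""

    __slots__ = ("name", "size")

    def __init__(self, name, size):
        self.name = name
        self.size = size


def footprint(obj):
    """Rough size in bytes: the object, its __dict__ (if any), and its buf (if any)."""
    total = sys.getsizeof(obj)
    if hasattr(obj, "__dict__"):
        total += sys.getsizeof(obj.__dict__)
    buf = getattr(obj, "buf", None)
    if buf is not None:
        total += sys.getsizeof(buf)
    return total


plain = PlainInfo("a.txt", 10, buf=bytes(512))
slotted = SlottedInfo("a.txt", 10)

print("plain:  ", footprint(plain), "bytes")
print("slotted:", footprint(slotted), "bytes")

# A slotted instance has no __dict__, so it rejects attributes that are
# not listed in __slots__. Code that attaches extra attributes to
# instances would break, which is the kind of compatibility concern
# that adding __slots__ can raise.
try:
    slotted.extra = 1
    rejected = False
except AttributeError:
    rejected = True
print("undeclared attribute rejected:", rejected)
```

The saving is per instance, so it scales with the number of members in an archive rather than with the archive's size.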
msg65462 - Author: Lars Gustäbel (lars.gustaebel) (Python committer) Date: 2008-04-14 10:08
Checked into the py3k branch as r62337.
History
Date                  User            Action  Args
2022-04-11 14:56:30   admin           set     github: 46334
2008-04-14 10:08:23   lars.gustaebel  set     status: open -> closed; resolution: accepted; messages: + msg65462
2008-02-10 11:44:02   lars.gustaebel  create