
classification
Title: small speed-up for tarfile.py when unzipping tarballs
Type: performance Stage: resolved
Components: Library (Lib) Versions: Python 3.3
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: rosslagerwall Nosy List: jpeel, lars.gustaebel, poq, python-dev, rosslagerwall, serhiy.storchaka
Priority: low Keywords: patch

Created on 2011-09-23 04:48 by jpeel, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
cpython_tarfile.diff jpeel, 2011-09-23 04:48 review
cpython_tarfile2.diff jpeel, 2011-09-23 18:32 review
Messages (6)
msg144436 - (view) Author: Justin Peel (jpeel) Date: 2011-09-23 04:48
The attached small diff speeds up extracting a gzipped tarball on my machine with Python 3.2 by 3-5%. The gain will probably be larger on machines with faster hard drives (mine is 5400 rpm).

Basically, the changes speed up the checksum calculation by taking one slice rather than four and by calling struct.unpack twice rather than four times. Fewer unpack calls suffice because the 'x' format character skips a byte.
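
For readers without the attachment, the idea can be sketched roughly as follows. This is an illustration, not the attached diff: the function names are hypothetical, and the layout assumed is the usual 512-byte tar header in which bytes 148-155 hold the checksum field and are counted as eight spaces (hence the constant 256).

```python
import struct

def chksums_four_slices(buf):
    """Hypothetical 'before' version: four slices, four unpack calls."""
    unsigned = 256 + sum(struct.unpack("148B", buf[:148])
                         + struct.unpack("356B", buf[156:512]))
    signed = 256 + sum(struct.unpack("148b", buf[:148])
                       + struct.unpack("356b", buf[156:512]))
    return unsigned, signed

def chksums_one_slice(buf):
    """Hypothetical 'after' version: one slice, two unpack calls."""
    block = buf[:512]  # the only slice taken
    # "8x" skips the 8-byte checksum field, so a single format string
    # covers both halves of the header.
    unsigned = 256 + sum(struct.unpack("148B8x356B", block))
    signed = 256 + sum(struct.unpack("148b8x356b", block))
    return unsigned, signed

# Synthetic header-sized buffer (not a real tar header).
header = bytes(range(256)) * 2
```

Both functions should return the same pair for any buffer of at least 512 bytes.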
msg144442 - (view) Author: (poq) Date: 2011-09-23 13:04
I don't think you even need the slice, if you use unpack_from.
msg144466 - (view) Author: Justin Peel (jpeel) Date: 2011-09-23 18:32
poq,

You're quite right. I've added that change too. As a side effect of these changes, four unnecessary tuples are no longer created on each call to this function.
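
A minimal sketch of the unpack_from variant, assuming the same 512-byte header layout with the checksum field at bytes 148-155. It is meant to illustrate the suggestion and may differ in detail from cpython_tarfile2.diff.

```python
import struct

def calc_chksums(buf):
    """Return (unsigned, signed) header checksums without slicing buf."""
    # unpack_from reads from the start of buf and ignores any trailing
    # bytes, so no slice is needed; "8x" skips the checksum field, which
    # is counted as eight spaces (8 * 0x20 == 256).
    unsigned = 256 + sum(struct.unpack_from("148B8x356B", buf))
    signed = 256 + sum(struct.unpack_from("148b8x356b", buf))
    return unsigned, signed
```

Each call now builds one tuple per checksum directly from the buffer, with no intermediate slices or tuple concatenation.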
msg160931 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-05-16 21:01
Justin, the patch would probably attract more interest if you provided a microbenchmark.
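
One possible shape for such a microbenchmark, using timeit on a synthetic 512-byte block. The function names, the test buffer, and the iteration count are all assumptions made for illustration, and no timings are claimed here.

```python
import struct
import timeit

def old_chksums(buf):
    # Slicing version: four slices and four unpack calls.
    unsigned = 256 + sum(struct.unpack("148B", buf[:148])
                         + struct.unpack("356B", buf[156:512]))
    signed = 256 + sum(struct.unpack("148b", buf[:148])
                       + struct.unpack("356b", buf[156:512]))
    return unsigned, signed

def new_chksums(buf):
    # unpack_from version: no slices, two unpack calls.
    unsigned = 256 + sum(struct.unpack_from("148B8x356B", buf))
    signed = 256 + sum(struct.unpack_from("148b8x356b", buf))
    return unsigned, signed

def bench(func, buf, number=20000):
    """Return the total seconds taken by `number` calls of func(buf)."""
    return timeit.timeit(lambda: func(buf), number=number)

if __name__ == "__main__":
    buf = bytes(range(256)) * 2  # synthetic data, not a real tar header
    for func in (old_chksums, new_chksums):
        print(func.__name__, bench(func, buf))
```

Results will vary with the interpreter version and the machine, so the two timings are only meaningful relative to each other on the same setup.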
msg160992 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2012-05-17 17:53
New changeset c62fa6892424 by Ross Lagerwall in branch 'default':
Issue #13031: Small speed-up for tarfile when unzipping tarfiles.
http://hg.python.org/cpython/rev/c62fa6892424
msg160994 - (view) Author: Ross Lagerwall (rosslagerwall) (Python committer) Date: 2012-05-17 18:24
Nice work, thanks!
History
Date | User | Action | Args
2022-04-11 14:57:21 | admin | set | github: 57240
2012-05-17 18:24:11 | rosslagerwall | set | status: open -> closed; messages: + msg160994; assignee: lars.gustaebel -> rosslagerwall; resolution: fixed; stage: resolved
2012-05-17 17:53:35 | python-dev | set | nosy: + python-dev; messages: + msg160992
2012-05-16 21:01:27 | serhiy.storchaka | set | messages: + msg160931
2012-05-16 20:33:48 | rosslagerwall | set | nosy: + rosslagerwall
2012-04-07 16:11:56 | serhiy.storchaka | set | nosy: + serhiy.storchaka
2011-09-23 18:32:41 | jpeel | set | files: + cpython_tarfile2.diff; messages: + msg144466
2011-09-23 16:33:35 | eric.araujo | set | title: [PATCH] small speed-up for tarfile.py when unzipping tarballs -> small speed-up for tarfile.py when unzipping tarballs
2011-09-23 13:04:54 | poq | set | nosy: + poq; messages: + msg144442
2011-09-23 06:46:23 | lars.gustaebel | set | priority: normal -> low; assignee: lars.gustaebel; nosy: + lars.gustaebel; versions: + Python 3.3, - Python 2.7, Python 3.2
2011-09-23 04:50:04 | jpeel | set | type: performance
2011-09-23 04:48:55 | jpeel | create |