Title: TarFile expose copyfileobj bufsize to improve throughput
Type: performance Stage: patch review
Components: Library (Lib) Versions: Python 3.6
Status: closed Resolution: fixed
Assigned To: lukasz.langa Nosy List: asvetlov, fried, lars.gustaebel, lukasz.langa, python-dev
Priority: normal Keywords: patch

Created on 2016-06-03 18:55 by fried, last changed 2022-04-11 14:58 by admin. This issue is now closed.

File name Uploaded Description
fried, 2016-06-03 18:55 test file to generate two random tar files and test extraction time improvements
copybufsize.patch fried, 2016-06-03 18:56 patch to expose the copy buffer size
Messages (4)
msg267134 - (view) Author: Jason Fried (fried) * Date: 2016-06-03 18:55
The default copy buffer size of 16k, while good for memory usage, is not well suited to all cases. When we increased it to 4MB, we saw a pretty large improvement in tar file creation and extraction on Linux servers.

For a 1GB tar file containing 1024 random files, each 10MB in size:
Time Delta for TarFile: 146.3240258693695
Time Delta for FastTarFile 4MB copybufsize: 102.76440262794495
Time Diff: 43.55962324142456 0.2976928975444698
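
The two numbers on the "Time Diff" line are not labeled. A minimal sketch of how they appear to relate to the two timings above, assuming the timings are wall-clock seconds and the second number is the saving as a fraction of the TarFile time (the variable names are illustrative, not taken from the test file):

```python
# Timings copied from the report; the unit is assumed to be seconds.
baseline = 146.3240258693695   # TarFile with the default copy buffer
fast = 102.76440262794495      # FastTarFile with a 4MB copybufsize

# Absolute saving, and the saving as a fraction of the baseline time.
diff = baseline - fast
fraction = diff / baseline

print(f"saved {diff:.2f} ({fraction:.1%} of the baseline)")
```

Read this way, the larger buffer cut the measured time by a little under 30%.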
msg268234 - (view) Author: Łukasz Langa (lukasz.langa) * (Python committer) Date: 2016-06-11 17:27
New feature -> 3.6.
msg275546 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2016-09-10 02:50
New changeset 0bac85e355b5 by Łukasz Langa in branch 'default':
Issue #27199: TarFile expose copyfileobj bufsize to improve throughput
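
For illustration, a minimal sketch of how the exposed buffer size might be used on Python 3.6 or later. It assumes the `copybufsize` keyword of `tarfile.TarFile`, which `tarfile.open` forwards to the constructor. The helper name `roundtrip`, the file names, and the 1MB payload are made up for this example, and it checks correctness only, not speed.

```python
import os
import tarfile
import tempfile

COPY_BUFSIZE = 4 * 1024 * 1024  # 4MB, the value tried in the report


def roundtrip(payload, copybufsize=None):
    """Archive one file, extract it again, and return the extracted bytes."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "data.bin")
        archive = os.path.join(tmp, "data.tar")
        out_dir = os.path.join(tmp, "out")

        with open(src, "wb") as f:
            f.write(payload)

        # copybufsize=None keeps the library default; a larger value makes
        # TarFile copy file data in bigger chunks when adding members.
        with tarfile.open(archive, "w", copybufsize=copybufsize) as tar:
            tar.add(src, arcname="data.bin")

        # The same setting applies when member data is written out on extraction.
        with tarfile.open(archive, "r", copybufsize=copybufsize) as tar:
            tar.extractall(out_dir)

        with open(os.path.join(out_dir, "data.bin"), "rb") as f:
            return f.read()


payload = os.urandom(1024 * 1024)
assert roundtrip(payload, copybufsize=COPY_BUFSIZE) == payload
```

Whether a larger buffer actually helps depends on the storage and file sizes involved, so it is worth timing both settings on the target system.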
msg275547 - (view) Author: Łukasz Langa (lukasz.langa) * (Python committer) Date: 2016-09-10 02:51
Thanks for the patch!
