classification
Title: [RFE] tarfile: add an option to change the "blocking factor"
Type: enhancement Stage:
Components: Library (Lib) Versions: Python 3.8
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: cstratak, lars.gustaebel, nitishch, vstinner
Priority: normal Keywords:

Created on 2017-10-12 14:17 by cstratak, last changed 2018-06-20 13:04 by vstinner.

Files
File name Uploaded Description
tartest.py cstratak, 2017-10-12 14:17
Messages (4)
msg304241 - Author: Charalampos Stratakis (cstratak) Date: 2017-10-12 14:17
Creating an archive with the tarfile module while specifying a different blocking factor doesn't seem to work: only the default value is used. The issue is reproducible on all the active Python branches.

Attaching a script to reproduce it.

Original bug report: https://bugzilla.redhat.com/show_bug.cgi?id=1492157
msg304457 - Author: Nitish (nitishch) Date: 2017-10-16 07:49
Seems like bufsize is used only in streaming modes. Even in the documentation bufsize is described only in the context of streaming modes. 

Even the TarFile constructor doesn't take bufsize as an argument. Why is that?
msg304458 - Author: Nitish (nitishch) Date: 2017-10-16 07:50
Sorry. My bad. There *is* an argument 'copybufsize' in TarFile.
msg320075 - Author: STINNER Victor (vstinner) (Python committer) Date: 2018-06-20 13:04
Extract of tartest.py:

tarfile.open(name=_tmp_tar, mode='w', bufsize=EXPECTED_SIZE)

Hum, the bufsize argument only applies to the stream modes, written "...|compression" (ex: "w|gz"). Moreover, it shouldn't change the size of the tarfile; it only changes the size of the chunks passed to read() and write().

Extract of Stream.__write():

    while len(self.buf) > self.bufsize:
        self.fileobj.write(self.buf[:self.bufsize])
        self.buf = self.buf[self.bufsize:]
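
A minimal sketch of that behaviour, assuming current CPython semantics: the helper name archive_size and the member name hello.txt are made up for illustration. It writes the same one-byte member with different bufsize values, in both the plain "w" mode and the stream "w|" mode, and reports the resulting archive size.

```python
import io
import tarfile


def archive_size(mode, bufsize):
    """Return the size in bytes of an in-memory archive with one 1-byte member.

    archive_size is a hypothetical helper for this illustration,
    not part of the tarfile module.
    """
    raw = io.BytesIO()
    # bufsize is accepted in every mode, but only stream modes ("w|...") use it.
    with tarfile.open(fileobj=raw, mode=mode, bufsize=bufsize) as tar:
        info = tarfile.TarInfo(name="hello.txt")
        info.size = 1
        tar.addfile(info, io.BytesIO(b"x"))
    return len(raw.getvalue())


for mode in ("w", "w|"):
    for bufsize in (512, 4096):
        print(mode, bufsize, archive_size(mode, bufsize))
```

In every combination the archive should come out as one full record (tarfile.RECORDSIZE bytes), since bufsize only affects how the stream is chunked on its way to the underlying file object.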

> Trying to create an archive with the tarfile module, by specifying a different blocking factor, (...)

I didn't know that we can change the "blocking factor". In the UNIX tar command, I see the -b option:

       -b, --blocking-factor=BLOCKS
              Set record size to BLOCKSx512 bytes.


In Lib/tarfile.py, I see:

BLOCKSIZE = 512                 # length of processing blocks
RECORDSIZE = BLOCKSIZE * 20     # length of records

So values are hardcoded in Python.
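
To make the arithmetic concrete, here is a rough sketch of how a blocking factor would translate into the final archive size. The function padded_archive_size and its blocking_factor parameter are hypothetical; tarfile offers no such option today, which is the point of this request. It assumes the archive ends with two zero blocks and is then padded up to a whole number of records.

```python
import tarfile


def padded_archive_size(payload_size, blocking_factor=20):
    """Size of an archive once it is padded to whole records.

    payload_size: bytes of member headers plus member data, already a
    multiple of tarfile.BLOCKSIZE, not counting the end-of-archive marker.
    blocking_factor: hypothetical parameter, the number of 512-byte blocks
    per record (20 matches the hardcoded RECORDSIZE).
    """
    record_size = tarfile.BLOCKSIZE * blocking_factor
    # Two zero-filled blocks mark the end of the archive.
    total = payload_size + 2 * tarfile.BLOCKSIZE
    # Round up to the next multiple of record_size.
    return -(-total // record_size) * record_size


print(padded_archive_size(1024))      # default factor of 20
print(padded_archive_size(1024, 1))   # factor of 1: no extra record padding
```

With the default factor a small archive is padded to 10240 bytes, while a factor of 1 would leave it at 2048 bytes.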


So this issue is a feature request, not a bug report :-)
History
Date User Action Args
2018-06-20 13:04:55 vstinner set nosy: + vstinner
title: tarfile.open ignores custom bufsize value when creating a new archive -> [RFE] tarfile: add an option to change the "blocking factor"
messages: + msg320075
versions: + Python 3.8, - Python 2.7, Python 3.6, Python 3.7
type: enhancement
2017-10-16 07:50:32 nitishch set messages: + msg304458
2017-10-16 07:49:20 nitishch set messages: + msg304457
2017-10-16 07:29:33 nitishch set nosy: + nitishch
2017-10-12 14:17:19 cstratak create