Title: [RFE] tarfile: add an option to change the "blocking factor"
Type: enhancement Stage:
Components: Library (Lib) Versions: Python 3.8
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: cstratak, lars.gustaebel, nitishch, vstinner
Priority: normal Keywords:

Created on 2017-10-12 14:17 by cstratak, last changed 2018-06-20 13:04 by vstinner.

File name Uploaded Description cstratak, 2017-10-12 14:17
Messages (4)
msg304241 - Author: Charalampos Stratakis (cstratak) Date: 2017-10-12 14:17
Creating an archive with the tarfile module while specifying a different blocking factor doesn't seem to work: only the default value is used. The issue is reproducible on all active Python branches.

Attaching a script to reproduce it.

Original bug report:
msg304457 - Author: Nitish (nitishch) Date: 2017-10-16 07:49
Seems like bufsize is used only in streaming modes. Even in the documentation bufsize is described only in the context of streaming modes. 

Even the TarFile constructor doesn't take bufsize as an argument. Why is that?
msg304458 - Author: Nitish (nitishch) Date: 2017-10-16 07:50
Sorry. My bad. There *is* an argument 'copybufsize' in TarFile.
msg320075 - Author: STINNER Victor (vstinner) (Python committer) Date: 2018-06-20 13:04
Extract of, mode='w', bufsize=EXPECTED_SIZE)

Hmm, the bufsize argument only applies to the "...|compression" stream modes (e.g. "w|gz"). Moreover, it shouldn't change the size of the tar file: it only changes the size of the chunks passed to read() and write().
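
A quick way to check this claim is to build the same small archive with several bufsize values and compare the resulting sizes. This is only a sketch: the helper name archive_size, the member name "data.bin" and the 1000-byte payload are made up for illustration, and the archive is written to an in-memory buffer rather than a real file.

```python
import io
import tarfile


def archive_size(mode, bufsize):
    """Build a one-member archive in memory and return its total size in bytes."""
    out = io.BytesIO()
    payload = b"x" * 1000  # arbitrary illustrative payload
    with tarfile.open(fileobj=out, mode=mode, bufsize=bufsize) as tar:
        info = tarfile.TarInfo(name="data.bin")  # hypothetical member name
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return len(out.getvalue())


# Uncompressed stream mode: bufsize changes how the data is chunked on write,
# but the archive is still padded to a whole number of records.
sizes = {bufsize: archive_size("w|", bufsize) for bufsize in (512, 4096, 10240)}
print(sizes)

# Plain "w" mode accepts the argument but does not use it.
print(archive_size("w", 512))
```

If the claim holds, every size printed is the same multiple of tarfile.RECORDSIZE, whatever bufsize was passed.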

Extract of _Stream.__write():

    while len(self.buf) > self.bufsize:
        self.fileobj.write(self.buf[:self.bufsize])
        self.buf = self.buf[self.bufsize:]
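
To make the effect of that loop concrete, here is a toy model of a writer that buffers data and flushes it in bufsize-sized chunks. ChunkedWriter is a made-up class for illustration, not tarfile's own implementation; it only mimics the buffering pattern and records the size of each underlying write.

```python
import io


class ChunkedWriter:
    """Toy stand-in for a buffered stream writer (hypothetical, not tarfile's class)."""

    def __init__(self, fileobj, bufsize):
        self.fileobj = fileobj
        self.bufsize = bufsize
        self.buf = b""
        self.write_sizes = []  # size of each write issued to fileobj

    def write(self, data):
        self.buf += data
        # Flush full chunks while more than bufsize bytes are pending.
        while len(self.buf) > self.bufsize:
            chunk = self.buf[:self.bufsize]
            self.fileobj.write(chunk)
            self.write_sizes.append(len(chunk))
            self.buf = self.buf[self.bufsize:]

    def close(self):
        # Whatever is left goes out in one final, possibly shorter, write.
        if self.buf:
            self.fileobj.write(self.buf)
            self.write_sizes.append(len(self.buf))
            self.buf = b""


out = io.BytesIO()
writer = ChunkedWriter(out, bufsize=4)
writer.write(b"abcdefghij")
writer.close()
print(writer.write_sizes)  # sizes of the individual writes
print(out.getvalue())      # the bytes that reached the file object
```

The bytes that reach the file object are the same for any bufsize; only the number and size of the individual writes differ.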

> Trying to create an archive with the tarfile module, by specifying a different blocking factor, (...)

I didn't know that we can change the "blocking factor". In the UNIX tar command, I see the -b option:

       -b, --blocking-factor=BLOCKS
              Set record size to BLOCKSx512 bytes.
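
The arithmetic behind that option can be written down as a short sketch. The function names record_size and padded_size are made up for illustration; the only fixed input is the 512-byte block size stated in the man page excerpt.

```python
BLOCKSIZE = 512  # bytes per tar block


def record_size(blocking_factor):
    """Record size in bytes for a given blocking factor (blocks per record)."""
    if blocking_factor < 1:
        raise ValueError("blocking factor must be at least 1")
    return blocking_factor * BLOCKSIZE


def padded_size(nbytes, blocking_factor):
    """Round nbytes up to a whole number of records."""
    rec = record_size(blocking_factor)
    return -(-nbytes // rec) * rec  # ceiling division, then scale back up


print(record_size(20))        # record size for a blocking factor of 20
print(padded_size(2560, 20))  # a 2560-byte archive padded with factor 20
print(padded_size(2560, 4))   # the same archive padded with factor 4
```

So a small archive can shrink considerably when a lower blocking factor is allowed, which is what this request is about.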

In Lib/, I see:

BLOCKSIZE = 512                 # length of processing blocks
RECORDSIZE = BLOCKSIZE * 20     # length of records

So values are hardcoded in Python.
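
Until such an option exists, one possible workaround is to re-pad an already written, uncompressed archive to a different record size. The following is only a sketch under several assumptions: reblock is a hypothetical helper, not part of tarfile; it handles archives that fit in memory and contain only regular files and directories (no sparse members); and the demo archive with its member "data.bin" is made up.

```python
import io
import tarfile


def reblock(archive_bytes, blocking_factor):
    """Return the same uncompressed archive padded to blocking_factor * 512 bytes.

    Hypothetical helper for illustration; not part of the tarfile module.
    """
    record_size = tarfile.BLOCKSIZE * blocking_factor
    end = 0
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:") as tar:
        for member in tar:
            end = member.offset_data
            if member.isreg():
                # File data occupies a whole number of 512-byte blocks.
                end += -(-member.size // tarfile.BLOCKSIZE) * tarfile.BLOCKSIZE
    body = archive_bytes[:end]
    body += bytes(2 * tarfile.BLOCKSIZE)     # end-of-archive marker: two zero blocks
    body += bytes(-len(body) % record_size)  # pad to a whole number of records
    return body


# Demo: a one-member archive written with the default record size.
buf = io.BytesIO()
payload = b"x" * 1000
with tarfile.open(fileobj=buf, mode="w") as tar:
    info = tarfile.TarInfo(name="data.bin")  # hypothetical member name
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))
original = buf.getvalue()

small = reblock(original, 1)
print(len(original), len(small))
```

The re-padded archive should still list the same members when opened with tarfile, since only the trailing zero padding changes.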

So this issue is a feature request, not a bug report :-)
Date User Action Args
2018-06-20 13:04:55 vstinner set nosy: + vstinner
title: ignores custom bufsize value when creating a new archive -> [RFE] tarfile: add an option to change the "blocking factor"
messages: + msg320075
versions: + Python 3.8, - Python 2.7, Python 3.6, Python 3.7
type: enhancement
2017-10-16 07:50:32 nitishch set messages: + msg304458
2017-10-16 07:49:20 nitishch set messages: + msg304457
2017-10-16 07:29:33 nitishch set nosy: + nitishch
2017-10-12 14:17:19 cstratak create