classification
Title: allow setting uid and gid when creating tar files
Type: enhancement Stage:
Components: Library (Lib) Versions: Python 3.2, Python 2.7
process
Status: closed Resolution: accepted
Dependencies: Superseder:
Assigned To: lars.gustaebel Nosy List: lars.gustaebel, tarek
Priority: normal Keywords: patch

Created on 2009-09-07 07:47 by tarek, last changed 2009-09-12 10:52 by lars.gustaebel. This issue is now closed.

Files
File name Uploaded Description
issue6856.diff lars.gustaebel, 2009-09-07 16:25
Messages (6)
msg92348 - Author: Tarek Ziadé (tarek) * (Python committer) Date: 2009-09-07 07:47
I am proposing this feature for an issue we have in Distutils: being
able to set the uid/gid of files added in a tar archive using tarfile.

Here's what I am proposing:

- adding two methods to TarInfo: set_uid and set_gid, that are able to
take a user and group name *or* a uid and gid number

- adding in TarFile a new filter option to add() called include. If
given, it's a callable that receives the tarinfo object right before
it's added, so its uid/gid can be tweaked. This callable must return the
object. If it returns None, the object is not added to the tar file.
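To make the callback contract concrete, here is a minimal sketch. It assumes the `filter` keyword that tarfile's add() accepts in released Python versions, rather than the `include` name proposed above; the `reset_owner` and `build_demo_archive` names and the scratch file layout are made up for illustration.

```python
import os
import tarfile
import tempfile


def reset_owner(tarinfo):
    """Callback in the style proposed above: tweak the member, or drop it."""
    if tarinfo.name.endswith(".pyc"):
        return None              # returning None leaves the member out
    tarinfo.uid = tarinfo.gid = 0
    tarinfo.uname = tarinfo.gname = "root"
    return tarinfo               # returning the object keeps the member


def build_demo_archive():
    """Archive a scratch tree and return (name, uid, gid) for each member."""
    with tempfile.TemporaryDirectory() as workdir:
        srcdir = os.path.join(workdir, "pkg")
        os.mkdir(srcdir)
        for name in ("module.py", "module.pyc"):
            with open(os.path.join(srcdir, name), "w") as fobj:
                fobj.write("# demo\n")
        archive = os.path.join(workdir, "pkg.tar")
        # The callback runs once per member, right before it is written.
        with tarfile.open(archive, "w") as tar:
            tar.add(srcdir, arcname="pkg", filter=reset_owner)
        with tarfile.open(archive) as tar:
            return [(m.name, m.uid, m.gid) for m in tar.getmembers()]
```

The callback sees the directory entry as well as the files inside it, so the ownership reset applies to every member that is kept.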
msg92361 - Author: Lars Gustäbel (lars.gustaebel) * (Python committer) Date: 2009-09-07 11:52
TarInfo does not need set_uid() or set_gid() methods, both can be set
using the uid and gid attributes.
If the list of files to add to the archive is known you can do this:

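import tarfile

tar = tarfile.open("foo.tar.gz", "w:gz")
for filename in filenames:
  tarinfo = tar.gettarinfo(filename)
  # Reset the ownership before the member is written.
  tarinfo.uid = 0
  tarinfo.gid = 0
  if tarinfo.isreg():
    # Regular files carry data: open in binary mode and close afterwards.
    with open(filename, "rb") as fobj:
      tar.addfile(tarinfo, fobj)
  else:
    tar.addfile(tarinfo)
tar.close()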

I am not against adding a new option. Although, it's too bad that I
added the exclude option in 2.6 not long ago, which does something very
similar and would again be replaced by this more general "include"
option. BTW, I think calling the option "filter" would be more suitable
for what it's doing.
msg92363 - Author: Tarek Ziadé (tarek) * (Python committer) Date: 2009-09-07 12:25
> TarInfo does not need set_uid() or set_gid() methods, 
> both can be set using the uid and gid attributes.

I was thinking of the set_* methods as a way to accept
"root" (str) instead of 0 (int), for example, like
what the tar command seems to allow with --uid and --gid.


> I am not against adding a new option. Although, it's too bad that 
> I added the exclude option in 2.6 not long ago, which does 
> something very similar and would again be replaced by this 
> more general "include" option. BTW, I think calling 
> the option "filter" would be more suitable for what it's doing.

Maybe we could add the "filter" option for 2.7/3.2 together with the
exclude option? And add a deprecation warning for "exclude" when it's
used, since it would then become *one* use case for "filter".

We could also add an exclude callable in the module, as an example
usage of the filter option, exactly like I did for shutil.copytree
(look for ignore_patterns).
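For illustration, such a helper could look roughly like the sketch below. `exclude_patterns` and `archive_without_bytecode` are hypothetical names, not part of tarfile; the factory is modelled on shutil.ignore_patterns and returns a callable for the `filter` keyword of add().

```python
import fnmatch
import posixpath
import tarfile


def exclude_patterns(*patterns):
    """Hypothetical helper, modelled on shutil.ignore_patterns.

    Returns a callable suitable for TarFile.add(..., filter=...).
    """
    def _filter(tarinfo):
        # Member names inside a tar archive always use "/" separators.
        basename = posixpath.basename(tarinfo.name)
        for pattern in patterns:
            if fnmatch.fnmatch(basename, pattern):
                return None      # drop members whose name matches
        return tarinfo           # keep everything else unchanged
    return _filter


def archive_without_bytecode(archive_path, source_dir):
    """Usage example: the old exclude behaviour as one use of filter."""
    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(source_dir, filter=exclude_patterns("*.pyc", "*~"))
```

Seen this way, excluding files is just a filter that returns None for some members and passes the rest through untouched.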
msg92373 - Author: Lars Gustäbel (lars.gustaebel) * (Python committer) Date: 2009-09-07 16:25
I do not quite see the benefit from the set_* methods. Although the
attribute access I proposed may be slightly more complicated (because
you might need the pwd and grp modules), it offers the most freedom.
Let's take the set_uid() method as an example: its purpose would be to
set both the uid and uname fields in the tar header. That is fine as long
as its argument is a uid or username that actually exists. If set_uid()
gets a username that does not exist, what are we going to do? Only set
the uname field and leave the uid field alone, or raise an exception? If
the user wants to set a nonexistent username on purpose, they cannot use
the set_uid() method. And what are we going to do on Windows? Is there
anything comparable to pwd/grp we could use?
I expect the common use case for both of these methods will be to *reset*
the owner information to a default, and this is done by setting uname to
"root" and uid to 0.
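To show where those questions bite, here is a rough sketch of the attribute-based approach with pwd. The helper names are hypothetical. The fallback policy is only one possible answer, not something settled in this discussion: an unknown name sets uname and leaves uid alone, and a platform without pwd skips the lookup.

```python
import tarfile

try:
    import pwd               # POSIX only
except ImportError:          # e.g. Windows: no user database to consult
    pwd = None


def resolve_owner(owner):
    """Return (uid, uname); either is None when it cannot be determined."""
    if isinstance(owner, int):
        uname = None
        if pwd is not None:
            try:
                uname = pwd.getpwuid(owner).pw_name
            except KeyError:
                pass         # uid has no passwd entry: uname stays unknown
        return owner, uname
    uid = None
    if pwd is not None:
        try:
            uid = pwd.getpwnam(owner).pw_uid
        except KeyError:
            pass             # name does not exist: uid stays unknown
    return uid, owner


def set_owner(tarinfo, owner):
    """Assumed policy: only overwrite the fields that could be resolved."""
    uid, uname = resolve_owner(owner)
    if uid is not None:
        tarinfo.uid = uid
    if uname is not None:
        tarinfo.uname = uname
    return tarinfo


def new_member(name, owner):
    """Build an in-memory TarInfo (no file on disk) owned by `owner`."""
    return set_owner(tarfile.TarInfo(name), owner)
```

A group counterpart would follow the same pattern with grp.getgrnam() and grp.getgrgid().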

The filter argument is actually a nice idea. I have attached a patch
that outlines my idea of how it is supposed to be. Comments welcome.
msg92375 - Author: Tarek Ziadé (tarek) * (Python committer) Date: 2009-09-07 16:43
> I do not quite see the benefit from the set_* methods. 
> .. some explanations of the underlying complexity...

The only benefit I can see for the set_* methods is that they hide
the underlying complexity you've explained.

In Distutils, I'd like to provide uid and gid options
for the sdist command, so the user can set "root", for instance,
and have the library take care of creating a tarfile with everything
set to the right value (and ignoring the flags on Windows, etc.).

So it seems that working per TarInfo is the wrong approach,
and a global function to create an archive would be better.


> The filter argument is actually a nice idea. I have attached a 
> patch that outlines my idea of how it is supposed to be. 
> Comments welcome.

The patch looks nice to me.

Small typo in the doc:

> How create

should be "How to create".
msg92539 - Author: Lars Gustäbel (lars.gustaebel) * (Python committer) Date: 2009-09-12 10:52
I applied the patch with some more small fixes to the trunk (r74750) and
the py3k branch (r74751).
History
Date User Action Args
2009-09-12 10:52:29 lars.gustaebel set status: open -> closed
    resolution: accepted
    messages: + msg92539
2009-09-07 16:43:10 tarek set messages: + msg92375
2009-09-07 16:25:43 lars.gustaebel set files: + issue6856.diff
    keywords: + patch
    messages: + msg92373
2009-09-07 12:25:23 tarek set messages: + msg92363
2009-09-07 11:52:24 lars.gustaebel set messages: + msg92361
2009-09-07 08:20:30 loewis set assignee: lars.gustaebel
2009-09-07 07:48:21 tarek set title: allow settong uid and gid when creating tar files -> allow setting uid and gid when creating tar files
2009-09-07 07:48:07 tarek link issue6516 dependencies
2009-09-07 07:47:29 tarek create