Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Zipfile.extractall does not preserve file permissions #59999

Open
uruz mannequin opened this issue Aug 27, 2012 · 25 comments
Open

Zipfile.extractall does not preserve file permissions #59999

uruz mannequin opened this issue Aug 27, 2012 · 25 comments
Assignees
Labels
3.11 only security fixes stdlib Python modules in the Lib dir type-feature A feature request or enhancement

Comments

@uruz
Copy link
Mannequin

uruz mannequin commented Aug 27, 2012

BPO 15795
Nosy @ronaldoussoren, @abalkin, @gustaebel, @pitrou, @kmtracey, @merwok, @bitdancer, @PCManticore, @serhiy-storchaka, @mikofski, @nanjekyejoannah, @woodruffw, @Makishima, @selimb, @dignissimus
PRs
  • bpo-15795: Preserve permissions on UNIX (based/like) systems #17790
  • gh-59999: Add option to preserve permissions in ZipFile.extract #32289
  • Files
  • issue15795.patch
  • issue15795_updated.patch
  • issue15795_cleaned.patch
  • issue15795_test_and_doc_fixes.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/serhiy-storchaka'
    closed_at = None
    created_at = <Date 2012-08-27.22:36:03.648>
    labels = ['type-feature', 'library', '3.11']
    title = 'Zipfile.extractall does not preserve file permissions'
    updated_at = <Date 2022-04-04.14:55:21.787>
    user = 'https://bugs.python.org/uruz'

    bugs.python.org fields:

    activity = <Date 2022-04-04.14:55:21.787>
    actor = 'eric.araujo'
    assignee = 'serhiy.storchaka'
    closed = False
    closed_date = None
    closer = None
    components = ['Library (Lib)']
    creation = <Date 2012-08-27.22:36:03.648>
    creator = 'uruz'
    dependencies = []
    files = ['27086', '34852', '34873', '34893']
    hgrepos = []
    issue_num = 15795
    keywords = ['patch']
    message_count = 24.0
    messages = ['169233', '169242', '169629', '174356', '174359', '191425', '191459', '216228', '216267', '216298', '216304', '216376', '225526', '233445', '237139', '261145', '283918', '311328', '378712', '394819', '404877', '416661', '416672', '416674']
    nosy_count = 24.0
    nosy_names = ['ronaldoussoren', 'belopolsky', 'lars.gustaebel', 'pitrou', 'techtonik', 'swarren', 'kmtracey', 'eric.araujo', 'cbrannon', 'r.david.murray', 'Claudiu.Popa', 'serhiy.storchaka', 'uruz', 'Aaron.Train', 'desaintmartin', 'bwanamarko', 'Orborde', 'Glenn.Jones', '\xc3\x89tienne Dupuis', 'nanjekyejoannah', 'yossarian', 'kulakov-n', 'selimb', 'sam_ezeh']
    pr_nums = ['17790', '32289']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue15795'
    versions = ['Python 3.11']

    @uruz
    Copy link
    Mannequin Author

    uruz mannequin commented Aug 27, 2012

    Zipfile._extract_member does not preserve file permissions while extracting it. As may be seen at link[1], raw open() is used and no os.chmod() applied after that, therefore any permission data stored in zipfile is dropped and file is created with default permission depending on current user's umask.

    [1] http://hg.python.org/cpython/file/52159aa5d401/Lib/zipfile.py#l1251

    @uruz uruz mannequin added type-bug An unexpected behavior, bug, or error stdlib Python modules in the Lib dir labels Aug 27, 2012
    @bitdancer
    Copy link
    Member

    Does this have any relationship to bpo-3394? From the discussion on that issue it sounds like zipfile is doing things with external_attributes if it is set. But I don't know much about zipfile internals.

    @uruz
    Copy link
    Mannequin Author

    uruz mannequin commented Sep 1, 2012

    I'm attaching a patch, which solves the issue. Patch intoduces new argument "preserve_permissions" for extract and extractall methods. That argument may accept one of the three values: do not preserve permissions, preserve a safe subset of them or preserve all permissions. Three constants introduced for these options. Patch also contains test and docs. Tests pass for me on OS X 10.5, but I'm not sure if they'll pass on other operating systems.

    @AaronTrain
    Copy link
    Mannequin

    AaronTrain mannequin commented Oct 31, 2012

    Thanks for the patch. Is this going to be resolved soon?

    @bitdancer
    Copy link
    Member

    OK, so this is an enhancement to specifically allow preservation of "unsafe" permissions?

    Adding the nosy list from bpo-3394.

    @bitdancer bitdancer added type-feature A feature request or enhancement and removed type-bug An unexpected behavior, bug, or error labels Oct 31, 2012
    @techtonik
    Copy link
    Mannequin

    techtonik mannequin commented Jun 18, 2013

    There should be an easy way to restore file attributes.

    @ronaldoussoren
    Copy link
    Contributor

    On first glance the patch looks good. I haven't tested it with the current trunk though.

    @GlennJones
    Copy link
    Mannequin

    GlennJones mannequin commented Apr 14, 2014

    Here is an updated patch that applies cleanly to head. Tests pass against head of repo.

    @bitdancer
    Copy link
    Member

    Thanks.

    The patch contains a number of lines that are not wrapped to <80, which is one of our requirements. It would be great to get that fixed. (In the documentation, you can use \ to wrap the prototype line.)

    There is non-ascii in one place in the documentation...probably an em-dash, which is written in sphinx as '---'.

    Also, please remove the news entry from the patch, it will just make it harder to apply (and is in the wrong place anyway...we add entries at the top of the appropriate section, not the bottom).

    @GlennJones
    Copy link
    Mannequin

    GlennJones mannequin commented Apr 15, 2014

    Patch cleaned up based on previous comments.

    @PCManticore
    Copy link
    Mannequin

    PCManticore mannequin commented Apr 15, 2014

    Hello. I added a couple of comments to your latest patch.

    @GlennJones
    Copy link
    Mannequin

    GlennJones mannequin commented Apr 15, 2014

    Patch with docs and tests fixed

    @serhiy-storchaka
    Copy link
    Member

    The TarFile.extract() method has the set_attrs parameter which controls similar behavior but is less flexible. It would be good to unify zipfile and tarfile abilities. set_attrs also controls setting file owner and time. When we restore unsafe uid/gid/sticky bits, it would be worth to restore also file owner. And it would be not bad to restore file time.

    @serhiy-storchaka serhiy-storchaka self-assigned this Aug 19, 2014
    @arobert
    Copy link
    Mannequin

    arobert mannequin commented Jan 5, 2015

    I hope this can be finally gotten in for 3.5, even if it's not the perfect solution. I hit this issue and needed to call out to a subprocess as a work-around, but that's far less reliable.

    @uruz
    Copy link
    Mannequin Author

    uruz mannequin commented Mar 3, 2015

    I'm working on updating the patch to unify tarfile and zipfile interfaces and to restore owner/timestamp for zipfile

    @mikofski
    Copy link
    Mannequin

    mikofski mannequin commented Mar 3, 2016

    same problem in 2.7.5 on Oracle Linux 7.2

    @kmtracey
    Copy link
    Mannequin

    kmtracey mannequin commented Dec 24, 2016

    Note the zipfile being processed may have been created on a non-Unix system, and the external_attr value can't be usefully interpreted as permission bits when the value at _CD_CREATE_SYSTEM (https://hg.python.org/cpython/file/default/Lib/zipfile.py#l107) in the central directory entry is not Unix (3). Patches so far don't seem to guard against mistakenly assuming a foreign-system external_attr value can be interpreted as Unix permission bits.

    (At present github is delivering zipfiles of repos with a create system value of 0 and external_attr values of 0.)

    @tienneDupuis
    Copy link
    Mannequin

    tienneDupuis mannequin commented Jan 31, 2018

    @Makishima
    Copy link
    Mannequin

    Makishima mannequin commented Oct 16, 2020

    Is there any chance that the pull request will be accepted? I'm a bit tired of using workaround every time I need unzip something on linux.

    @merwok
    Copy link
    Member

    merwok commented May 31, 2021

    The pull request needs unit tests added to validate the changes.

    Note that the patch attached here was a new feature, adding constants and parameters to control the behaviour, but the PR simply checks and applies permissions bits stored in the entry. That seems correct and nice to me, and arguably a bugfix that should be backported, but more active core devs and/or zipfile experts should weigh in.

    @merwok merwok added the 3.10 only security fixes label May 31, 2021
    @nanjekyejoannah
    Copy link
    Member

    I left a review on the PR requesting for some tests, if it makes sense.

    @merwok
    Copy link
    Member

    merwok commented Apr 4, 2022

    The new PR uses new constants*, so could not be backported as is (see my previous message).

    (*side question: should the constants use an enum?)

    @merwok merwok added 3.11 only security fixes and removed 3.10 only security fixes labels Apr 4, 2022
    @dignissimus
    Copy link
    Mannequin

    dignissimus mannequin commented Apr 4, 2022

    I don't know what the best course of action would be but if preserving permissions needs to be back-ported, could the default permission preservation flag in 3.11+ be the one to preserve safe permissions and then make it so that the previous versions (<3.11, without the constants) always take this course of action? Maintaining the different options for preserving permissions while still allowing for this functionality to be back-ported.

    I don't have a strong opinion on backporting permission preservation but to me, it seems like it would be a change in behaviour instead of a bug fix. The current default in the PR is to not preserve any permissions but if necessary, I'll change it to whatever is agreed upon.

    I'll move the constants into an enum, but right now I'm not sure how I'd name the class.

    As an aside, while writing this comment I realised that the reason tests aren't passing on my end might very well be due to the fact I do CPython work on an NTFS partition instead of on my main EXT4 partition.

    @merwok
    Copy link
    Member

    merwok commented Apr 4, 2022

    On second thought, maybe no fix should be backported.
    Changing behaviour silently might break things, changing with a new option puts the change in a grey fix/enhancement area, so maybe better to let current versions as they are, with the bug/limitation noted in the documentation, and give control in 3.11 only.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @dignissimus
    Copy link
    Contributor

    dignissimus commented Apr 12, 2022

    After scratching my head for a while, I was able to get tests to pass. There's this small quirk where it's not possible to create a file with mode 0 using ZipFile.writestr if you don't apply any other masks. It's due to these lines.

    cpython/Lib/zipfile.py

    Lines 1628 to 1629 in ac6c3de

    if not zinfo.external_attr:
    zinfo.external_attr = 0o600 << 16 # permissions: ?rw-------

    This happens even though it's possible to create such a file (although the file will have the 0o100000 regular file mask).

    Python 3.11.0a6+ (heads/bpo-15795:8a056b4fc2, Apr 13 2022, 00:11:18) [GCC 11.1.0] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import pathlib
    >>> import os
    >>> pathlib.Path("file").touch(mode=0)
    >>> os.stat("file").st_mode >> 16
    0
    >>> 

    Since I was setting external_attr, I should have included the mask for the file type. So I don't think this behaviour is a problem.

    AstrobioMike added a commit to AstrobioMike/genelab-utils that referenced this issue May 24, 2023
    - to use system zip/unzip because python module zipfile cannont extract and retain permissions (python/cpython#59999) and this was messing with things expected to be executable
    stanislavlevin added a commit to stanislavlevin/pyproject_installer that referenced this issue Apr 9, 2024
    Today's Python packaging specification is not clear about making
    binaries executable on installation from data/scripts/ directory:
    https://packaging.python.org/en/latest/specifications/binary-distribution-format/
    
    > In wheel, scripts are packaged in
      {distribution}-{version}.data/scripts/. If the first line of a file in
      scripts/ starts with exactly b'#!python', rewrite to point to the
      correct interpreter. Unix installers may need to add the +x bit to these
      files if the archive was created on Windows.
    
    ZipFile in turn, doesn't preserve file permissions in Python:
    python/cpython#59999
    
    This leads to situation when freshly installed binaries can't be
    executed.
    
    Fixes: #66
    Signed-off-by: Stanislav Levin <slev@altlinux.org>
    stanislavlevin added a commit to stanislavlevin/pyproject_installer that referenced this issue Apr 9, 2024
    Today's Python packaging specification is not clear about making
    binaries executable on installation from data/scripts/ directory:
    https://packaging.python.org/en/latest/specifications/binary-distribution-format/
    
    > In wheel, scripts are packaged in
      {distribution}-{version}.data/scripts/. If the first line of a file in
      scripts/ starts with exactly b'#!python', rewrite to point to the
      correct interpreter. Unix installers may need to add the +x bit to these
      files if the archive was created on Windows.
    
    ZipFile in turn, doesn't preserve file permissions in Python:
    python/cpython#59999
    
    This leads to situation when freshly installed binaries can't be
    executed.
    
    Fixes: #66
    Signed-off-by: Stanislav Levin <slev@altlinux.org>
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.11 only security fixes stdlib Python modules in the Lib dir type-feature A feature request or enhancement
    Projects
    Status: No status
    Development

    No branches or pull requests

    6 participants