TarFile.getmember on directory requires trailing slash iff over 100 chars #66186

Closed
moloney mannequin opened this issue Jul 16, 2014 · 15 comments
Labels
3.9 (only security fixes), 3.10 (only security fixes), 3.11 (only security fixes), stdlib (Python modules in the Lib dir), type-bug (An unexpected behavior, bug, or error)

Comments

@moloney
Mannequin

moloney mannequin commented Jul 16, 2014

BPO 21987
Nosy @gustaebel, @vstinner, @bitdancer, @serhiy-storchaka, @miss-islington, @afonari, @akulakov
PRs
  • bpo-21987: fix TarFile.getmember getting a dir with a trailing slash #30283
  • [3.10] bpo-21987: Fix TarFile.getmember getting a dir with a trailing slash (GH-30283) #30737
  • [3.9] bpo-21987: Fix TarFile.getmember getting a dir with a trailing slash (GH-30283) #30738
Files
  • tarfile_issue.py
  • issue21987.diff
  • issue21987_py3.5_with_test.patch
  • issue21987_py2.7_with_test.patch

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = None
    closed_at = <Date 2022-01-23.17:54:59.291>
    created_at = <Date 2014-07-16.03:40:59.246>
    labels = ['type-bug', 'library', '3.9', '3.10', '3.11']
    title = 'TarFile.getmember on directory requires trailing slash iff over 100 chars'
    updated_at = <Date 2022-01-23.17:54:59.290>
    user = 'https://bugs.python.org/moloney'

    bugs.python.org fields:

    activity = <Date 2022-01-23.17:54:59.290>
    actor = 'serhiy.storchaka'
    assignee = 'none'
    closed = True
    closed_date = <Date 2022-01-23.17:54:59.291>
    closer = 'serhiy.storchaka'
    components = ['Library (Lib)']
    creation = <Date 2014-07-16.03:40:59.246>
    creator = 'moloney'
    dependencies = []
    files = ['35976', '36045', '36096', '36202']
    hgrepos = []
    issue_num = 21987
    keywords = ['patch']
    message_count = 15.0
    messages = ['223167', '223174', '223264', '223432', '223732', '224014', '224542', '348618', '376370', '376373', '409261', '409265', '411089', '411092', '411391']
    nosy_count = 10.0
    nosy_names = ['lars.gustaebel', 'vstinner', 'r.david.murray', 'serhiy.storchaka', 'puppet', 'moloney', 'zigg', 'miss-islington', 'af', 'andrei.avk']
    pr_nums = ['30283', '30737', '30738']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue21987'
    versions = ['Python 3.9', 'Python 3.10', 'Python 3.11']

    @moloney
    Mannequin Author

    moloney mannequin commented Jul 16, 2014

    If a directory path is under 100 characters, you have to omit the trailing slash from the name passed to 'getmember'. If it is over 100 characters, you have to include the trailing slash.

    As a workaround I can use the private '_getmember' with 'normalize=True'.

    I tested on 2.7.2 and searched the release notes looking for a related fix since then. I couldn't find anything there, or here in the issue tracker.
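
The behaviour described above can be probed with a short script. This is a minimal sketch and not the attached tarfile_issue.py; the helper names (`build_archive`, `can_lookup`, `probe`) and the directory names are hypothetical. It only reports which lookups succeed on the running interpreter, since the outcome depends on the Python version.

```python
import io
import tarfile

SHORT_DIR = "short_dir"   # well under the 100-character header field
LONG_DIR = "d" * 120      # over 100 characters, needs a long-name header


def build_archive(dirnames):
    """Write an in-memory GNU-format archive holding only directory entries."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.GNU_FORMAT) as tf:
        for name in dirnames:
            info = tarfile.TarInfo(name)
            info.type = tarfile.DIRTYPE
            info.mode = 0o755
            tf.addfile(info)
    buf.seek(0)
    return buf


def can_lookup(tf, name):
    """Return True if TarFile.getmember() finds the given name."""
    try:
        tf.getmember(name)
        return True
    except KeyError:
        return False


def probe():
    """Try each directory name with and without a trailing slash."""
    results = {}
    with tarfile.open(fileobj=build_archive([SHORT_DIR, LONG_DIR]), mode="r") as tf:
        for name in (SHORT_DIR, LONG_DIR):
            for candidate in (name, name + "/"):
                results[candidate] = can_lookup(tf, candidate)
    return results


if __name__ == "__main__":
    for candidate, found in probe().items():
        print("%-5s %d chars, trailing slash: %s"
              % (found, len(candidate), candidate.endswith("/")))
```

On an affected interpreter the short and long names should disagree about which spelling is accepted.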

    @moloney (mannequin) added the stdlib and type-bug labels Jul 16, 2014
    @serhiy-storchaka
    Member

    Could you please provide an example?

    @moloney
    Mannequin Author

    moloney mannequin commented Jul 16, 2014

    Here is a script illustrating the issue.

    @bitdancer
    Member

    There is indeed special logic that triggers if the name is longer than 100 characters. Presumably it has a bug. Marking this as easy since it shouldn't be too hard, given the failure example, to figure out what is wrong and fix it (and turn the example into a unit test).

    It doesn't look like the relevant code has changed in Python 3, so the bug probably exists there as well.

    @bitdancer added the easy label Jul 18, 2014
    @gustaebel
    Mannequin

    gustaebel mannequin commented Jul 23, 2014

    Apparently, the problem is located in TarInfo._proc_gnulong(). I attached a patch.

    When tarfile reads an archive, it strips trailing slashes from all filenames, except GNUTYPE_LONGNAME headers, which is a bug. tarfile creates GNU_FORMAT tar files by default, hence it uses an additional GNUTYPE_LONGNAME header for filenames >100 chars. That's why tarfile_issue.py fails if used with PAX_FORMAT, because PAX_FORMAT doesn't have this bug.
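
One way to see the format difference is to inspect the type flag of the first 512-byte header block that tarfile writes (the type flag sits at byte offset 156 of a tar header). This is a minimal sketch; `first_header_type` is a hypothetical helper, and it looks only at what is written, not at how the names are parsed back.

```python
import io
import tarfile


def first_header_type(name, fmt):
    """Return the type flag of the first header block written for a directory."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=fmt) as tf:
        info = tarfile.TarInfo(name)
        info.type = tarfile.DIRTYPE
        tf.addfile(info)
    raw = buf.getvalue()
    return raw[156:157]  # single-byte type flag of the first block


long_name = "d" * 120

# GNU format prepends a GNUTYPE_LONGNAME block for names over 100 characters.
gnu_flag = first_header_type(long_name, tarfile.GNU_FORMAT)

# PAX format prepends an extended-header block (XHDTYPE) instead.
pax_flag = first_header_type(long_name, tarfile.PAX_FORMAT)

# A short name needs no extra block, so the first block is the directory itself.
short_flag = first_header_type("short_dir", tarfile.GNU_FORMAT)

print(gnu_flag == tarfile.GNUTYPE_LONGNAME,
      pax_flag == tarfile.XHDTYPE,
      short_flag == tarfile.DIRTYPE)
```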

    @zigg
    Mannequin

    zigg mannequin commented Jul 26, 2014

    Here is a 3.5 fix based on Lars Gustäbel's, with test.

    @puppet
    Mannequin

    puppet mannequin commented Aug 2, 2014

    Added Matt Behrens' test to Lars Gustäbel's 2.7 version.

    @vstinner
    Member

    This issue is 5 years old and has 4 patches: it is far from being "newcomer friendly", so I am removing the "easy" label.

    @vstinner removed the easy label Jul 29, 2019
    @afonari
    Mannequin

    afonari mannequin commented Sep 4, 2020

    Any updates on this?

    @vstinner
    Member

    vstinner commented Sep 4, 2020

    > Any updates on this?

    So far, nobody has proposed a pull request. So no, there is no update.

    Someone has to step in, dig into the issue, propose a fix, then someone else has to review the PR, and finally the PR should be merged.

    @akulakov
    Contributor

    The original issue was twofold:

    1. for names under 100 characters, lookup did not work WITH a trailing slash
    2. for names over 100 characters, lookup did not work WITHOUT a trailing slash

    The second part is no longer an issue (tested in 3.9 and 3.11 on macOS).

    Currently the issue is that a trailing slash doesn't work for lookup of dirs, regardless of the length of the name.

    This is inconsistent with the way shell commands work as well as various Python path related modules that tolerate trailing slash for dirs.

    This can cause users to wrongly assume a dir is absent in a tarfile, so I think it's worth fixing and I've added a PR with a test for both old and new issue.

    @serhiy-storchaka
    Member

    Well, the tar command strips trailing slashes (even from file paths), so it is reasonable to do this in getmember().

    $ mkdir dir
    $ touch dir/file
    $ tar cf archive.tar dir
    $ tar tf archive.tar dir
    dir/
    dir/file
    $ tar tf archive.tar dir/
    dir/
    dir/file
    $ tar tf archive.tar dir/file
    dir/file
    $ tar tf archive.tar dir/file/
    dir/file
    $ tar tf archive.tar dir/file////
    dir/file
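
The same tolerance can be sketched on top of the public API for interpreters that lack the fix. This is a hypothetical helper (`getmember_lenient`, plus `demo_archive` for illustration), not the change that was merged; it assumes that stripping trailing slashes from both the requested name and the stored names is acceptable.

```python
import io
import tarfile


def getmember_lenient(tf, name):
    """Look up a member while ignoring trailing slashes, as the tar command does."""
    wanted = name.rstrip("/")
    # Search from the end so the last entry for a name is returned.
    for member in reversed(tf.getmembers()):
        if member.name.rstrip("/") == wanted:
            return member
    raise KeyError("filename %r not found" % name)


def demo_archive():
    """Build an in-memory archive equivalent to `tar cf archive.tar dir`."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tf:
        directory = tarfile.TarInfo("dir")
        directory.type = tarfile.DIRTYPE
        tf.addfile(directory)
        tf.addfile(tarfile.TarInfo("dir/file"))  # empty regular file
    buf.seek(0)
    return tarfile.open(fileobj=buf, mode="r")


if __name__ == "__main__":
    with demo_archive() as tf:
        for query in ("dir", "dir/", "dir/file", "dir/file/", "dir/file////"):
            print(query, "->", getmember_lenient(tf, query).name)
```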

    @serhiy-storchaka
    Member

    New changeset cfadcc3 by andrei kulakov in branch 'main':
    bpo-21987: Fix TarFile.getmember getting a dir with a trailing slash (GH-30283)
    cfadcc3

    @miss-islington
    Contributor

    New changeset 1d11fdd by Miss Islington (bot) in branch '3.10':
    bpo-21987: Fix TarFile.getmember getting a dir with a trailing slash (GH-30283)
    1d11fdd

    @serhiy-storchaka
    Member

    New changeset 94d6434 by Miss Islington (bot) in branch '3.9':
    [3.9] bpo-21987: Fix TarFile.getmember getting a dir with a trailing slash (GH-30283) (GH-30738)
    94d6434

    @serhiy-storchaka added the 3.9, 3.10, and 3.11 labels Jan 23, 2022
    @ezio-melotti transferred this issue from another repository Apr 10, 2022