Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add filter to zipapp #75255

Closed
irmen mannequin opened this issue Jul 28, 2017 · 15 comments
Closed

add filter to zipapp #75255

irmen mannequin opened this issue Jul 28, 2017 · 15 comments
Assignees
Labels
3.7 (EOL) end of life easy stdlib Python modules in the Lib dir type-feature A feature request or enhancement

Comments

@irmen
Copy link
Mannequin

irmen mannequin commented Jul 28, 2017

BPO 31072
Nosy @brettcannon, @pfmoore, @irmen, @serhiy-storchaka
PRs
  • bpo-31072: Add filter to zipapp #3021
  • bpo-31072: Rename the new filter argument for zipapp.create_archive. #3049
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/pfmoore'
    closed_at = <Date 2017-08-09.13:39:28.760>
    created_at = <Date 2017-07-28.16:50:52.044>
    labels = ['3.7', 'easy', 'type-feature', 'library']
    title = 'add filter to zipapp'
    updated_at = <Date 2017-08-26.17:04:14.279>
    user = 'https://github.com/irmen'

    bugs.python.org fields:

    activity = <Date 2017-08-26.17:04:14.279>
    actor = 'paul.moore'
    assignee = 'paul.moore'
    closed = True
    closed_date = <Date 2017-08-09.13:39:28.760>
    closer = 'paul.moore'
    components = ['Library (Lib)']
    creation = <Date 2017-07-28.16:50:52.044>
    creator = 'irmen'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 31072
    keywords = ['easy']
    message_count = 15.0
    messages = ['299409', '299410', '299417', '299423', '300004', '300006', '300025', '300028', '300031', '300157', '300160', '300179', '300180', '300870', '300886']
    nosy_count = 4.0
    nosy_names = ['brett.cannon', 'paul.moore', 'irmen', 'serhiy.storchaka']
    pr_nums = ['3021', '3049']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue31072'
    versions = ['Python 3.7']

    @irmen
    Copy link
    Mannequin Author

    irmen mannequin commented Jul 28, 2017

    As briefly discussed on comp.lang.python, I propose to add an optional filter callback function to zipapp.create_archive.

    The function could perhaps work like the os.walk generator or maybe just lets you to return a simple boolean for every folder/file that it wants to include in the zip.

    My use case is that I sometimes don't want to include every file in the root folder into the zip file (I want to be able to skip temporary or irrelevant folders such as .git/.svn, .tox, .tmp and sometimes want to avoid including *.pyc/*.pyo files). Right now, I first have to manually clean up the folder before I can use zipapp.create_archive.

    (Instead of providing a filter callback fuction, another approach may be to provide your own dir/file generator instead, that fully replaces the internal file listing logic of zipapp.create_archive?)

    @irmen irmen mannequin assigned pfmoore Jul 28, 2017
    @irmen irmen mannequin added stdlib Python modules in the Lib dir easy type-feature A feature request or enhancement labels Jul 28, 2017
    @pfmoore
    Copy link
    Member

    pfmoore commented Jul 28, 2017

    I'd propose an extra argument to zipapp.create_archive, include_file=None (feel free to bikeshed on the name). If the argument is not None, then it should be a callable which will be called with a pathlib.Path object for each file that's selected for inclusion in the archive. The function should return a boolean - False means don't include this file.

    Because the create_archive function only gets a list of files internally (it uses Path.rglob()), the callable won't get passed directories, only the actual files (but it can of course check the full path to see what directory the file is in).

    The include_file argument is ignored when copying anything other than a filesystem directory (i.e., when the source argument is a filename or an open file object).

    @irmen
    Copy link
    Mannequin Author

    irmen mannequin commented Jul 28, 2017

    That sounds fine to me. I guess the paths passed to the function should be relative to the root folder being zipped?

    @pfmoore
    Copy link
    Member

    pfmoore commented Jul 28, 2017

    Yes, they can be.

    @pfmoore
    Copy link
    Member

    pfmoore commented Aug 9, 2017

    New changeset b811d66 by Paul Moore (Jeffrey Rackauckas) in branch 'master':
    bpo-31072: Add filter to zipapp (bpo-3021)
    b811d66

    @pfmoore
    Copy link
    Member

    pfmoore commented Aug 9, 2017

    Thanks to Jeffrey Rackauckas for the implementation of this feature.

    @pfmoore pfmoore closed this as completed Aug 9, 2017
    @serhiy-storchaka
    Copy link
    Member

    Wouldn't be better to name the parameter filterfunc for conformity with PyZipFile?

    I think the new feature needs at least the versionadded directive in the module documentation. And may be an entry in the What's New document.

    @serhiy-storchaka serhiy-storchaka added the 3.7 (EOL) end of life label Aug 9, 2017
    @pfmoore
    Copy link
    Member

    pfmoore commented Aug 9, 2017

    Good point - I wasn't even aware of the filterfunc argument in PyZipFile. I'll rename the argument.

    I wasn't initially sure about a what's new entry. I'll add one - and thanks for the reminder about versionadded.

    @pfmoore
    Copy link
    Member

    pfmoore commented Aug 9, 2017

    I've created a new PR 3049 adding the fixes you suggested (and tightening up the tests, as I noticed an untested aspect of the change while editing).

    @serhiy-storchaka
    Copy link
    Member

    There are two differences between filterfunc arguments in zipapp and PyZipFile:

    1. An argument of filterfunc is a string in PyZipFile and a Path object in zipapp.

    2. The patch in zipapp is relative to the root of the archive. The patch in PyZipFile is a path on FS (absolute or relative to the current working directory).

    I afraid that these differences can cause confusions.

    @pfmoore
    Copy link
    Member

    pfmoore commented Aug 11, 2017

    Zipapp uses path objects throughout, so making the filter function take a path object is consistent with that. I guess modifying PyZipFile to take either a string or a path object would be possible.

    As for the relative path, that's deliberate as I expect that a common use case will be to exclude a directory, and doing that via

    lambda pth: pth.parts[0] != 'dir_to_exclude'
    

    is a simple possibility. Having to do this with an absolute path would just require the user to make the path relative, and we've already done that in zipapp so why duplicate the work?

    So I guess I'm saying I want to keep both those choices. Do you think this is a sufficient problem that we should not use the same name? Any suggestions on a better name (or should we stick with the original include_file)?

    @brettcannon
    Copy link
    Member

    What about simply 'filter' as a name? or 'path_filter'?

    @pfmoore
    Copy link
    Member

    pfmoore commented Aug 11, 2017

    Sounds reasonable :-) I'm not going to be checking mails for a week or so, so I'll revisit this once I get back.

    @pfmoore
    Copy link
    Member

    pfmoore commented Aug 26, 2017

    OK. There's been no further comments, and I think the differences with PyZipFile's filterfunc are sufficient to warrant using a different name. I'm going to go with "filter". It's short, and says what it means.

    @pfmoore
    Copy link
    Member

    pfmoore commented Aug 26, 2017

    New changeset 0780bf7 by Paul Moore in branch 'master':
    bpo-31072: Rename the new filter argument for zipapp.create_archive. (bpo-3049)
    0780bf7

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 (EOL) end of life easy stdlib Python modules in the Lib dir type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    3 participants