Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unit test needed for IDLE's GrepDialog.py's findfiles() #67394

Closed
AlSweigart mannequin opened this issue Jan 9, 2015 · 10 comments
Closed

Unit test needed for IDLE's GrepDialog.py's findfiles() #67394

AlSweigart mannequin opened this issue Jan 9, 2015 · 10 comments
Assignees
Labels
3.7 (EOL) end of life 3.8 only security fixes topic-IDLE type-feature A feature request or enhancement

Comments

@AlSweigart
Copy link
Mannequin

AlSweigart mannequin commented Jan 9, 2015

BPO 23205
Nosy @terryjreedy, @csabella, @miss-islington
PRs
  • bpo-23205: IDLE: Add tests and refactor grep's findfiles #12203
  • [3.7] bpo-23205: IDLE: Add tests and refactor grep's findfiles (GH-12203) #12512
  • Files
  • idle_grep_findfiles_test.diff
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/csabella'
    closed_at = <Date 2019-03-23.12:23:39.075>
    created_at = <Date 2015-01-09.08:16:00.708>
    labels = ['3.8', 'expert-IDLE', 'type-feature', '3.7']
    title = "Unit test needed for IDLE's GrepDialog.py's findfiles()"
    updated_at = <Date 2019-03-23.17:51:13.070>
    user = 'https://bugs.python.org/AlSweigart'

    bugs.python.org fields:

    activity = <Date 2019-03-23.17:51:13.070>
    actor = 'cheryl.sabella'
    assignee = 'cheryl.sabella'
    closed = True
    closed_date = <Date 2019-03-23.12:23:39.075>
    closer = 'cheryl.sabella'
    components = ['IDLE']
    creation = <Date 2015-01-09.08:16:00.708>
    creator = 'Al.Sweigart'
    dependencies = []
    files = ['37652']
    hgrepos = []
    issue_num = 23205
    keywords = ['patch']
    message_count = 10.0
    messages = ['233727', '233810', '233834', '336437', '336482', '337339', '338653', '338668', '338674', '338691']
    nosy_count = 4.0
    nosy_names = ['terry.reedy', 'Al.Sweigart', 'cheryl.sabella', 'miss-islington']
    pr_nums = ['12203', '12512']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue23205'
    versions = ['Python 3.7', 'Python 3.8']

    @AlSweigart
    Copy link
    Mannequin Author

    AlSweigart mannequin commented Jan 9, 2015

    GrepDialog.py's findfiles() method lacks a unit test.

    The comments in the unit test stub in test_grep.py correctly notes that since findfiles() method does not rely on the GrepDialog class, it can be made into a function.

    The attached patch does the following:

    • Moves findfiles() to be a function in the GrepDialog.py module.
    • Adds a unit test for findfiles()
    • Adds sensible default arguments
    • As suggested by the previous stub comments, findfiles() returns a generator instead of a full list
    • Changes 'list' and 'dir' names since those are built-ins
    • Uses os.walk() instead of reinventing it.

    There were so many changes to make that I eventually just rewrote the entire findfiles function. I've checked that the new version returns the same strings as the old version.

    @AlSweigart AlSweigart mannequin added the topic-IDLE label Jan 9, 2015
    @terryjreedy
    Copy link
    Member

    I have been putting off re-writing findfiles because it partly duplicates os.listdir, which should perhaps be used instead, except that a new, improved, os.scandir is in the works: PEP-471 and bpo-22524.

    I believe filefiles currently searches depth first, whereas I would generally prefer breadth first. For instance, I would like all hits in /lib together and then hits in /lib/idlelib, /lib/test, /lib/tinter, etc. What do you think?

    @terryjreedy terryjreedy added the type-feature A feature request or enhancement label Jan 10, 2015
    @AlSweigart
    Copy link
    Mannequin Author

    AlSweigart mannequin commented Jan 11, 2015

    I checked with a couple grep programs and they use depth first. Which makes sense, since you'd want the return order to be something like:

    /a/spam.txt
    /a/aa/spam.txt
    /a/bb/spam.txt
    /x/spam.txt
    /y/spam.txt
    /z/spam.txt

    ...instead of the bread-first:

    /a/spam.txt
    /x/spam.txt
    /y/spam.txt
    /z/spam.txt
    /a/aa/spam.txt
    /a/bb/spam.txt

    However, it turns out this is moot since the first thing GrepDialog.py does with the returned results is sort them.

    @terryjreedy
    Copy link
    Member

    Findfiles was more like os.walk than os.listdir. os.walk now uses os.scandir instead of os.listdir. From 3.7.2 doc:

    "By default, errors from the scandir() call are ignored. If optional argument onerror is specified, it should be a function; it will be called with one argument, an OSError instance. It can report the error to continue with the walk, or raise the exception to abort the walk. Note that the filename is available as the filename attribute of the exception object."

    We should delete the try: except: and pass in a function to print the name and error message. This could be done after getting a working patch.

    Defaults are not needed as findfiles is called with 3 positional args. The calling method should do the replacement of '' with os.curdir().

    The patch must be incomplete as generators do not have a sort method. Replace the first 3 lines of grep_it with

        folder, filepat = os.path.split(path)
        if not folder:
            folder = os.curdir()
        filelist = sorted(findfiles(folder, filepat, self.recvar.get())
    

    and replace 'list' with 'filelist' in the following code.

    Cheryl, grab the issue if you want to do this.

    @terryjreedy terryjreedy added 3.7 (EOL) end of life 3.8 only security fixes labels Feb 24, 2019
    @csabella
    Copy link
    Contributor

    Sure, thanks. grep was on list to add tests to, so I'll take a look at this as well. Thanks!

    @csabella csabella self-assigned this Feb 24, 2019
    @csabella
    Copy link
    Contributor

    csabella commented Mar 6, 2019

    I've opened a PR with the changes. I did several commits, one for each stage. That is, the first one adds the test, then the second one moves findfiles to the module level, etc.

    I added the onerror function because if a directory that doesn't exist is entered as the path, it's nice to see that message at the top of the output window instead of just seeing zero hits. Plus, I had made a test for it. :-)

    One change that I didn't commit is that an alternative version of findfiles would be:

    def findfiles(folder, pattern, recursive):
        prefix = '**/' if recursive else ''
        yield from(pathlib.Path(folder).glob(f'{prefix}{pattern}'))

    The tests would have to be reworked, but manual testing showed it gave the same results, albeit without the onerror.

    One other comment about the sorting. If you change the sorted() in grep_it() to list() when you're looking at manual results, you'll see that list() shows all of one directory first, then all of the first child directory, etc (which makes sense since walk does each directory at a time). It's a quick way to compare the depth-first vs breadth first results.

    @terryjreedy
    Copy link
    Member

    Commit as-is. I will consider the Path.glob findfiles while working on bpo-36323. Like walk, it yields in breadth first order. Unlike walk, it requires relative paths (which bpo-37323 may need anyway) and needs code to deal with that.

    I like the breadth first listing from glob. Would it make any sense to give users a choice? What does unix grep do?

    @csabella
    Copy link
    Contributor

    New changeset d60f658 by Cheryl Sabella in branch 'master':
    bpo-23205: IDLE: Add tests and refactor grep's findfiles (GH-12203)
    d60f658

    @miss-islington
    Copy link
    Contributor

    New changeset 5ab6650 by Miss Islington (bot) in branch '3.7':
    bpo-23205: IDLE: Add tests and refactor grep's findfiles (GH-12203)
    5ab6650

    @csabella
    Copy link
    Contributor

    On linux, grep does depth first, so searching for 'idle' from Lib.idlelib returns:

    --- cut ---
    help.py
    history.py
    idle.py
    all of idle_test/
    __init__.py
    iomenu.py
    --- cut ---

    Although, within idle_test, the files aren't in alphabetical order. Also, as you can see __init__ is before iomenu, so underscores seem to be ignored. On SO, it looks like they suggest piping it to sort if one wants a given ordering.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 (EOL) end of life 3.8 only security fixes topic-IDLE type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    3 participants