Issue34417
Created on 2018-08-16 21:51 by Phillip.M.Feldman@gmail.com, last changed 2022-04-11 14:59 by admin. This issue is now closed.
Messages (5) | |||
---|---|---|---|
msg323623 - (view) | Author: Phillip M. Feldman (Phillip.M.Feldman@gmail.com) | Date: 2018-08-16 21:51 | |
`imp.find_module` goes down in flames if one tries to pass an iterator rather than a list of folders.

Firstly, the message that it produces is somewhat misleading:

    RuntimeError: sys.path must be a list of directory names

Secondly, it would be helpful if one could pass an iterator. I'm thinking in particular of the situation where one wants to import something from a large folder tree, and the module in question is likely to be found early in the search process, so that it is more efficient to explore the folder tree incrementally.
|||
msg323660 - (view) | Author: Eric Snow (eric.snow) * | Date: 2018-08-17 16:05 | |
There are several issues at hand here, Phillip. I'll enumerate them below.

Thanks for taking the time to let us know about this. However, I'm closing this issue since realistically the behavior of imp.find_module() isn't going to change, particularly in Python 2.7. Even though the issue is closed, feel free to reply, particularly about how you are using imp.find_module() (we may be able to point you toward how to use importlib instead).

Also, I've changed this issue's type to "enhancement". imp.find_module() is working as designed, so what you are looking for is a feature request. Consequently there's a much higher bar for justifying a change. Here are reasons why the requested change doesn't reach that bar:

1. Python 2.7 is closed to new features.

So imp.find_module() is not going to change.

2. Python 2.7 is nearing EOL.

We highly recommend that everyone move to Python 3 as soon as possible. Hopefully you are in a position to do so. If you're stuck on Python 2.7 then you miss the advantages of importlib, along with a ton of other benefits.

If you are not going to be able to migrate before 2020 then send an email to python-list@python.org asking for recommendations on what to do.

3. Starting in Python 3.4, using the imp module is discouraged/deprecated.

"Deprecated since version 3.4: The imp package is pending deprecation in favor of importlib." [1]

The importlib package should have everything you need. What are you using imp.find_module() for? We should be able to demonstrate the equivalent using importlib.

4. The import machinery is designed around using a list (the builtin type, not the concept) for the "module search path".

* imp.find_module(): "the list of directory names given by sys.path is searched" [2]
* imp.find_module(): "Otherwise, path must be a list of directory names" [2]
* importlib.find_loader() (deprecated): "optionally within the specified path" (which defaults to sys.path) [3]
* importlib.util.find_spec(): doesn't even have a "path" parameter [4]
* ModuleSpec.submodule_search_locations: "List of strings for where to find submodules" [5]
* sys.path: "A list of strings that specifies the search path for modules. ... Only strings and bytes should be added to sys.path; all other data types are ignored during import." [6]

[1] https://docs.python.org/3/library/imp.html#module-imp
[2] https://docs.python.org/3/library/imp.html#imp.find_module
[3] https://docs.python.org/3/library/importlib.html#importlib.find_loader
[4] https://docs.python.org/3/library/importlib.html#importlib.util.find_spec
[5] https://docs.python.org/3/library/importlib.html#importlib.machinery.ModuleSpec.submodule_search_locations
[6] https://docs.python.org/3/library/sys.html#sys.path
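As a rough illustration of the importlib route mentioned in point 3, the sketch below looks up a module in an explicit list of folders with `importlib.machinery.PathFinder.find_spec()` and then loads it. It assumes Python 3.5 or later. The helper names `find_module_spec` and `load_from_spec` and the module name `demo_mod` are invented for this example and are not part of importlib.

```python
import importlib.machinery
import importlib.util
import os
import tempfile


def find_module_spec(module_name, folders):
    """Return a ModuleSpec for `module_name` found in `folders`, or None.

    `folders` plays the role of the `path` argument of imp.find_module():
    a list of directory names to search, in order.
    """
    return importlib.machinery.PathFinder.find_spec(module_name, list(folders))


def load_from_spec(spec):
    """Create and execute the module that `spec` describes."""
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


# Demonstration with a throwaway module (the name `demo_mod` is hypothetical).
with tempfile.TemporaryDirectory() as folder:
    with open(os.path.join(folder, "demo_mod.py"), "w") as handle:
        handle.write("VALUE = 42\n")
    spec = find_module_spec("demo_mod", [folder])
    module = load_from_spec(spec)
    found_origin = spec.origin  # full path of the file that was found
```

Unlike `imp.find_module()`, which raises `ImportError` when nothing is found, `PathFinder.find_spec()` returns `None`, so callers of this sketch need to check the result before loading.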
|||
msg323820 - (view) | Author: Phillip M. Feldman (Phillip.M.Feldman@gmail.com) | Date: 2018-08-21 04:51 | |
It appears that the `importlib` package has the same issue: One can't provide an iterator for the path. When searching a large folder tree for an item that is likely to be found early in the search process (i.e., at a high level in the folder tree), the available functionality is massively inefficient. So, I wrote my own wrapper for `imp.find_module` to do this job, and will eventually modify this code to use `importlib` instead of `imp`. |
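The incremental search described in this message can be approximated on Python 3 without changing importlib: walk the tree lazily and ask the finder about one folder at a time, stopping at the first hit. This is only a sketch under that assumption. `iter_folders`, `find_spec_incrementally`, `tracked`, and `lazy_demo_mod` are hypothetical names, and the only importlib call used is `importlib.machinery.PathFinder.find_spec()`.

```python
import importlib.machinery
import os
import tempfile


def iter_folders(root):
    """Yield `root` and then each folder beneath it, one at a time (top-down)."""
    for current, _subfolders, _files in os.walk(root):
        yield current


def find_spec_incrementally(module_name, folders):
    """Query one folder at a time and stop at the first one holding the module.

    `folders` may be any iterable, including a generator, because each
    folder is wrapped in its own one-element list before it is searched.
    """
    for folder in folders:
        spec = importlib.machinery.PathFinder.find_spec(module_name, [folder])
        if spec is not None:
            return spec
    return None


visited = []


def tracked(folders):
    """Record every folder that is actually pulled from `folders`."""
    for folder in folders:
        visited.append(folder)
        yield folder


# The module sits at the top of a deeper tree, so the search should stop early.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "a", "b", "c"))
    with open(os.path.join(root, "lazy_demo_mod.py"), "w") as handle:
        handle.write("VALUE = 1\n")
    spec = find_spec_incrementally("lazy_demo_mod", tracked(iter_folders(root)))
    found_name = os.path.basename(spec.origin)
```

Because the generator is consumed only as far as needed, folders below the first match are never listed. One caveat: a plain directory whose name matches `module_name` can be reported as a namespace package, so a stricter version might also check `spec.origin`.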
|||
msg323837 - (view) | Author: Brett Cannon (brett.cannon) * | Date: 2018-08-21 17:32 | |
Saying "the available functionality is massively inefficient" is unnecessarily hostile towards those of us who actually wrote and maintain that code. Without diving into the code, chances are that requirement is there so that the C code can use macros to access the list as efficiently as possible. Now if you want to propose specific changes to importlib's code for it to work with iterables instead of just lists then we would be happy to review the pull request. |
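If the machinery does require real `list` objects, as this message speculates, a caller can still stay lazy by consuming an iterator in small batches and handing each batch over as a list. The sketch below assumes Python 3 and is not a proposed patch to importlib. `batched_lists`, `find_spec_in_batches`, the `batch_size` parameter, and `batch_demo_mod` are hypothetical names.

```python
import importlib.machinery
import itertools
import os
import tempfile


def batched_lists(iterable, size):
    """Yield successive real `list` objects holding up to `size` items each."""
    iterator = iter(iterable)
    while True:
        chunk = list(itertools.islice(iterator, size))
        if not chunk:
            return
        yield chunk


def find_spec_in_batches(module_name, folders, batch_size=8):
    """Search `folders` (any iterable) one list-sized batch at a time."""
    for chunk in batched_lists(folders, batch_size):
        spec = importlib.machinery.PathFinder.find_spec(module_name, chunk)
        if spec is not None:
            return spec
    return None


# Five sibling folders; the module lives in the fourth one.
with tempfile.TemporaryDirectory() as root:
    folders = [os.path.join(root, "folder%d" % index) for index in range(5)]
    for folder in folders:
        os.mkdir(folder)
    with open(os.path.join(folders[3], "batch_demo_mod.py"), "w") as handle:
        handle.write("VALUE = 3\n")
    spec = find_spec_in_batches("batch_demo_mod", iter(folders), batch_size=2)
    found_in = os.path.basename(os.path.dirname(spec.origin))
```

The batch size trades laziness against the number of finder calls: a size of 1 gives a fully incremental search, while a larger size pulls more folders from the iterator before each lookup.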
|||
msg323838 - (view) | Author: Phillip M. Feldman (Phillip.M.Feldman@gmail.com) | Date: 2018-08-21 18:37 | |
My apologies for the tone of my remark. I am grateful to you and others who donate their time to develop the code.

I'm attaching the wrapper code that I created to work around the problem.

Phillip

    # Python 2.7 code: it relies on `basestring` and on the `imp` module.
    import imp
    import itertools
    import os

    # `import_path` (a list of folder paths) must be defined at module level.


    def expander(paths='./*'):
        r"""
        OVERVIEW

        This function is a generator, i.e., creates an iterator that recursively
        searches a list of folders in an incremental fashion. This approach is
        advantageous when the folder tree(s) to be searched are large and the item
        of interest is likely to be found early in the process.

        INPUTS

        `paths` must be either (a) a list of folder paths (each of which is a
        string) or (b) a single string containing one or more folder paths
        separated by the OS-specific path delimiter.

        Each path in `paths` must be either (a) an existing folder or (b) an
        existing folder followed by '/*' or '\*'. In case (a), the folder string
        is copied from the input (`paths`) to the output result verbatim. In case
        (b), the folder string is replaced by an expanded list that includes not
        only the base (the portion of the path that remains after the '/*' or '\*'
        has been removed), but all subfolders as well.

        RETURN VALUES

        The returned value is an iterator. Invoking the `next` method of the
        iterator produces one folder path at a time.
        """
        if isinstance(paths, basestring):
            paths = paths.split(os.pathsep)
        elif not isinstance(paths, list):
            raise TypeError("`paths` must be either a string or a list of strings.")

        # Folders that have already been yielded, so that none is repeated.
        found = set()

        for path in paths:
            if path.endswith('/*') or path.endswith('\\*'):
                # A recursive search of subfolders is required:
                for item in os.walk(path[:-2]):
                    base = os.path.abspath(item[0])
                    # The base folder itself is a candidate, followed by its
                    # immediate subfolders.
                    candidates = [base] + [os.path.join(base, nested)
                                           for nested in item[1]]
                    for folder in candidates:
                        if folder not in found:
                            found.add(folder)
                            yield folder
            else:
                # No recursive search is required:
                if path not in found:
                    found.add(path)
                    yield path


    def find_module(module_name, in_folders=None):
        """
        This function finds a module and returns the fully-qualified file name.
        Folders from `in_folders`, if specified, are searched first, followed by
        folders in the global `import_path` list.

        If any folder name in `in_folders` or `import_path` ends with an asterisk,
        indicating that a recursive search is required, `files.expander` is
        invoked to create iterators that return one folder at a time, and
        `imp.find_module` is invoked separately for each of these folders.

        EXPLICIT INPUTS

        `module_name` is the unqualified name of the module to be found.

        `in_folders` is an optional list of additional folders to be searched
        before the folders in `import_path` are searched.

        IMPLICIT INPUTS

        `import_path` is obtained from the global namespace.

        RETURN VALUES

        If `find_module` is able to find the requested module, it returns the
        same three return values (`f`, `filename`, and `description`) that
        `imp.find_module` would return.
        """
        if in_folders is None:
            in_folders = []
        elif isinstance(in_folders, basestring):
            in_folders = [in_folders]
        elif not isinstance(in_folders, list):
            raise TypeError("If specified, `in_folders` must be either a string or a "
              "list of strings. (A string is wrapped to produce a length-1 list).")

        if any(item.endswith('*') for item in in_folders + import_path):
            last_error = None
            for folder in itertools.chain(
              expander(in_folders), expander(import_path)):
                try:
                    # Search one folder at a time so that the folder tree is
                    # explored only as far as necessary.
                    return imp.find_module(module_name, [folder])
                except ImportError as error:
                    last_error = error
            if last_error:
                raise last_error
            # The expanded search path was empty.
            raise ImportError("No module named " + module_name)
        else:
            return imp.find_module(module_name, in_folders + import_path)
History

Date | User | Action | Args |
---|---|---|---|
2022-04-11 14:59:04 | admin | set | github: 78598 |
2018-08-21 18:37:08 | Phillip.M.Feldman@gmail.com | set | messages: + msg323838 |
2018-08-21 17:32:32 | brett.cannon | set | messages: + msg323837 |
2018-08-21 04:51:38 | Phillip.M.Feldman@gmail.com | set | messages: + msg323820 |
2018-08-17 16:05:35 | eric.snow | set | status: open -> closed; type: behavior -> enhancement; nosy: + eric.snow, brett.cannon; messages: + msg323660; resolution: wont fix; stage: resolved |
2018-08-16 21:51:44 | Phillip.M.Feldman@gmail.com | create |