Title: pkgutil.walk_packages "prefix" option docs are misleading
Type: behavior Stage:
Components: Documentation Versions: Python 3.7, Python 3.6
Status: open Resolution:
Dependencies: Superseder:
Assigned To: docs@python Nosy List: cykerway, docs@python, eric.snow, ncoghlan
Priority: normal Keywords:

Created on 2018-04-03 02:25 by cykerway, last changed 2018-04-06 14:30 by ncoghlan.

Files: uploaded by cykerway, 2018-04-04 23:19. Description: Test program showing difference of behavior.
Messages (4)
msg314850 - (view) Author: Cyker Way (cykerway) * Date: 2018-04-03 02:25
The current implementation of `pkgutil.walk_packages()` is confusing. If users don't explicitly provide the `prefix` parameter, they may get incomplete results with no exception raised. The doc says:

>   prefix is a string to output on the front of every module name on output.

But in fact, `prefix` is not merely an output formatter. This function cannot handle an empty prefix (which is the default) and will often encounter import errors, which are silently ignored unless an `onerror` function is supplied (by default, none is).
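
As a rough illustration of the behavior described above (an independent sketch, not the attached test program), the following builds a throwaway package tree and walks it with and without `prefix`. The names `demo_walk_pkg`, `inner_demo_walk`, `leaf`, and `build_tree` are hypothetical, chosen only for this example.

```python
import importlib
import os
import pkgutil
import sys
import tempfile

TOP = "demo_walk_pkg"      # hypothetical top-level package name
INNER = "inner_demo_walk"  # hypothetical subpackage name


def build_tree(root):
    """Create root/TOP/INNER/leaf.py, with an __init__.py at each package level."""
    pkg_dir = os.path.join(root, TOP)
    inner_dir = os.path.join(pkg_dir, INNER)
    os.makedirs(inner_dir)
    for path in (
        os.path.join(pkg_dir, "__init__.py"),
        os.path.join(inner_dir, "__init__.py"),
        os.path.join(inner_dir, "leaf.py"),
    ):
        open(path, "w").close()
    return pkg_dir


with tempfile.TemporaryDirectory() as root:
    pkg_dir = build_tree(root)
    sys.path.insert(0, root)  # makes TOP importable by name
    importlib.invalidate_caches()
    try:
        failed = []  # names that walk_packages tried and failed to import
        no_prefix = [
            name
            for _, name, _ in pkgutil.walk_packages([pkg_dir], onerror=failed.append)
        ]
        with_prefix = [
            name
            for _, name, _ in pkgutil.walk_packages([pkg_dir], prefix=TOP + ".")
        ]
    finally:
        # Undo the global side effects of the demonstration.
        sys.path.remove(root)
        for mod in [m for m in sys.modules if m.split(".")[0] == TOP]:
            del sys.modules[mod]

print(no_prefix)    # only the first level is listed
print(failed)       # the subpackage could not be imported under its bare name
print(with_prefix)  # the subpackage and its leaf module are both listed
```

With the empty prefix the walk stops at `inner_demo_walk`, because importing it under that bare name fails; with `prefix="demo_walk_pkg."` the qualified name is importable and the walk descends to `leaf`.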

See test program for details.


msg314960 - (view) Author: Cyker Way (cykerway) * Date: 2018-04-04 23:19
Update test program.
msg315017 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2018-04-06 14:22
I think this is actually two distinct problems: a documentation one (which should be addressed in the online docs for all currently maintained versions), and an actual functional one.

The documentation issue is the one you've reported: for the recursive descent in walk_packages to work with the current algorithm, the combination of the given prefix and the current global import configuration must allow each package to actually be imported. While there is a note about that limitation, it's currently thoroughly unclear.

The functional issue is two-fold:

1. pkgutil.iter_modules() doesn't identify PEP 420 namespace packages correctly (it ignores them as not being potential packages)

2. The recursive import to check pkg.__path__ uses a name-based global __import__ rather than a more state-independent technique

It's that second problem that introduces the "prefix must be set to get useful output" behaviour that you're currently seeing.
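
One way to read "more state independent" (an assumption here, since the message does not name a specific technique) is to ask each finder for a module spec and descend through `spec.submodule_search_locations`, without importing anything. The helper `walk_without_import` below is hypothetical and not part of pkgutil; it relies on the finder exposing `find_spec`, which the standard `FileFinder` does, and it has no protection against directory cycles.

```python
import os
import pkgutil
import tempfile


def walk_without_import(path, prefix=""):
    """Yield (name, ispkg) pairs, descending via finder specs instead of __import__."""
    for finder, name, ispkg in pkgutil.iter_modules(path, prefix):
        yield name, ispkg
        if not ispkg:
            continue
        # Ask the finder that listed the package where its submodules live.
        find_spec = getattr(finder, "find_spec", None)
        spec = find_spec(name) if find_spec is not None else None
        locations = list(spec.submodule_search_locations or []) if spec else []
        yield from walk_without_import(locations, name + ".")


with tempfile.TemporaryDirectory() as root:
    # Hypothetical layout: root/inner_demo_spec/{__init__.py, leaf.py}
    inner = os.path.join(root, "inner_demo_spec")
    os.makedirs(inner)
    for filename in ("__init__.py", "leaf.py"):
        open(os.path.join(inner, filename), "w").close()
    # root is never added to sys.path, and no prefix is given.
    found = list(walk_without_import([root]))

print(found)
```

In this sketch the subpackage's `leaf` module is found even though the package is not importable by name and `prefix` is left empty, since nothing depends on `sys.path` or `sys.modules`.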
msg315018 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2018-04-06 14:30
There is an existing issue for the PEP 420 limitation (it also notes why fixing that implicitly is a potential problem).
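
The PEP 420 limitation can be seen in a small sketch. The directory names `regular_demo` and `namespace_demo` are hypothetical; the example only assumes that `pkgutil.iter_modules()` skips directories lacking an `__init__.py`, as described in this thread, while the import system's `PathFinder` still resolves such a directory as a namespace package.

```python
import importlib.machinery
import os
import pkgutil
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Regular package: a directory containing an __init__.py.
    os.makedirs(os.path.join(root, "regular_demo"))
    open(os.path.join(root, "regular_demo", "__init__.py"), "w").close()

    # PEP 420 namespace package: a directory with a module but no __init__.py.
    os.makedirs(os.path.join(root, "namespace_demo"))
    open(os.path.join(root, "namespace_demo", "mod.py"), "w").close()

    # What pkgutil reports for this directory.
    listed = [name for _, name, _ in pkgutil.iter_modules([root])]

    # What the import machinery can resolve, without importing anything.
    spec = importlib.machinery.PathFinder.find_spec("namespace_demo", [root])
    importable_as_namespace = (
        spec is not None and spec.submodule_search_locations is not None
    )

print(listed)
print(importable_as_namespace)
```

Here `listed` contains only `regular_demo`, even though the import machinery is able to produce a package spec for `namespace_demo`.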

I've retitled this issue to be specifically about the misleading docs for the "prefix" option (and updated the metadata accordingly).

"Recursive descent in pkgutil.walk_packages depends on sys.path" would be a reasonable title for an issue to actually fix the operation to use a better algorithm based on newer importlib APIs, but I haven't filed that myself.
Date User Action Args
2018-04-06 14:30:40 ncoghlan set assignee: docs@python
components: + Documentation, - Library (Lib)
title: pkgutil.walk_packages gives incomplete results -> pkgutil.walk_packages "prefix" option docs are misleading
nosy: + docs@python
versions: + Python 3.7
messages: + msg315018
2018-04-06 14:22:24 ncoghlan set messages: + msg315017
2018-04-04 23:19:18 cykerway set files: +
messages: + msg314960
2018-04-04 23:16:05 cykerway set files: -
2018-04-04 18:52:58 brett.cannon set nosy: - brett.cannon
2018-04-04 06:12:04 ned.deily set nosy: + brett.cannon, ncoghlan, eric.snow
2018-04-03 02:25:38 cykerway create