classification
Title: Namespace packages have inconsistent __file__ and __spec__.origin
Type:
Stage: resolved
Components:
Versions: Python 3.8, Python 3.7

process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: barry
Nosy List: barry, eric.smith, eric.snow, maggyero, ned.deily, nedbat
Priority: normal
Keywords: patch

Created on 2017-12-13 15:48 by barry, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Pull Requests
URL      Status  Linked
PR 5481  merged  barry, 2018-02-01 20:14
PR 5994  merged  barry, 2018-03-05 19:23
Messages (8)
msg308206 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2017-12-13 15:48
Along the lines of Issue32303 there's another inconsistency in namespace package metadata.  Let's say I have a namespace package:

>>> importlib_resources.tests.data03.namespace
<module 'importlib_resources.tests.data03.namespace' (namespace)>

The package has no __file__ attribute, and it has a misleading __spec__.origin:

>>> importlib_resources.tests.data03.namespace.__spec__.origin
'namespace'
>>> importlib_resources.tests.data03.namespace.__file__
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: module 'importlib_resources.tests.data03.namespace' has no attribute '__file__'

This is especially bad because the documentation for __spec__.origin implies a correlation to __file__, and says:

"Name of the place from which the module is loaded, e.g. “builtin” for built-in modules and the filename for modules loaded from source. Normally “origin” should be set, but it may be None (the default) which indicates it is unspecified."

I don't particularly like that its origin is "namespace".  That's an odd special case that's unhelpful to test against (what if you import a non-namespace package from the directory "namespace"?)

What would break if __spec__.origin were (missing? or) None?
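
Editorial sketch, for readers who want to reproduce the report without importlib_resources installed. It creates an empty directory (no __init__.py), imports it as a namespace package, and collects the metadata discussed above. The package name demo_ns_pkg and the helper inspect_namespace_package are made up for illustration. On the interpreters described in this report the origin is 'namespace' and __file__ is absent; to my knowledge, after the fix tracked here both are None.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def inspect_namespace_package(name="demo_ns_pkg"):
    """Import an empty directory as a namespace package and report its metadata.

    The name is a hypothetical placeholder, not something from the report.
    """
    with tempfile.TemporaryDirectory() as root:
        # A directory without __init__.py is treated as a namespace portion.
        (Path(root) / name).mkdir()
        sys.path.insert(0, root)
        try:
            importlib.invalidate_caches()
            module = importlib.import_module(name)
            return {
                "origin": module.__spec__.origin,
                "has_file_attr": hasattr(module, "__file__"),
                "file": getattr(module, "__file__", None),
                "path": list(module.__path__),
            }
        finally:
            # Undo the changes to interpreter state.
            sys.path.remove(root)
            sys.modules.pop(name, None)


info = inspect_namespace_package()
print(info)
```

Reading __file__ through getattr() with a default keeps the snippet working both where the attribute is missing and where it is set to None.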
msg311463 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2018-02-01 20:13
3.5 is in security fix only mode, and this is not a security issue.
msg312946 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2018-02-26 19:34
Note that this change was originally also backported to 3.6 in PR 5504 but, due to third-party package regressions discovered in pre-release testing, the 3.6 change was reverted in PR 5591 prior to release of 3.6.5rc1.
msg313242 - (view) Author: Ned Batchelder (nedbat) * (Python triager) Date: 2018-03-05 11:09
Should this get an entry in the What's New?
msg313274 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2018-03-05 17:56
I guess it depends on whether you think this is a new feature or a bug fix.  Or, OTOH, since we had to revert for 3.6, maybe it makes sense either way since some code will be affected.
msg313277 - (view) Author: Ned Batchelder (nedbat) * (Python triager) Date: 2018-03-05 18:33
As is usual for me, I am here because some coverage.py code broke due to this change.  A diff between b1 and b2 found me the code change (thanks for the comment, btw!), but a What's New doesn't seem out of place.
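
Editorial sketch of the kind of defensive lookup a tool can use so that it behaves the same whether __file__ is missing or is None. This is not coverage.py's actual code; the helper name source_file_or_none is hypothetical.

```python
import types


def source_file_or_none(module):
    """Return the module's __file__ if it is a string path, otherwise None.

    Handles both a missing attribute and an attribute explicitly set to None.
    """
    filename = getattr(module, "__file__", None)
    if not isinstance(filename, str):
        return None
    return filename


# A bare module object has no __file__ attribute at all.
bare = types.ModuleType("bare_example")

# A module whose loader set __file__ to None.
none_file = types.ModuleType("none_file_example")
none_file.__file__ = None

# A module with an ordinary (made-up) file path.
with_file = types.ModuleType("with_file_example")
with_file.__file__ = "/tmp/with_file_example.py"

print(source_file_or_none(bare))
print(source_file_or_none(none_file))
print(source_file_or_none(with_file))
```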
msg313278 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2018-03-05 19:04
On Mar 5, 2018, at 10:33, Ned Batchelder <report@bugs.python.org> wrote:

> As is usual for me, I am here because some coverage.py code broke due to this change.  A diff between b1 and b2 found me the code change (thanks for the comment, btw!), but a What's New doesn't seem out of place.

Sounds good; I’ll work up a PR
msg330288 - (view) Author: Géry (maggyero) * Date: 2018-11-23 00:00
@barry You gave 2 reasons for changing __spec__.origin and __file__ for namespace packages.

Your 1st reason:

> I don't particularly like that its origin is "namespace".  That's an odd special case that's unhelpful to test against (what if you import a non-namespace package from the directory "namespace"?)

As far as I know, a non-namespace package always has an __init__.py file, so if it is imported from a directory named "namespace", its __spec__.origin and __file__ attributes are both equal to "path/to/package/namespace/__init__.py". So I don’t see a problem with having a "namespace" origin for namespace package specs.
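
Editorial sketch of the case described in this paragraph: a regular package (one with an __init__.py) living in a directory that happens to be called "namespace". The helper origin_of_regular_package is hypothetical, and the package is created in a temporary directory purely for the demonstration.

```python
import importlib
import os
import sys
import tempfile


def origin_of_regular_package(name="namespace"):
    """Create a regular package called `name` and return (spec.origin, __file__)."""
    with tempfile.TemporaryDirectory() as root:
        pkg_dir = os.path.join(root, name)
        os.mkdir(pkg_dir)
        # The empty __init__.py is what makes this a regular package.
        with open(os.path.join(pkg_dir, "__init__.py"), "w"):
            pass
        # Set aside any already-imported module of the same name.
        saved = sys.modules.pop(name, None)
        sys.path.insert(0, root)
        try:
            importlib.invalidate_caches()
            module = importlib.import_module(name)
            return module.__spec__.origin, module.__file__
        finally:
            sys.path.remove(root)
            sys.modules.pop(name, None)
            if saved is not None:
                sys.modules[name] = saved


origin, filename = origin_of_regular_package()
print(origin)
print(filename)
```

Here both values are the path to the package's __init__.py, so the bare string "namespace" is never the origin of a regular package loaded from source.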

In addition, PEP 420, which introduced implicit namespace packages in Python 3.3, clearly stated that having no __file__ attribute was intended for namespace packages, and more generally that setting __file__ was left at the discretion of the module’s loader and no longer limited to built-in modules (https://www.python.org/dev/peps/pep-0420/#module-reprs):

> Previously, module reprs were hard coded based on assumptions about a module's __file__ attribute. If this attribute existed and was a string, it was assumed to be a file system path, and the module object's repr would include this in its value. The only exception was that PEP 302 reserved missing __file__ attributes to built-in modules, and in CPython, this assumption was baked into the module object's implementation. Because of this restriction, some modules contained contrived __file__ values that did not reflect file system paths, and which could cause unexpected problems later (e.g. os.path.join() on a non-path __file__ would return gibberish).
> This PEP relaxes this constraint, and leaves the setting of __file__ to the purview of the loader producing the module. Loaders may opt to leave __file__ unset if no file system path is appropriate. Loaders may also set additional reserved attributes on the module if useful. This means that the definitive way to determine the origin of a module is to check its __loader__ attribute.
> For example, namespace packages as described in this PEP will have no __file__ attribute because no corresponding file exists.
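
Editorial sketch of one way to recognise a namespace package without comparing __spec__.origin to a magic string. Note that this is a heuristic of my own, not the __loader__ check the PEP recommends: it treats a module as a namespace package when its spec marks it as a package but it has no string __file__. The helper name looks_like_namespace_package is hypothetical, the specs below are built by hand for illustration, and packages from unusual loaders that do not set __file__ would also match.

```python
import types
from importlib.machinery import ModuleSpec


def looks_like_namespace_package(module):
    """Heuristic: a package (has search locations) with no file path."""
    spec = getattr(module, "__spec__", None)
    if spec is None:
        return False
    is_package = spec.submodule_search_locations is not None
    has_file = isinstance(getattr(module, "__file__", None), str)
    return is_package and not has_file


# Hand-built stand-in for a namespace package: a package spec with no origin.
ns_module = types.ModuleType("ns_example")
ns_module.__spec__ = ModuleSpec("ns_example", None, is_package=True)

# Hand-built stand-in for a regular package loaded from a (made-up) path.
regular_path = "/tmp/regular_example/__init__.py"
regular_module = types.ModuleType("regular_example")
regular_module.__spec__ = ModuleSpec(
    "regular_example", None, origin=regular_path, is_package=True
)
regular_module.__file__ = regular_path

# Hand-built stand-in for a plain, non-package module.
plain_module = types.ModuleType("plain_example")
plain_module.__spec__ = ModuleSpec("plain_example", None)

print(looks_like_namespace_package(ns_module))
print(looks_like_namespace_package(regular_module))
print(looks_like_namespace_package(plain_module))
```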

Your 2nd reason:

> This is especially bad because the documentation for __spec__.origin implies a correlation to __file__, and says:
> "Name of the place from which the module is loaded, e.g. “builtin” for built-in modules and the filename for modules loaded from source. Normally “origin” should be set, but it may be None (the default) which indicates it is unspecified."

I agree here, so why not update the documentation instead of changing the implementation, which followed PEP 420?
History
Date                 User        Action  Args
2022-04-11 14:58:55  admin       set     github: 76486
2018-11-23 00:00:21  maggyero    set     nosy: + maggyero, eric.snow
                                         messages: + msg330288
2018-03-05 19:23:42  barry       set     pull_requests: + pull_request5761
2018-03-05 19:04:08  barry       set     messages: + msg313278
2018-03-05 18:33:39  nedbat      set     messages: + msg313277
2018-03-05 17:56:54  barry       set     messages: + msg313274
2018-03-05 11:09:53  nedbat      set     nosy: + nedbat
                                         messages: + msg313242
2018-02-26 19:34:41  ned.deily   set     nosy: + ned.deily
                                         messages: + msg312946
                                         versions: - Python 3.6
2018-02-03 04:21:56  barry       set     status: open -> closed
                                         resolution: fixed
                                         stage: patch review -> resolved
2018-02-01 20:14:48  barry       set     keywords: + patch
                                         stage: patch review
                                         pull_requests: + pull_request5312
2018-02-01 20:13:39  barry       set     messages: + msg311463
                                         versions: - Python 3.5
2018-02-01 15:55:46  barry       set     assignee: barry
                                         versions: + Python 3.8
2017-12-13 15:57:06  eric.smith  set     nosy: + eric.smith
2017-12-13 15:48:41  barry       create