
Author eric.snow
Recipients barry, brett.cannon, eric.snow
Date 2021-10-21.16:52:05
Message-id <CALFfu7BVwCsThozk-DK_+qSovP0BpAQMbBKdyZmAs6ZzvK9TAA@mail.gmail.com>
In-reply-to <18C8915B-689A-49E5-BE0B-33C389540A87@python.org>
Content
On Wed, Oct 20, 2021 at 6:11 PM Barry A. Warsaw <report@bugs.python.org> wrote:
> I guess a question to answer then is whether we philosophically want the module attributes to be equivalent to the spec attributes.  And by equivalent, I mean enforced to be exactly so, and thus a proxy.  To me, the duplication is a wart that we should migrate away from so there’s only one place for these attributes, and that should be the spec.
>
> Here is the mapping we currently describe in the docs:
>
> mod.__name__ === __spec__.name
> mod.__package__ === __spec__.parent
> mod.__loader__ === __spec__.loader
> mod.__file__ === __spec__.origin
> mod.__path__ === __spec__.submodule_search_locations
> mod.__cached__ === __spec__.cached
>
> But right now, they don’t have to stay in sync, and I don’t think it’s reasonable to put the onus on the user to keep them in sync, because it’s unclear what code uses which attribute.  Okay, so you can just set them both to be safe, but then you can’t do that with __spec__.parent/__package__

Currently any of the module attrs can differ from the corresponding
spec attr.  In two cases they can legitimately differ: __name__ (with
__main__) and __file__ (with frozen stdlib modules).  For the rest,
they should be in sync.
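
As a rough illustration of what "in sync" means here, the following
sketch compares each module attr from the quoted mapping with its spec
counterpart.  The helper name out_of_sync() and the "demo" module are
made up for this example; nothing like this exists in importlib.

```python
import importlib.machinery
import importlib.util

# Module attribute -> spec attribute, following the mapping quoted above.
ATTR_MAP = {
    "__name__": "name",
    "__package__": "parent",
    "__loader__": "loader",
    "__file__": "origin",
    "__path__": "submodule_search_locations",
    "__cached__": "cached",
}
_MISSING = object()


def out_of_sync(module):
    """Return {module_attr: (module_value, spec_value)} for mismatches.

    Hypothetical helper: attrs that are not set on the module are skipped.
    """
    spec = module.__spec__
    diffs = {}
    for mod_attr, spec_attr in ATTR_MAP.items():
        mod_value = getattr(module, mod_attr, _MISSING)
        if mod_value is _MISSING:
            continue
        spec_value = getattr(spec, spec_attr)
        if mod_value != spec_value:
            diffs[mod_attr] = (mod_value, spec_value)
    return diffs


# Build a throwaway module from a spec, then let one attr drift.
demo_spec = importlib.machinery.ModuleSpec("demo", loader=None)
demo = importlib.util.module_from_spec(demo_spec)
demo.__package__ = "elsewhere"  # nothing forces this to match spec.parent
print(out_of_sync(demo))
```

Nothing in the import system rejects the assignment, which is the
point of the quoted concern.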

Treating the spec as the single source of truth makes sense.  My only
concern has been that you can no longer determine how a module was
originally imported once the spec is changed.  However, I just
realized that you can always run importlib.util.find_spec() to
reproduce that info (with some minor caveats).  So now I'm less
concerned about that. :)
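
One caveat worth spelling out: when the module is already in
sys.modules, importlib.util.find_spec() returns that module's current
__spec__ rather than searching again.  A minimal sketch of asking the
finders directly, assuming a top-level module and ignoring import
locks (rediscover_spec() is a made-up name):

```python
import json
import sys


def rediscover_spec(name, path=None):
    """Ask each meta path finder for a fresh spec (hypothetical helper).

    Unlike importlib.util.find_spec(), this does not consult sys.modules,
    so it is not affected by later changes to module.__spec__.
    """
    for finder in sys.meta_path:
        find = getattr(finder, "find_spec", None)
        if find is None:
            continue
        spec = find(name, path)
        if spec is not None:
            return spec
    return None


fresh = rediscover_spec("json")
print(fresh.name, fresh.origin)
print(fresh.origin == json.__spec__.origin)
```

For a submodule, path would need to be the parent package's __path__.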

Notably, users have forever(?) been able to modify all of the module
attrs, with impact on the import system: __package__ and __path__
affecting later imports, and the rest affecting reload.
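
For example, here is a small sketch of __path__ affecting a later
import.  The package and directory names (demo_pkg, extra, plugin) are
invented for the demonstration, and everything lives in a temporary
directory.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def demo_path_affects_imports():
    """Extend a package's __path__ after import, then import a submodule
    that only exists in the added directory."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "demo_pkg").mkdir()
        (root / "demo_pkg" / "__init__.py").write_text("")
        (root / "extra").mkdir()
        (root / "extra" / "plugin.py").write_text("VALUE = 42\n")

        sys.path.insert(0, str(root))
        try:
            importlib.invalidate_caches()
            pkg = importlib.import_module("demo_pkg")
            # Post-import modification: the submodule search now also
            # looks in a directory outside the package.
            pkg.__path__.append(str(root / "extra"))
            plugin = importlib.import_module("demo_pkg.plugin")
            return plugin.VALUE
        finally:
            # Undo the global state touched by the demonstration.
            sys.path.remove(str(root))
            sys.modules.pop("demo_pkg.plugin", None)
            sys.modules.pop("demo_pkg", None)


print(demo_path_affects_imports())
```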

FWIW, an "advantage" of the module attrs is that they can be set in
the module code.  The same is true for the corresponding spec attrs
but with just enough indirection to require more intent.

Regardless, the idea of post-import modifications to modules/specs has
always made me uncomfortable.  As a user I'd expect an alternative
that feels less like a (non-obvious) low-level hack.

====

To me here are the important questions:

1. when does code ever modify the module attrs (or spec) and why?
2. should we distinguish the roles of the module attrs and spec
(how-module-was-loaded vs. how-module-will-reload vs.
how-module-impacts-other-imports)?
3. would it make sense to store spec modifications separately from the
spec (e.g. on the module)?
4. which attrs should be deprecated?
5. should any module attrs (the ones that don't get eventually
removed) be read-only?  What about spec attrs?
6. would it be better to provide importlib.util.* helpers to address
those needs, instead of having folks modify the module/spec directly?

My take:

1. that would be nice to know :)
2. that depends on what matters in practice.  My gut says the
distinctions aren't important enough to do anything about it, except
where there are legitimate differences between the module and spec.

Currently the module attrs cover all three roles.  The spec only
covers how-module-was-loaded (but is used as a fallback for the other
two roles in *most* cases).

Those two special cases, with __name__ and __file__ being out of sync,
are meaningful only for introspection, rather than affecting the
import machinery.  (In the case of frozen modules that have __file__
set, note that spec.has_location is False.)  I'm not sure how these
fit in with the different roles.

Advantages to keeping the spec exclusively how-module-was-imported:

* it's what I'd expect; having to call importlib.util.find_spec()
isn't the obvious thing
* the loader can modify the spec, so importlib.util.find_spec() won't
necessarily match

Neither of those appears important enough to warrant keeping the
status quo.  The disadvantages seem heavier (maintenance costs and user
confusion from (unnecessarily) having multiple sources of truth).

3. probably not, though it depends on (2)

However, if all those module attrs become read-only then we would need
to figure out where to store __name__ and __file__ in those special
cases.

4. everything except __name__ and __file__ (and probably __path__)
5. for modules, yes; for the spec, only if we stick with the one role

On modules I'd expect all of them to become properties regardless,
with most of them becoming read-only eventually:

getter:
* proxy the corresponding spec attr
* a deprecation warning if it isn't an attr that needs to stay

setter:
* proxy the corresponding spec attr
* a deprecation warning for now on all attrs
* a deprecation error later on all attrs
* an AttributeError even later (do not make it a data-only descriptor)

A bonus advantage of properties is that they would reduce clutter on
the module __dict__.
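
A very rough sketch of the getter/setter idea, using a ModuleType
subclass for a single attr (__loader__).  This is only an illustration
under my own assumptions (the class name is made up), not a proposed
implementation; the real change would presumably live on the module
type itself.

```python
import importlib.machinery
import types
import warnings


class SpecProxyModule(types.ModuleType):
    """Hypothetical module type whose __loader__ proxies __spec__.loader."""

    @property
    def __loader__(self):
        # Getter: the spec is the single source of truth.
        return self.__spec__.loader

    @__loader__.setter
    def __loader__(self, value):
        # Setter: still proxies to the spec, but warns (the "for now" stage).
        warnings.warn(
            "setting __loader__ is deprecated; set __spec__.loader instead",
            DeprecationWarning,
            stacklevel=2,
        )
        self.__spec__.loader = value


spec = importlib.machinery.ModuleSpec("demo_proxy", loader=None)
mod = SpecProxyModule("demo_proxy")
mod.__spec__ = spec

spec.loader = "loader-set-on-spec"  # placeholder value, not a real loader
print(mod.__loader__)               # reads through to the spec

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    mod.__loader__ = "loader-set-on-module"
print(spec.loader, [str(w.message) for w in caught])
```

Because a property is a data descriptor on the type, it takes
precedence over any same-named entry in the module's __dict__.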

What about __path__?  We'll probably keep it as a traditional
indicator that the module is a package.  However, do we make it a
read-only proxy of spec.submodule_search_locations?  (We already use a
__path__ proxy for namespace packages.)

6. it probably isn't worth it.

Due to the extra indirection, modifying the spec seems like a more
deliberate (non-accidental or confused) action than changing the
module attrs.  That's probably enough "help".  However, in cases where
multiple attrs together have specific meaning, such helpers might be
helpful for users.

====

Regarding __file__ being different from spec.origin, it might be worth
revisiting the question of "origin" vs. "location" on the spec. Note
that, in the case of frozen stdlib modules, spec.has_location is False
even though __file__ is set.  That smells fishy to me.
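
To poke at this, something like the following prints the relevant
values for a few modules.  The helper name location_info() is made up,
and which stdlib modules are frozen (os is one on recent CPython
versions) depends on the Python version and build, so I'm not listing
expected output.

```python
import importlib


def location_info(name):
    """Collect the spec's location data next to the module's __file__."""
    module = importlib.import_module(name)
    spec = module.__spec__
    return {
        "origin": spec.origin,
        "has_location": spec.has_location,
        "file": getattr(module, "__file__", None),
    }


# json: ordinary source module; os: possibly frozen; sys: built-in.
for name in ("json", "os", "sys"):
    print(name, location_info(name))
```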
History
Date                 User       Action  Args
2021-10-21 16:52:05  eric.snow  set     recipients: + eric.snow, barry, brett.cannon
2021-10-21 16:52:05  eric.snow  link    issue45540 messages
2021-10-21 16:52:05  eric.snow  create