Add __file__ attribute to frozen modules #65935

Open
malemburg opened this issue Jun 12, 2014 · 13 comments
Labels
3.11 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) stdlib Python modules in the Lib dir topic-importlib type-bug An unexpected behavior, bug, or error

Comments

@malemburg
Member

BPO 21736
Nosy @malemburg, @gvanrossum, @warsaw, @gpshead, @ncoghlan, @ericsnowcurrently
PRs
  • bpo-45020: Freeze some of the modules imported during startup. #28335
  • bpo-45020: Identify which frozen modules are actually aliases. #28655
  • bpo-21736: Set __file__ on frozen stdlib modules. #28656
Files
  • file-for-frozen-modules.patch
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2014-06-12.16:14:48.985>
    labels = ['interpreter-core', 'type-bug', 'library', '3.11']
    title = 'Add __file__ attribute to frozen modules'
    updated_at = <Date 2021-10-28.19:08:33.132>
    user = 'https://github.com/malemburg'

    bugs.python.org fields:

    activity = <Date 2021-10-28.19:08:33.132>
    actor = 'eric.snow'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Interpreter Core', 'Library (Lib)']
    creation = <Date 2014-06-12.16:14:48.985>
    creator = 'lemburg'
    dependencies = []
    files = ['35595']
    hgrepos = []
    issue_num = 21736
    keywords = ['patch']
    message_count = 13.0
    messages = ['220362', '220365', '220367', '220368', '220586', '220598', '220623', '401906', '403953', '403954', '404168', '404228', '405228']
    nosy_count = 6.0
    nosy_names = ['lemburg', 'gvanrossum', 'barry', 'gregory.p.smith', 'ncoghlan', 'eric.snow']
    pr_nums = ['28335', '28655', '28656']
    priority = 'normal'
    resolution = None
    stage = 'needs patch'
    status = 'open'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue21736'
    versions = ['Python 3.11']

    @malemburg
    Member Author

    The missing __file__ attribute on frozen modules causes lots of issues with the stdlib (see e.g. bpo-21709 and the stdlib test suite) and other tools that expect this attribute to always be present.

    The attached patch for 3.4.1 adds this attribute to all frozen modules and resolves most issues. It cannot resolve the problem of not necessarily finding the directories/files listed in those attributes, but at least makes it possible to continue using code that only uses the attribute for error reporting.

    @malemburg malemburg added interpreter-core (Objects, Python, Grammar, and Parser dirs) stdlib Python modules in the Lib dir labels Jun 12, 2014
    @warsaw
    Member

    warsaw commented Jun 12, 2014

    I'm -0 on this patch. I can understand that in some sense, frozen modules do semantically have an associated file, but OTOH, once they're frozen the connection to their file is broken. Also, I think anything that assumes __file__ exists is simply broken and should be fixed. There are other cases than frozen modules where a module would have no reasonable value for __file__ and thus shouldn't have one.
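For code that only needs the path for error reporting, a defensive lookup is a small change. A minimal sketch follows; the helper name `module_location` and the `"<unknown location>"` placeholder are hypothetical and appear neither in the stdlib nor in the attached patch:

```python
import sys
import types


def module_location(module: types.ModuleType) -> str:
    """Return a printable location for a module without assuming __file__."""
    # __file__ may be missing entirely (e.g. built-in modules) or set to None.
    path = getattr(module, "__file__", None)
    if path:
        return path
    # Fall back to the spec's origin (e.g. "built-in" or "frozen") if present.
    spec = getattr(module, "__spec__", None)
    origin = getattr(spec, "origin", None)
    return origin if origin else "<unknown location>"


if __name__ == "__main__":
    print(module_location(sys))    # built-in module, so no __file__
    print(module_location(types))  # usually a filesystem path
```

Code written this way keeps working whether or not a given module carries `__file__`.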

    @malemburg
    Member Author

    On 12.06.2014 18:35, Barry A. Warsaw wrote:

    > I'm -0 on this patch. I can understand that in some sense, frozen modules do semantically have an associated file, but OTOH, once they're frozen the connection to their file is broken. Also, I think anything that assumes __file__ exists is simply broken and should be fixed. There are other cases than frozen modules where a module would have no reasonable value for __file__ and thus shouldn't have one.

    This one falls into the "practicality beats purity" category. Of
    course, the __file__ attribute doesn't always make sense as a
    file path, but it does serve an informational purpose.

    We're doing this in eGenix PyRun to get 3rd party code working
    (including parts of the Python stdlib :-)). Not doing so
    would simply render the whole freezing approach pretty much
    useless, since so much code uses the attribute without checking
    or providing a fallback solution.

    @warsaw
    Member

    warsaw commented Jun 12, 2014

    PBP might be reasonably used to justify it for the frozen case. I just don't want to use that as a wedge to define __file__ in *all* cases, even when no reasonable file name exists.

    @ericsnowcurrently
    Member

    __file__ is the filename from which the module *was* loaded (the inspect doc [1] should probably reflect that [2]). The import machinery only uses the module's __spec__ (origin, etc.). __file__ is set strictly as informational (and for backward-compatibility).

    Per the language reference [3], __file__ may be omitted when it does not have semantic meaning. It also says "Ultimately, the loader is what makes use of __file__", but that hasn't been accurate since PEP-451 landed. [4] Notably, though, for now the module __repr__() *does* use __file__ if it is available (and the loader doesn't implement module_repr). The counterpart of __file__ within a module's spec is __spec__.origin. The two should stay in sync. In the case of frozen modules origin is set to "frozen".
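To see how `__file__` and `__spec__.origin` line up in practice, here is a minimal sketch. The helper name `file_and_origin` is hypothetical, and the values for any particular module depend on the Python version and on how it was built:

```python
import json
import sys


def file_and_origin(module):
    """Return (__file__, __spec__.origin) for a module, with None for gaps."""
    spec = getattr(module, "__spec__", None)
    return getattr(module, "__file__", None), getattr(spec, "origin", None)


if __name__ == "__main__":
    for mod in (sys, json):
        file_attr, origin = file_and_origin(mod)
        print(f"{mod.__name__}: __file__={file_attr!r} origin={origin!r}")
```

For a module loaded from a file, the two values are normally the same path. For the built-in `sys` there is no `__file__`, and the origin is only a label.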

    Giving __file__ to frozen modules is inaccurate. The file probably won't be there and the module certainly wasn't loaded from that location.

    Stdlib modules should not rely on all modules having __file__. Removing __file__ from frozen modules was a change in 3.4 and I'd consider it a 3.4 bug if any stdlib module relied on it. For that matter, I'd consider it a bug if a module relied on all modules having __file__ or even __file__ being set to an actual filename.

    Would it be inappropriate to set an additional informational attribute on frozen modules to indicate the original path? Something like __frozen_filename__. Then you wouldn't need to rely on __code__.co_filename.

    p.s. Searching for __file__ in the docs [5] illustrates its prevalence.

    [1] https://docs.python.org/3/library/inspect.html#types-and-members
    [2] issue bpo-21760
    [3] https://docs.python.org/3/reference/import.html#__file__
    [4] issue bpo-21761
    [5] https://docs.python.org/3/search.html?q=__file__

    @ncoghlan
    Contributor

    Can we just drop "__file__" and set the origin for frozen modules to
    something that includes the original file name?

    @malemburg
    Member Author

    On 15.06.2014 01:53, Nick Coghlan wrote:

    > Can we just drop "__file__" and set the origin for frozen modules to
    > something that includes the original file name?

    This wouldn't really help, because too much code out there uses
    the __file__ attribute and assumes it's always available.

    Note that the filename information is already available in the
    code object's co_filename attribute.

    @gvanrossum
    Member

    > Note that the filename information is already available in the
    > code object's co_filename attribute.

    Hm, in the latest (3.11) main that's not true -- co_filename is set to something like "<frozen ntpath>". And that's about right, since the filename contained in the frozen data can at best reflect the filename at the time the freeze script ran, which is not necessarily the same as the filename when the user runs the Python binary.
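One way to check what `co_filename` holds on a given interpreter is to ask the module's loader for its code object. This is a sketch under assumptions: the helper name `code_filename` is hypothetical, and it relies only on `importlib.util.find_spec()` plus the optional `get_code()` loader method, which not every loader provides:

```python
import importlib.util


def code_filename(module_name):
    """Return co_filename of a module's code object, or None if unavailable."""
    spec = importlib.util.find_spec(module_name)
    if spec is None or spec.loader is None:
        return None
    get_code = getattr(spec.loader, "get_code", None)
    if get_code is None:
        return None
    # Loaders return None when there is no code object (e.g. built-ins).
    code = get_code(module_name)
    return None if code is None else code.co_filename


if __name__ == "__main__":
    for name in ("json", "sys", "_frozen_importlib"):
        print(name, "->", code_filename(name))
```

For frozen modules the result is expected to be a placeholder of the `<frozen ...>` form mentioned above rather than a usable path, though the exact string depends on the build.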

    I don't think we should try to set __file__ or any other attributes on frozen modules (at least not for the small set of frozen modules we're contemplating for 3.11). Users who need the file can run their Python 3.11 binary with -Xfrozen_modules=off, to disable the use of frozen modules (other than the three importlib bootstrap files).
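The effect of that option can be observed from a child interpreter. A minimal sketch, assuming the helper name `os_origin` (hypothetical) and that `os` is among the frozen stdlib modules on the interpreter under test; CPython versions that predate the option store unknown `-X` values without acting on them:

```python
import subprocess
import sys


def os_origin(frozen_modules=None):
    """Start a child interpreter and report os.__spec__.origin.

    frozen_modules: None (interpreter default), "on", or "off".
    """
    cmd = [sys.executable]
    if frozen_modules is not None:
        cmd += ["-X", f"frozen_modules={frozen_modules}"]
    cmd += ["-c", "import os; print(os.__spec__.origin)"]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout.strip()


if __name__ == "__main__":
    print("default:", os_origin())
    print("off:    ", os_origin("off"))
```

With frozen modules disabled, `os` should come from its source file, so the reported origin is a path rather than the `"frozen"` label.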

    Tools that freeze the entire stdlib or 3rd party code may have to make a different choice, but that's not our problem (yet).

    @ericsnowcurrently
    Member

    New changeset 79cf20e by Eric Snow in branch 'main':
    bpo-21736: Set __file__ on frozen stdlib modules. (gh-28656)

    @ericsnowcurrently
    Member

    I've merged the change to add __file__ for frozen stdlib modules (when possible) and to set __path__ appropriately. There are still a number of things to address though:

    • set co_filename (for tracebacks)
    • implement FrozenImporter.get_filename()
    • implement FrozenImporter.get_source()

    And then there's the question of supporting __file__, etc. for custom frozen modules. Presumably all those things are still covered by this issue so I'm leaving it open.
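Whether a given interpreter already has the loader methods listed above can be probed without importing anything frozen. A minimal sketch; the names `OPTIONAL_METHODS` and `loader_methods` are hypothetical, and the check only tests that an attribute exists and is callable, not that it returns anything useful:

```python
import importlib.machinery

# Optional loader methods relevant to the remaining items.
OPTIONAL_METHODS = ("get_filename", "get_source", "get_code")


def loader_methods(loader):
    """Map each optional loader method name to whether the loader has it."""
    return {
        name: callable(getattr(loader, name, None))
        for name in OPTIONAL_METHODS
    }


if __name__ == "__main__":
    for loader in (importlib.machinery.FrozenImporter,
                   importlib.machinery.SourceFileLoader):
        print(loader.__name__, loader_methods(loader))
```

Comparing the two rows shows which parts of the file-based loader interface the frozen importer still lacks on that version.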

    @ericsnowcurrently ericsnowcurrently added the 3.11 only security fixes label Oct 14, 2021
    @gpshead
    Member

    gpshead commented Oct 18, 2021

    That appears to have caused https://bugs.python.org/issue45506

    @warsaw
    Member

    warsaw commented Oct 18, 2021

    Weird. PR 28655 is merged on GH, but still shows open on this bpo ticket.

    @ericsnowcurrently
    Member

    I've opened the following issues related to frozen stdlib modules:

    https://bugs.python.org/issue45652
    https://bugs.python.org/issue45658
    https://bugs.python.org/issue45659

    Again, I'm leaving this issue open to deal with the broader question of frozen modules outside the stdlib, where the solution isn't so obvious. The question of targeting the FileLoader (or SourceLoader) ABC should probably be handled separately. (Note that bpo-45020 has some related discussion.)


    7 participants