broken pyc files #41838

Closed
arigo mannequin opened this issue Apr 10, 2005 · 13 comments
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement

Comments

@arigo
Mannequin

arigo mannequin commented Apr 10, 2005

BPO 1180193
Nosy @loewis, @arigo, @pitrou
Files
  • update_co_filename.diff: patch against trunk revision 54933
  • update_co_filename.diff
  • update_co_filename.diff
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = <Date 2009-01-06.19:17:28.341>
    created_at = <Date 2005-04-10.13:10:52.000>
    labels = ['interpreter-core', 'type-feature']
    title = 'broken pyc files'
    updated_at = <Date 2009-01-06.19:17:28.340>
    user = 'https://github.com/arigo'

    bugs.python.org fields:

    activity = <Date 2009-01-06.19:17:28.340>
    actor = 'pitrou'
    assignee = 'none'
    closed = True
    closed_date = <Date 2009-01-06.19:17:28.341>
    closer = 'pitrou'
    components = ['Interpreter Core']
    creation = <Date 2005-04-10.13:10:52.000>
    creator = 'arigo'
    dependencies = []
    files = ['1674', '12598', '12599']
    hgrepos = []
    issue_num = 1180193
    keywords = ['patch']
    message_count = 13.0
    messages = ['24985', '24986', '24987', '24988', '24989', '24990', '24991', '24992', '61587', '79167', '79173', '79175', '79279']
    nosy_count = 5.0
    nosy_names = ['loewis', 'arigo', 'exarkun', 'zseil', 'pitrou']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = None
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue1180193'
    versions = ['Python 3.1', 'Python 2.7']

    @arigo
    Mannequin Author

    arigo mannequin commented Apr 10, 2005

    In a number of situations, .pyc files can become "corrupted" in a subtle way: the co_filename attribute of the code objects they contain becomes wrong. This can occur if we move or rename directories, or if we access the same set of files from two different locations (e.g. over NFS).

    This corruption doesn't prevent the .pyc files from working, but the interpreter loses the reference to the source file. It causes trouble in tracebacks, in the inspect module, etc.

    A simple fix would be to use the following logic when importing a .py file: if there is a corresponding .pyc file, in addition to checking the timestamp, check the co_filename attribute of the loaded object. If it doesn't point to the original .py file, discard the code object and ignore the .pyc file.

    Alternatively, we could force all co_filenames to point to the .py file when loading the .pyc file.

    I'll write a patch for whichever alternative seems better.
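
A minimal sketch of the first check described above, assuming a recent CPython where a .pyc file starts with a 16-byte header followed by the marshalled code object (the header size is an assumption tied to CPython 3.7+, and the helper names `stored_filename`, `pyc_matches_source` and `demo` are hypothetical). It reads the .pyc directly instead of importing it, because the import system in current releases may already adjust co_filename.

```python
import marshal
import os
import py_compile
import shutil
import tempfile


def stored_filename(pyc_path):
    """Return the co_filename recorded in a .pyc file's top-level code object."""
    with open(pyc_path, "rb") as f:
        f.read(16)  # skip the header; 16 bytes is assumed (CPython 3.7+)
        code = marshal.load(f)
    return code.co_filename


def pyc_matches_source(pyc_path, py_path):
    """True if the .pyc still names py_path as its source file."""
    return os.path.abspath(stored_filename(pyc_path)) == os.path.abspath(py_path)


def demo():
    """Compile a module, rename its directory, and compare before and after."""
    root = tempfile.mkdtemp()
    try:
        old_dir = os.path.join(root, "old")
        new_dir = os.path.join(root, "new")
        os.mkdir(old_dir)
        src = os.path.join(old_dir, "mod.py")
        with open(src, "w") as f:
            f.write("def f():\n    return 1\n")
        pyc = py_compile.compile(src, cfile=os.path.join(old_dir, "mod.pyc"))
        before = pyc_matches_source(pyc, src)
        # Renaming the directory leaves the old path baked into the .pyc.
        os.rename(old_dir, new_dir)
        after = pyc_matches_source(
            os.path.join(new_dir, "mod.pyc"), os.path.join(new_dir, "mod.py")
        )
        return before, after
    finally:
        shutil.rmtree(root)
```

Calling `demo()` compares the stored path with the real one before and after the rename; an importer following the first proposal would discard the .pyc whenever `pyc_matches_source` is false.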

    @arigo arigo mannequin added interpreter-core (Objects, Python, Grammar, and Parser dirs) labels Apr 10, 2005
    @loewis
    Mannequin

    loewis mannequin commented Mar 28, 2007

    I fail to see the corruption. It is quite desirable and normal to only ship pyc files - that the file name they refer to is actually present is not a requirement at all.

    @arigo
    Mannequin Author

    arigo mannequin commented Mar 28, 2007

    What I called "corruption" is the situation
    where both the .py and the .pyc files are
    present, but the filename stored in the .pyc
    co_filenames is no longer the valid absolute
    path of the corresponding .py file, for any
    reason (renaming, NFS views, etc.).

    This situation causes the tracebacks and the
    inspect module to fail to locate the .py file,
    which I consider a bug.

    @zseil
    Mannequin

    zseil mannequin commented Apr 3, 2007

    This problem is reported quite often in the tracker,
    although it shows up in different places:

    http://www.python.org/sf/1666807
    http://www.python.org/sf/1051638

    I closed those bugs as duplicates of this one.

    The logging package is also affected:

    http://www.python.org/sf/1669498
    http://www.python.org/sf/1633605
    http://www.python.org/sf/1616422

    @arigo
    Mannequin Author

    arigo mannequin commented Apr 3, 2007

    If you ask me, I think that when the import
    system finds both a .py and a .pyc for a module,
    then it should ignore all co_filenames and replace
    them with the real path of the .py file. I can't
    see any point in not doing so.

    There are many other quirks caused by .pyc files
    accidentally remaining around, but we cannot fix them
    all as long as the .pyc files are at the same time
    a cache for performance reasons and a redistributable
    program format (e.g. if "rm x.py" or "svn up" deletes
    a .py file, then the module is still importable via
    the .pyc left behind, a great way to overlook the fact
    that imports elsewhere in the project need to be
    updated).

    @zseil
    Mannequin

    zseil mannequin commented Apr 3, 2007

    Wouldn't your first solution be simpler? Changing all
    co_filenames would require either changing various
    marshal.c functions, or traversing the code object
    returned by import.c/read_compiled_module().

    Discarding the compiled code when the file names don't
    match would be simpler and only require minor changes
    in import.c/load_source_module().
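
The traversal option can be sketched in pure Python, as an illustration only and not the C patch attached to this issue. It assumes Python 3.8+ for `CodeType.replace`, and the helper name `with_filename` is hypothetical. Nested functions, classes and comprehensions live as code objects inside `co_consts`, so the rewrite has to recurse into them. This sketch rewrites every nested code object unconditionally.

```python
import types


def with_filename(code, new_filename):
    """Return a copy of `code` in which co_filename, and the co_filename of
    every code object nested in co_consts, is set to new_filename."""
    new_consts = tuple(
        with_filename(const, new_filename)
        if isinstance(const, types.CodeType)
        else const
        for const in code.co_consts
    )
    # Code objects are immutable, so build a modified copy.
    return code.replace(co_filename=new_filename, co_consts=new_consts)


# Usage: a module with a nested function, compiled under a stale path.
stale = compile(
    "def f():\n    def g():\n        pass\n    return g\n",
    "/old/path/mod.py",
    "exec",
)
fixed = with_filename(stale, "/new/path/mod.py")
```

Nothing is written back to disk here: only the in-memory code object changes.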

    @zseil
    Copy link
    Mannequin

    zseil mannequin commented Apr 24, 2007

    Here is a patch that implements arigo's last suggestion.

    File Added: update_co_filename.diff

    @arigo
    Mannequin Author

    arigo mannequin commented May 2, 2007

    It's an obscure detail, but I think that the
    .pyc file should not be rewritten again after we
    fix the co_filenames. Fixing the co_filenames
    is a very very cheap operation, and I can imagine
    cases where the same .py files are accessed from
    what appears to be two different paths, e.g. over
    NFS - this would cause .pyc files to be rewritten
    all the time, which is particularly bad if we
    have the example of NFS in mind. Not to mention
    that two python processes trying to write
    *different* data to the same .pyc file at the
    same time are going to create a mess, ending in
    a segfault the next time the broken .pyc is
    loaded.

    It's overall a mess, so let's play it safe.

    @tiran tiran added type-feature A feature request or enhancement labels Jan 5, 2008
    @pitrou
    Member

    pitrou commented Jan 23, 2008

    If code objects grew a __module__ attribute (which functions already
    have), wouldn't it be just a matter of falling back on
    sys.modules[my_code_object.__module__].__file__ when
    my_code_object.co_filename points to a non-existent file?
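
A rough sketch of that fallback, with one loud assumption: code objects have no `__module__` attribute, so this version starts from a function object, which does. The helper name `source_file_for` is hypothetical.

```python
import os
import sys


def source_file_for(func):
    """Best-effort source path for a function.

    Prefer co_filename when it names an existing file; otherwise fall back
    to the __file__ of the module the function says it belongs to.
    """
    filename = func.__code__.co_filename
    if os.path.exists(filename):
        return filename
    module = sys.modules.get(func.__module__)
    # If the module is missing or has no __file__, return the stale name.
    return getattr(module, "__file__", None) or filename
```

The sketch still has to check whether co_filename exists on disk before it can decide to fall back.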

    @exarkun
    Mannequin

    exarkun mannequin commented Jan 5, 2009

    This is causing problems for me as well. The attached patch no longer
    applies cleanly to trunk. I've attached an updated version which
    addresses the conflicts. The new behavior fixes the issues I have with
    the current behavior. It'd be great to have it applied.

    > If code objects grew a __module__ attribute (which functions already
    > have), wouldn't it be just a matter of falling back on
    > sys.modules[my_code_object.__module__].__file__ when
    > my_code_object.co_filename points to a non-existent file?

    It'd be nice if it wasn't necessary to check to see if co_filename
    referred to an existing file. Can we have a solution which creates one
    definitive, correct way to determine the source file?

    @pitrou
    Member

    pitrou commented Jan 5, 2009

    As Armin said, I think it's safer and simpler not to rewrite the pyc
    file when the filenames have been changed.
    (If you think changing the filenames can have a significant performance
    impact, you may want to benchmark it.)

    @exarkun
    Mannequin

    exarkun mannequin commented Jan 5, 2009

    New version of the patch which doesn't rewrite pyc files attached.

    @pitrou
    Member

    pitrou commented Jan 6, 2009

    Committed to trunk and py3k, and backported to 2.6 and 3.0. Thanks!

    @pitrou pitrou closed this as completed Jan 6, 2009
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 9, 2022