classification
Title: Stacktrace shows lines from updated file on disk, not code actually running
Type: enhancement Stage:
Components: Interpreter Core Versions: Python 3.8
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Steve Pryde, jfine2358, scoder, steven.daprano, xtreak
Priority: normal Keywords:

Created on 2019-01-30 05:03 by Steve Pryde, last changed 2019-01-31 06:16 by steven.daprano.

Messages (11)
msg334544 - (view) Author: Steve Pryde (Steve Pryde) Date: 2019-01-30 05:03
When python prints or returns a stacktrace, it displays the appropriate line from the *current file on disk*, which may have changed since the program was run. It should instead show the lines from the file as it was when the code was executed.

Steps to reproduce:
1. Save the following code to a file and run it in a terminal.

    import time 

    time.sleep(600)

2. While it is running, insert a line before "time.sleep(600)", type some random characters, and save the file. The "time.sleep" should now be on line 4, and some random characters on line 3.

3. Now go back to the terminal and press Ctrl+C (to generate a stacktrace).


Expected output:

^CTraceback (most recent call last):
  File "py1.py", line 3, in <module>
    time.sleep(600)
KeyboardInterrupt


Actual output:

^CTraceback (most recent call last):
  File "py1.py", line 3, in <module>
    some random characters
KeyboardInterrupt
msg334550 - (view) Author: Steven D'Aprano (steven.daprano) * (Python committer) Date: 2019-01-30 09:34
> It should instead show the lines from the file as it was when the code was executed.

How is Python supposed to do that without making a copy of every module and script it runs just in case it gets modified?

(That's not a rhetorical question -- if you can think of a cheap way to implement this, I'm listening :-)

99.99% of the time this would be a total waste of time, so this would be a very expensive exercise for very little gain. Python's startup time is already too slow without having to also make potentially hundreds of file copies every time you run a script.

For the record, I too once ran into this issue. It left me utterly confused for an hour or so until I worked out what was happening, and then I never made the mistake of editing a running script again.
msg334551 - (view) Author: Steve Pryde (Steve Pryde) Date: 2019-01-30 10:46
> How is Python supposed to do that without making a copy of every module and script it runs just in case it gets modified?

Aha, I suspected this might be the reason. Feel free to close this issue if there's nothing else to be said here.

Python certainly only executes the code as it was at the time of initial execution, or rather it executes the generated bytecode. I gather therefore that python has enough info to display the files and line numbers but not their contents (which must be loaded fresh from disk).

As mentioned, if this is the case, I'm happy for this to be closed. Cheers.
msg334552 - (view) Author: Steven D'Aprano (steven.daprano) * (Python committer) Date: 2019-01-30 11:01
There may be something we can do to improve the error reporting and make 
it less perplexing:

https://mail.python.org/pipermail/python-ideas/2019-January/055041.html
msg334553 - (view) Author: Jonathan Fine (jfine2358) * Date: 2019-01-30 11:36
The problem, as I understand it, is a mismatch between the code object being executed and the file on disk referred to by the code object. When a module is reloaded it is first recompiled, if the .py file is newer than the .pyc file. (I've tested this at a console.)

Suppose wibble.py contains a function fn. Now do
   import wibble
   fn = wibble.fn
   # Modify and save wibble.py
   reload(wibble)
   fn()

It seems to me that
1) We have a mismatch between fn (in module __main__) and the file on disk.
2) Comparison will show that wibble.pyc is later than wibble.py.
3) There's no reliable way to discover that fn is not the current fn  ...
4) ... other than comparing its bytecode with that of the current value of wibble.fn.

Regarding (4) there might be another method. But I can't think of one that's reliable.
msg334554 - (view) Author: Karthikeyan Singaravelan (xtreak) * (Python committer) Date: 2019-01-30 11:37
See also issue8087 that has a similar case with interactive shell where the imported module was changed on disk. It has some discussion along similar lines with some more possible cases msg145291.
msg334555 - (view) Author: Karthikeyan Singaravelan (xtreak) * (Python committer) Date: 2019-01-30 11:42
Looking more into the issue8087 there is issue31300 and issue23594 that are linked as duplicates with issue31300 very similar to this report. Maybe close this to continue discussion on issue8087?
msg334579 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2019-01-30 17:35
I think the REPL could, when it formats a stack trace for printing, check every referenced source file if it's newer than its compiled .pyc (bytecode) file, and insert a warning into the stack trace if that is the case.

I don't see any use in doing this for all stack traces, so only ones that get printed out for the user could receive special treatment.

I also don't think we need to go further than that, e.g. check startup or module import time. Basically, whenever the source file is not in sync with the .pyc file, it's not unlikely that the code that is running corresponds to the .pyc file and no longer to the .py file.

Changing to target version to 3.8, since this is essentially a new feature and not acceptable as a bug fix for older versions.
msg334589 - (view) Author: Jonathan Fine (jfine2358) * Date: 2019-01-30 19:13
For information - all taken from docs and Lib/*.py

https://docs.python.org/3.7/library/traceback.html
traceback — Print or retrieve a stack traceback
Source code: Lib/traceback.py
===
This module provides a standard interface to extract, format and print stack traces of Python programs. It exactly mimics the behavior of the Python interpreter when it prints a stack trace. This is useful when you want to print stack traces under program control, such as in a “wrapper” around the interpreter.
===

https://github.com/python/cpython/blob/3.7/Lib/traceback.py#L344-L359
===
        for f, lineno in frame_gen:
            co = f.f_code
            filename = co.co_filename
            name = co.co_name

            fnames.add(filename)
            linecache.lazycache(filename, f.f_globals)
            # Must defer line lookups until we have called checkcache.
            if capture_locals:
                f_locals = f.f_locals
            else:
                f_locals = None
            result.append(FrameSummary(
                filename, lineno, name, lookup_line=False, locals=f_locals))
        for filename in fnames:
            linecache.checkcache(filename)
===
By the way, here fnames is a set.

https://docs.python.org/3.7/library/linecache.html#module-linecache
linecache — Random access to text lines
===
The linecache module allows one to get any line from a Python source file, while attempting to optimize internally, using a cache, the common case where many lines are read from a single file. This is used by the traceback module to retrieve source lines for inclusion in the formatted traceback.
===

===
linecache.checkcache(filename=None)
Check the cache for validity. Use this function if files in the cache may have changed on disk, and you require the updated version. If filename is omitted, it will check all the entries in the cache.

linecache.lazycache(filename, module_globals)
Capture enough detail about a non-file-based module to permit getting its lines later via getline() even if module_globals is None in the later call. This avoids doing I/O until a line is actually needed, without having to carry the module globals around indefinitely.
===
msg334590 - (view) Author: Jonathan Fine (jfine2358) * Date: 2019-01-30 19:14
For information - all taken from docs and Lib/*.py

https://docs.python.org/3.7/library/traceback.html
traceback -- Print or retrieve a stack traceback
Source code: Lib/traceback.py
===
This module provides a standard interface to extract, format and print stack traces of Python programs. It exactly mimics the behavior of the Python interpreter when it prints a stack trace. This is useful when you want to print stack traces under program control, such as in a “wrapper” around the interpreter.
===

https://github.com/python/cpython/blob/3.7/Lib/traceback.py#L344-L359
===
        for f, lineno in frame_gen:
            co = f.f_code
            filename = co.co_filename
            name = co.co_name

            fnames.add(filename)
            linecache.lazycache(filename, f.f_globals)
            # Must defer line lookups until we have called checkcache.
            if capture_locals:
                f_locals = f.f_locals
            else:
                f_locals = None
            result.append(FrameSummary(
                filename, lineno, name, lookup_line=False, locals=f_locals))
        for filename in fnames:
            linecache.checkcache(filename)
===
By the way, here fnames is a set.

https://docs.python.org/3.7/library/linecache.html#module-linecache
linecache -- Random access to text lines
===
The linecache module allows one to get any line from a Python source file, while attempting to optimize internally, using a cache, the common case where many lines are read from a single file. This is used by the traceback module to retrieve source lines for inclusion in the formatted traceback.
===

===
linecache.checkcache(filename=None)
Check the cache for validity. Use this function if files in the cache may have changed on disk, and you require the updated version. If filename is omitted, it will check all the entries in the cache.

linecache.lazycache(filename, module_globals)
Capture enough detail about a non-file-based module to permit getting its lines later via getline() even if module_globals is None in the later call. This avoids doing I/O until a line is actually needed, without having to carry the module globals around indefinitely.
===
msg334608 - (view) Author: Steven D'Aprano (steven.daprano) * (Python committer) Date: 2019-01-31 06:16
> For information - all taken from docs and Lib/*.py

I'm sorry Jonathon, I don't see how they are relevant or interesting to 
the topic in hand other than "they're used to print stack traces". Okay, 
they're used to print stack traces. And...? Can you explain further why 
they are of interest?
History
Date User Action Args
2019-01-31 06:16:57steven.dapranosetmessages: + msg334608
2019-01-30 19:14:25jfine2358setmessages: + msg334590
2019-01-30 19:13:40jfine2358setmessages: + msg334589
2019-01-30 17:35:49scodersetversions: + Python 3.8, - Python 3.7
nosy: + scoder

messages: + msg334579

type: behavior -> enhancement
2019-01-30 11:42:45xtreaksetmessages: + msg334555
2019-01-30 11:37:18xtreaksetnosy: + xtreak
messages: + msg334554
2019-01-30 11:36:16jfine2358setnosy: + jfine2358
messages: + msg334553
2019-01-30 11:01:04steven.dapranosetmessages: + msg334552
2019-01-30 10:46:47Steve Prydesetmessages: + msg334551
2019-01-30 09:34:19steven.dapranosetnosy: + steven.daprano
messages: + msg334550
2019-01-30 05:03:26Steve Prydecreate