classification
Title: Unupdated source file in traceback
Type: enhancement Stage:
Components: Interpreter Core Versions:
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Aigars Mahinovs, Jim.Jewett, chortos, dmascialino, eric.snow, ezio.melotti, jjconti, loewis, merwok, ncoghlan, rbcollins, ubershmekel
Priority: normal Keywords: patch

Created on 2010-03-07 19:40 by dmascialino, last changed 2017-08-29 14:03 by Aigars Mahinovs.

Files
File name Uploaded Description Edit
issue8087.patch dmascialino, 2011-10-06 01:50 review
Messages (16)
msg100600 - (view) Author: Diego Mascialino (dmascialino) Date: 2010-03-07 19:40
Example:

---------------- mod.py ----------------
def f():
    a,b,c = 1,2
    print b
----------------------------------------

If i do:

>>> import mod
>>> mod.f()

I get:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "mod.py", line 2, in f
    a,b,c = 1,2
ValueError: need more than 2 values to unpack


If i fix the source:

---------------- mod.py ----------------
def f():
    a,b,c = 1,2,3
    print b
----------------------------------------

And do:

>>> mod.f()

I get:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "mod.py", line 2, in f
    a,b,c = 1,2,3
ValueError: need more than 2 values to unpack

The problem is that the source shown is updated, but the executed code is old, because it wasn't reloaded.

Feature request:

If the source code shown was modified after import time and it wasn't reloaded, a warning message should be shown.

Example:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "mod.py", line 2, in f      WARNING: Modified after import!
    a,b,c = 1,2,3
ValueError: need more than 2 values to unpack

or something like that. Maybe "use reload()" might appear.
msg100608 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2010-03-07 20:53
Displaying a warning whenever the code has changed on disk is clearly unacceptable - it is both expensive to check (and also, how frequent would you check?), and it would be annoying if you have long-running Python applications so this would just clutter the log files.

-1 on this request; users should just learn this aspect of Python. Also, it is not possible to get this perfect for many reasons, e.g. when you change a class, it will be impossible to update the code of existing instances even with reload.
msg100628 - (view) Author: Diego Mascialino (dmascialino) Date: 2010-03-08 05:28
Martin, I am not talking about a loop checking source file changes.

I think the check could be done only when the traceback is printed.

The function PyErr_Display() calls PyTracBack_Print() and so on until _Py_DisplaySourceLine() is called. The latter reads the source file and prints the line. I don't know which is the best way to know if the source file was updated. But i think that it's possible to check it there.

I think this could apply only when using the interactive console.
msg144982 - (view) Author: Diego Mascialino (dmascialino) Date: 2011-10-06 01:50
I worked a few hours today and I have this patch.
I tried to make a test but could not.

I know this is not a really good patch, but it's my first one and I wanted to show my idea.
msg145108 - (view) Author: Éric Araujo (merwok) * (Python committer) Date: 2011-10-07 17:14
I do not understand what the proposed feature is.  I also think it should be discussed on the python-ideas mailing list first: http://mail.python.org/listinfo/python-ideas
msg145141 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2011-10-07 20:25
I'm not sure this is useful to have.  If you changed your code you know that you have to reload, so why would you want a warning that tells you that you changed the code?

For some reason I always had the opposite problem (i.e. after a reload the traceback was still showing the original code, and not the new one), while IIUC you are saying that it shows the new code even if the module is not reloaded.
I tried your code and indeed it does what you say, so either I am mistaken and I've been misreading the tracebacks, or this changed from 2.6 to 2.7, or in some cases even the behavior (I think) I observed might happen.
I'll have to verify this next time it happens.
msg145167 - (view) Author: Diego Mascialino (dmascialino) Date: 2011-10-08 12:42
On Fri, Oct 7, 2011 at 5:25 PM, Ezio Melotti <report@bugs.python.org> wrote:
>
> Ezio Melotti <ezio.melotti@gmail.com> added the comment:
>
> I'm not sure this is useful to have.  If you changed your code you know that you have to reload, so why would you want a warning that tells you that you changed the code?

The source line showed in the traceback could not be the same line
executed.

Take a look to this example:

k.py:
def f():
    a,b,c = 1,2

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "k.py", line 2, in f
    a,b,c = 1,2
ValueError: need more than 2 values to unpack

k.py:
def f():
    # blah
    a,b = 1,2

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "k.py", line 2, in f
    # blah
ValueError: need more than 2 values to unpack

> For some reason I always had the opposite problem (i.e. after a reload the traceback was still showing the original code, and not the new one), while IIUC you are saying that it shows the new code even if the module is not reloaded.
> I tried your code and indeed it does what you say, so either I am mistaken and I've been misreading the tracebacks, or this changed from 2.6 to 2.7, or in some cases even the behavior (I think) I observed might happen.
> I'll have to verify this next time it happens.

That is strange, I think Python does not save the original code in any place.
msg145168 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2011-10-08 13:03
> The source line showed in the traceback could not be the same line executed.

My point is that when I see an error and modify the source to fix it, then I know that I'll have to reload.  If for some reason I forget to reload, I'll get the wrong line in the traceback and then reload, but I don't think that ever happened to me.
So to me, your warning will only be useful in the case where I modified the source, forgot to reload and got the same error again with a wrong line displayed.  Also note that reloading is not so common; usually you just restart your application and that will give you the right traceback.

Also I'm not sure the warning you proposed is the best way to handle this.
msg145173 - (view) Author: Diego Mascialino (dmascialino) Date: 2011-10-08 15:52
> So to me, your warning will only be useful in the case where I modified the source, forgot to reload and got the same error again with a wrong line displayed. Also note that reloading is not so common; usually you just restart your application and that will give you the right traceback.

I know the case when this happens is really unsusual, but the
interperter could be able to alert you than that line of the traceback
is wrong.

> Also I'm not sure the warning you proposed is the best way to handle this.

I agree that, another approach is to save a copy of the source file
associated to the compiled code (like a .pys).
msg145291 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2011-10-10 07:52
(just reviewing the idea here, not the current patch)

The problem of "stale code" (i.e. what was executed doesn't match what is displayed in the traceback) is a tricky and subtle one. There are a few different cases:

1. Source displayed does not match source on disk
- these cases do happen, but they're almost always due to straight up bugs in the linecache or traceback modules.
2. Source has been changed, but module has not been reloaded
- this is the case for "edited source file but forgot to reload module". I've certainly forgotten to do this myself, and I'm far from the only one. This is the case Diego's RFE targets, and I think it has some merit.
3. Source has been changed, module has been reloaded, but object reference is from previous version of module
- the patch doesn't detect this. There are various ways we *could* detect it, but they all involve some fairly significant changes to the way compilation and module import work.

Aspect 3 is a much deeper (and bigger) problem relating to native introspection support in function and class objects. But that doesn't make Diego's idea to improve Aspect 2 invalid - there have certainly been times when playing at the interactive prompt that such a warning would have reminded me that I needed to reload the module I was working on.
msg147142 - (view) Author: Oleg Oshmyan (chortos) Date: 2011-11-06 04:52
I just want to note that the code might be edited not only while it is running it in the interactive interpreter but also while it is running as a standalone script. In this case the script naturally would not know to reload its own code nor allow the user to do it manually, and even if it would, some objects created from the old code would probably remain alive.

Here is a simple shell script that causes Python to show a stack traceback with '# This is just a comment' as the offending line by launching it in the background and then overwriting the Python code. What happened in reality is that I launched a front-end tool for video transcoding I had written in Python and after a while tweaked the code while it was still running. When the actual encode finished, my (buggy) clean-up code raised an exception and I got a traceback saying it had been raised in a line that only had a comment in it.

cat <<EOF >proof.py
import time
time.sleep(5)
raise Exception
EOF
python proof.py &
sleep 1
cat <<EOF >proof.py


# This is just a comment
EOF
sleep 5
msg153345 - (view) Author: Jim Jewett (Jim.Jewett) Date: 2012-02-14 16:01
Martin v. Löwis (loewis) wrote:

> Displaying a warning whenever the code has changed on disk is
> clearly unacceptable

As clarified, the request is only for when a traceback is being created (or perhaps even only for when one is being printed).  

I agree that we don't want to watch every file every time any code is run, but by the time a traceback is being displayed, any tight loops are ending.

Nick Coghlan (ncoghlan) wrote:

> There are a few different cases: ...
> 2. Source has been changed, but module has not been reloaded ...
> 3. Source has been changed, module has been reloaded, but object ...

Given that a traceback is being displayed, I think it is reasonable to rerun the find-module portion of import, and verify that there is not stale byte-code.  

Frankly, I think it would be worth storing a file timestamp on modules, and verifying that whatever-would-be-imported-if-imported-now matches that timestamp.  This would also catch case (3).

I also think that -- on traceback display -- it might be worth verifying that the code's __globals__ is the __globals__ associated with the module of that name in sys.modules.  This would warn about some intentional manipulations, but would catch case (3) even more accurately.
msg237297 - (view) Author: Robert Collins (rbcollins) * (Python committer) Date: 2015-03-05 22:20
Storing a marker in module objects which can be used to validate the linecache is a good idea. timestamp isn't appropriate because of modules loaded from zips or dynamic generation. I'd suggest we make it something opaque - we get source code by asking the loader for it anyway - if we make all source code loading go through the loader, then file based loaders can use a timestamp, zip based ones can shortcircuit or use a reference to the containing zip's file metadata or whatever.
msg238174 - (view) Author: Robert Collins (rbcollins) * (Python committer) Date: 2015-03-16 02:15
Oh, it may be clear to everyone already but its perhaps worth noting: there are two ways the cache can skew.

(older source): We may have a newer file compiled and in use and the older source in the cache.

e.g. someone calls linecache.getlines(foo.py), then imports foo, and in between the contents of foo.py changed.

(newer source) Or we may have an older file compiled and in use, and newer source in the cache.

e.g. someone imports foo, the contents of foo.py change, and then someone calls linecache.getlines(foo.py).

One way to address the older source case would be to make importing (also compile()) flush the imported file out of the cache. That doesn't need any special logic or handling - just a new parameter to clearcache() to specify  a file (or files?) to evict.

The newer source case is what needs some logic, and for that, as I said earlier, I think we need something opaque. Perhaps since we have importlib now, if we limit ourselves to considering actual python modules (linecache can technically cache anything with lines in it) we can rely on the import machinery get_source() to handle this: if the source has changed get_source should return None.

That way we don't need to change the data structure of modules at all - only packages for which there is no 302 loader will be able to suffer cache skew.

As far as informing callers that this situation has occurred, I think we'll need to think a bit about that, as linecache already signals 'source not available' by returning None, which doesn't leave much room to include an error message. We could start raising exceptions up the stack - which implies modifying PEP-302 (e.g. via a new PEP) and working up the stack.
msg238175 - (view) Author: Robert Collins (rbcollins) * (Python committer) Date: 2015-03-16 02:15
Oh, meant to add - we could just call logging.warning or something.
msg300990 - (view) Author: Aigars Mahinovs (Aigars Mahinovs) Date: 2017-08-29 14:03
As a point of update - this is a problem that can make debugging more difficult for people running web services or other server processes that are expected to stay alive for a long time. It is often undesirable to force a server restart or reload every time a new code base is deployed, so you can get into situations where code running does not match the code on disk. 

If you get a crash and a traceback in such situation, you will be very confused.

As discussed already, this could be solved by taking a hash of the source code at module import time and then comparing that to the code on disk during traceback printing time - if there is a difference, then print a big fat warning that the traceback may be unreliable.

I just lost two days tracking down weird issues that in the end were caused by server failing to reload code when it should have done so. Having such warning on the first traceback would have alerted me to the problem much earlier.
History
Date User Action Args
2017-08-29 14:03:54Aigars Mahinovssetnosy: + Aigars Mahinovs
messages: + msg300990
2017-08-29 13:42:20r.david.murraylinkissue31300 superseder
2015-03-16 02:15:23rbcollinssetmessages: + msg238175
2015-03-16 02:15:03rbcollinssetmessages: + msg238174
2015-03-05 22:30:08ned.deilylinkissue23594 superseder
2015-03-05 22:20:12rbcollinssetnosy: + rbcollins
messages: + msg237297
2013-02-12 19:21:07floxlinkissue17195 superseder
2012-02-14 16:01:47Jim.Jewettsetnosy: + Jim.Jewett
messages: + msg153345
2012-02-14 13:42:22ubershmekelsetnosy: + ubershmekel
2012-02-14 13:17:51floxlinkissue14008 superseder
2011-11-06 04:52:02chortossetnosy: + chortos
messages: + msg147142
2011-10-10 18:49:13eric.snowsetnosy: + eric.snow
2011-10-10 07:52:53ncoghlansetnosy: + ncoghlan
messages: + msg145291
2011-10-08 15:52:42dmascialinosetmessages: + msg145173
2011-10-08 13:03:02ezio.melottisetmessages: + msg145168
2011-10-08 12:42:22dmascialinosetmessages: + msg145167
2011-10-07 20:25:04ezio.melottisetnosy: + ezio.melotti
messages: + msg145141
2011-10-07 17:14:43merwoksetnosy: + merwok
messages: + msg145108
2011-10-06 01:50:27dmascialinosetfiles: + issue8087.patch
keywords: + patch
messages: + msg144982

versions: - Python 2.7
2010-03-08 05:28:58dmascialinosetmessages: + msg100628
2010-03-07 20:53:10loewissetnosy: + loewis
messages: + msg100608
2010-03-07 19:40:07dmascialinocreate