classification
Title: Incorrect coverage file from trace test_pickle.py
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.2
process
Status: closed Resolution: fixed
Dependencies: 9315 Superseder:
Assigned To: Nosy List: belopolsky, berker.peksag, terry.reedy
Priority: normal Keywords: patch

Created on 2010-07-21 01:54 by belopolsky, last changed 2016-04-21 14:31 by berker.peksag. This issue is now closed.

Files
File name Uploaded Description Edit
pickle.cover belopolsky, 2010-07-21 01:54 test_pickle coverage file
traceme.py belopolsky, 2010-07-21 03:24
issue9317.diff belopolsky, 2010-07-21 04:43 review
issue9317.2.patch eli.bendersky, 2010-07-21 09:26
Messages (22)
msg110990 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2010-07-21 01:54
I am running the following command

$ python -m trace -C pickle-trace.d -c -m Lib/test/test_pickle.py

and getting attached file, pickle.cover, in the pickle-trace.d directory.  This does not look right.  From the very beginning, module level statements are marked as not executed:


>>>>>> __version__ = "$Revision: 82937 $"       # Code version
       
>>>>>> from types import FunctionType, BuiltinFunctionType
>>>>>> from copyreg import dispatch_table
>>>>>> from copyreg import _extension_registry, _inverted_registry, _extension_cache
>>>>>> import marshal
>>>>>> import sys

The method I was interested in, _instantiate, shows as if it was never executed even though if I add a failing assert, I do see test failure.

I suspect that this may have something to do with the import trickery that test_pickle does to test with and without _pickle.
msg110996 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2010-07-21 03:24
Does the trace module even work in 3.x?

I created attached trivial traceme.py file.

With python 2.7, I get 
$ python2 -m trace  -c -s traceme.py
lines   cov%   module   (path)
    1   100%   threading   (Lib/threading.py)
    6   100%   traceme   (traceme.py)

The first entry is clearly spurious, but traceme looks right.  With py3k, however, I get

$ python3 -m trace  -c -s traceme.py
lines   cov%   module   (path)
    1   100%   threading   (Lib/threading.py)

No traceme line at all.
msg110997 - (view) Author: Eli Bendersky (eli.bendersky) * (Python committer) Date: 2010-07-21 03:27
Even the following simple test file:

----------------------
import pickle

data = {'a' : [1, 2, 3], 'b': 5}
ps = pickle.dumps(data)
newdata = pickle.loads(ps)
print(newdata)
----------------------

When traced generates an empty trace file for the pickle module, so it's not test_pickle trickery, but rather something to do with the pickle module itself. Will investigate this further.
msg110998 - (view) Author: Eli Bendersky (eli.bendersky) * (Python committer) Date: 2010-07-21 03:28
I didn't hit refresh before posting so didn't see your message, Alexander. The problem indeed seems to be deeper.
msg111000 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2010-07-21 03:32
On Tue, Jul 20, 2010 at 11:28 PM, Eli Bendersky <report@bugs.python.org> wrote:
> The problem indeed seems to be deeper.
>

.. or shallower. :-)
msg111001 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2010-07-21 03:55
Here is another problem: with no file x in cwd,

$ ./python.exe -m trace  -c -f x traceme.py
Skipping counts file 'x': [Errno 2] No such file or directory: 'x'

This one seems to exist in both 2.7 and 3.x.  Creating empty counts file helps.
msg111002 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2010-07-21 04:00
Comparison of count files produced by 2.7 and 3.x suggests that the problem is in finding the source file for the module:

>>> pickle.load(open('x', 'rb'))
({('<string>', 2): 2, ('<string>', 1): 4, ('<string>', 8): 20, ..
>>> pickle.load(open('y', 'rb'))
({('traceme.py', 1): 2, ('traceme.py', 2): 2, ('<string>', 1): 2, ('traceme.py', 8): 20, ..

In the above, x was produced by 3.x and y by 2.7.
msg111003 - (view) Author: Eli Bendersky (eli.bendersky) * (Python committer) Date: 2010-07-21 04:13
trace.py uses settrace to do its heavy lifting. In the "local trace" function for counting, localtrace_count:

    def localtrace_count(self, frame, why, arg):
        if why == "line":
            filename = frame.f_code.co_filename
            lineno = frame.f_lineno
            key = filename, lineno
            self.counts[key] = self.counts.get(key, 0) + 1
        return self.localtrace

The line "filename = frame.f_code.co_filename" seems to work as expected in 2.x for the current file, returning its name. In 3.x it returns '<string>' for the current file.

This causes the trace module to filter out the results for the current file (traceme.py in Alexander's example).

I'm investigating further, but this may very well be caused by different behavior of `f_code.co_filename` between Python 2 and 3.
msg111004 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2010-07-21 04:22
On Wed, Jul 21, 2010 at 12:13 AM, Eli Bendersky <report@bugs.python.org> wrote:
..
> I'm investigating further, but this may very well be caused by different behavior
> of `f_code.co_filename` between Python 2 and 3.

I am afraid I am a step ahead of you.  The culprit is

           t.run('exec(%r)' % (script,))

line in 3.x instead of

            t.run('execfile(%r)' % (progname,))

in 2.x
msg111006 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2010-07-21 04:43
I am attaching a proof-of-concept patch, issue9317.diff, which fixes the file name problem at the expense of more spurious modules:

$ ./python.exe -m trace  -c -s traceme.py
lines   cov%   module   (path)
    7   100%   codecs   (Lib/codecs.py)
   10   100%   locale   (Lib/locale.py)
    4   100%   pkgutil   (Lib/pkgutil.py)
   58   100%   runpy   (Lib/runpy.py)
    6   100%   traceme   (traceme.py)
msg111027 - (view) Author: Eli Bendersky (eli.bendersky) * (Python committer) Date: 2010-07-21 09:26
Alexander,

I propose an alternative patch (attached issue9317.2.diff). It uses:

            with open(progname) as fp:
                code = compile(fp.read(), progname, 'exec')
                t.run(code)

Since the `run` method of Trace already accepts a code object and passes it to `exec`, I see no reason to do another `exec`.

This also fixes your 'traceme' problem, without introducing new module imports:


$ py3d -m trace -c -s traceme.py 
lines   cov%   module   (path)
    1   100%   threading   (/home/eliben/python_src/eliben-py3k/Lib/threading.py)
    6   100%   traceme   (traceme.py)
[47526 refs]
msg111044 - (view) Author: Alexander Belopolsky (Alexander.Belopolsky) Date: 2010-07-21 13:00
Eli,

Your new patch makes perfect sense, but can you check of regular  
python invocation uses runpy these days. If it does, it may make sense  
to include it in trace at least optionally.

Also, does this fix the original problem?
msg111051 - (view) Author: Eli Bendersky (eli.bendersky) * (Python committer) Date: 2010-07-21 13:22
<<<<
Your new patch makes perfect sense, but can you check of regular  
python invocation uses runpy these days. If it does, it may make sense  
to include it in trace at least optionally.
>>>>

I'm not sure I understand what you mean by this. Could you please clarify?

<<<<
Also, does this fix the original problem?
>>>>

Alas, no.
msg111054 - (view) Author: Alexander Belopolsky (Alexander.Belopolsky) Date: 2010-07-21 13:36
On Jul 21, 2010, at 9:22 AM, Eli Bendersky <report@bugs.python.org>  
wrote:

>
> Eli Bendersky <eliben@gmail.com> added the comment:
>
> <<<<
> Your new patch makes perfect sense, but can you check of regular
> python invocation uses runpy these days. If it does, it may make sense
> to include it in trace at least optionally.
>>>>>
>
> I'm not sure I understand what you mean by this. Could you please  
> clarify?
>

Sorry, typing on the phone.  s/of/if/ in my message above. What I mean  
is that as far as I know, when you run a script from command line,  
python loads it using functions in runpy.  Arguably this means that  
these functions should show up in coverage.
msg111055 - (view) Author: Eli Bendersky (eli.bendersky) * (Python committer) Date: 2010-07-21 13:45
>
> Sorry, typing on the phone.  s/of/if/ in my message above. What I mean
> is that as far as I know, when you run a script from command line,
> python loads it using functions in runpy.  Arguably this means that
> these functions should show up in coverage.
>

As far as I understand, when you run:

    py3d -m trace -C pickle-trace.d -c -m test_pickle.py

The first -m flag applies to the trace module. Python uses
runpy.run_module to load it, and then passes it its arguments (-C and
onwards).

Since runpy.run_module actually runs /before/ trace.py, trace.py has
no chance of setting its settrace hooks and trace runpy itself.

Moreover, since runpy is only the machinery to execute trace.py, does
it even make sense to include it in coverage?

What do you think?
msg111059 - (view) Author: Alexander Belopolsky (Alexander.Belopolsky) Date: 2010-07-21 14:06
On Wed, Jul 21, 2010 at 9:45 AM, Eli Bendersky <report@bugs.python.org> wrote:
..
> As far as I understand, when you run:
>
>    py3d -m trace -C pickle-trace.d -c -m test_pickle.py
>
> The first -m flag applies to the trace module. Python uses
> runpy.run_module to load it, and then passes it its arguments (-C and
> onwards).
>
When I said "run from command line", I meant something like

$ python Lib/test/test_pickle.py

I am not sure if this uses runpy machinery, but I suspect it does.

 The trace output should not include runpy calls that load the trace
module itself, but

$ python -m trace <flags> Lib/test/test_pickle.py

might show the runpy calls that occur when you run

$ python Lib/test/test_pickle.py

without tracing.   BTW, too bad that '-m' is taken, but I think
trace.py should grow a --run-module option so that you can trace
equivalents of python -m executions:

$ python -m trace <flags> --run-module test.test_pickle

> Moreover, since runpy is only the machinery to execute trace.py, does
> it even make sense to include it in coverage?
>
> What do you think?

I think you should ask python-dev. :-)  Note that most C/C++ coverage
tools do include  pre-main calls in coverage.  In case of C++, a lot
of things may happen before main().

Whatever the solution is, it should be unified between trace, profile,
and pdb modules.
msg111060 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2010-07-21 14:12
Eli,

At this point, I think it will be best to focus on unittests.  I am adding #9315 as a dependency here and will add a comment there.  I think once we have good unittest coverage the bug will become obvious.
msg111061 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2010-07-21 14:16
Also, when you get a chance, please separate the __main__ coverage bug into a separate issue and bring it up on python-dev.  We should get profile guys involved and I don't want them to be distracted by trace specific issues.
msg111068 - (view) Author: Eli Bendersky (eli.bendersky) * (Python committer) Date: 2010-07-21 14:34
1. I've created Issue 9323 to address the bug with __main__, with the proposed patch. 

2. Agreed about the unit tests (Issue 9315). I'll try to get the basic testing framework for trace.py created over the weekend.
msg118814 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2010-10-15 17:59
I have verified that the original issue is still present.  I will try to narrow it down to a smaller test case.
msg118818 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2010-10-15 18:32
I have found the cause of at least part of the issue.  Apparently, module level statements for some of the modules such as pickle do not show up in trace because they are imported by trace itself.  In other words, by the time traced script gets executed, pickle is already in sys.module and the code therein does not run.

Maybe we should consider clearing sys.modules in Trace.run, but this can cause problems if not done carefully.
msg263922 - (view) Author: Berker Peksag (berker.peksag) * (Python committer) Date: 2016-04-21 14:31
A patch similar to issue9317.2.patch has been applied in 0aa46b9ffba3. However, I noticed a regression and created issue 26818. I can confirm that this issue is fixed with the patch from issue 26818 applied.
History
Date User Action Args
2016-04-21 14:31:42berker.peksagsetstatus: open -> closed

nosy: + berker.peksag
messages: + msg263922

resolution: fixed
stage: test needed -> resolved
2013-10-08 13:11:00eli.benderskysetnosy: - eli.bendersky
2010-10-15 18:32:17belopolskysetmessages: + msg118818
2010-10-15 17:59:07belopolskysetmessages: + msg118814
2010-07-21 14:34:40eli.benderskysetmessages: + msg111068
2010-07-21 14:16:58belopolskysetmessages: + msg111061
2010-07-21 14:12:29belopolskysetnosy: - Alexander.Belopolsky
dependencies: + The trace module lacks unit tests
messages: + msg111060
2010-07-21 14:06:35Alexander.Belopolskysetmessages: + msg111059
2010-07-21 13:45:44eli.benderskysetmessages: + msg111055
2010-07-21 13:36:26Alexander.Belopolskysetmessages: + msg111054
2010-07-21 13:22:36eli.benderskysetmessages: + msg111051
2010-07-21 13:00:29Alexander.Belopolskysetnosy: + Alexander.Belopolsky
messages: + msg111044
2010-07-21 09:26:47eli.benderskysetfiles: + issue9317.2.patch

messages: + msg111027
2010-07-21 04:43:59belopolskysetfiles: + issue9317.diff
keywords: + patch
messages: + msg111006
2010-07-21 04:22:45belopolskysetmessages: + msg111004
2010-07-21 04:13:20eli.benderskysetmessages: + msg111003
2010-07-21 04:00:24belopolskysetmessages: + msg111002
2010-07-21 03:55:15belopolskysetmessages: + msg111001
2010-07-21 03:32:30belopolskysetmessages: + msg111000
2010-07-21 03:28:27eli.benderskysetmessages: + msg110998
2010-07-21 03:27:06eli.benderskysetmessages: + msg110997
2010-07-21 03:24:59belopolskysetfiles: + traceme.py

messages: + msg110996
2010-07-21 01:56:39belopolskysetnosy: + terry.reedy, eli.bendersky
2010-07-21 01:54:23belopolskycreate