classification
Title: Add SystemTap static markers
Type: enhancement Stage: patch review
Components: Interpreter Core Versions: Python 3.6
process
Status: closed Resolution: out of date
Dependencies: Superseder: Systemtap and DTrace support
View: 21590
Assigned To: Nosy List: Robert.Buchholz, benesch, bkabrda, cburroughs, christian.heimes, dmalcolm, fche, jcea, loewis, merwok, mjw, pitrou, scox, xmorel
Priority: normal Keywords: needs review, patch

Created on 2012-05-10 21:05 by dmalcolm, last changed 2016-09-10 02:15 by lukasz.langa. This issue is now closed.

Files
File name Uploaded Description Edit
cpython-systemtap-2012-05-10-001.patch dmalcolm, 2012-05-10 21:05 Patch against devel branch to add systemtap static marker support to CPython 3.3 review
cpython-systemtap-2012-05-11-001.patch dmalcolm, 2012-05-11 20:49 Patch against devel branch to add systemtap static marker support to CPython 3.3 review
devguide-systemtap-2012-05-11-001.patch dmalcolm, 2012-05-11 20:50 Patch against devguide to add description of systemtap static marker support
cpython-systemtap-2012-05-16-001.patch dmalcolm, 2012-05-16 18:15 review
cpython-systemtap-2012-06-21-001.patch dmalcolm, 2012-06-21 18:22 Updated version of patch review
cpython-systemtap-2014-05-19.patch bkabrda, 2014-05-19 11:10 review
cpython-systemtap-2014-05-19-all-files.patch bkabrda, 2014-05-19 15:05 review
Messages (28)
msg160374 - (view) Author: Dave Malcolm (dmalcolm) (Python committer) Date: 2012-05-10 21:05
I'm attaching a patch which adds static markers for SystemTap for two probeable events within CPython's bytecode interpreter, along with test cases and documentation.

I'm hoping to get this merged for 3.3; is this PEP-worthy, or can this be done through patch review?

Note: issue #4111 was originally opened on 2008-10-12 as "Add DTrace probes", and was generalized on 2009-12-08 to "Add Systemtap/DTrace probes".  That issue was closed on 2011-11-14 to be superceded by issue #13405 ("Add DTrace probes"), hence I'm opening this as a separate issue.

I believe that although DTrace and SystemTap are similar, they are sufficiently different from each other that it's going to take separate work to support one or the other (and that the maintainers of the support for each within CPython are likely to be different people).  I hope that once one of them is merged, the other will become a lot easier to merge.

Potentially other markers could be added for other events (the DTrace patch in issue #13405 has gained some), but I wanted to start small for now.
msg160388 - (view) Author: Éric Araujo (merwok) * (Python committer) Date: 2012-05-10 23:49
I think the doc may be more at home in the Developer’s Guide (like the doc about GDB) rather that the new-user-oriented “Setup and Usage” doc.
msg160389 - (view) Author: Éric Araujo (merwok) * (Python committer) Date: 2012-05-10 23:51
BTW if you don’t get feedback in a week or two you could go to python-dev; if there is no controversy nor implications on third-party code this will likely not require a PEP, only agreement between the core devs.  Good luck!
msg160400 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-05-11 04:53
Well, at least it looks much cleaner that the dtrace patch.

Two comments still:
- is pysystemtap.h meant to be a public header? otherwise it should perhaps not end up in Include (I'm not sure what our policy is here)
- perhaps get_frame_marker_info() and friends could still be a in separate ceval-systemtap.h file? that way the only added complication in ceval.c would be a couple of includes, and the two `if`s in the eval loop
msg160414 - (view) Author: Mark Wielaard (mjw) Date: 2012-05-11 15:02
Just a comment that newer [eu]-readelf -n will provide a nicer view of the sdt ELF notes. You might want to suggest that in the documentation.

e.g. $ eu-readelf -n /usr/lib64/libpython2.7.so

Note section [ 1] '.note.gnu.build-id' of 36 bytes at offset 0x190:
  Owner          Data size  Type
  GNU                   20  GNU_BUILD_ID
    Build ID: a28f8db1b224530b0d38ad7b82a249cf7c3f18d6

Note section [27] '.note.stapsdt' of 184 bytes at offset 0x1ae884:
  Owner          Data size  Type
  stapsdt               70  Version: 3
    PC: 0xe0d3a, Base: 0x14b150, Semaphore: 0x3ae882
    Provider: python, Name: function__return, Args: '8@%rbx 8@%r13 -4@%eax'
  stapsdt               69  Version: 3
    PC: 0xe0f37, Base: 0x14b150, Semaphore: 0x3ae880
    Provider: python, Name: function__entry, Args: '8@%rbx 8@%r13 -4@%eax'
msg160448 - (view) Author: Dave Malcolm (dmalcolm) (Python committer) Date: 2012-05-11 20:49
Thanks Eric, Antoine and Mark.

I'm attaching two new patches: a replacement patch for cpython, and a new patch for the devguide

I've moved the docs to the dev guide, starting a new "Debugging and Instrumentation" section there.

Changes to the cpython patch:
  * fixed a bug in configure.in that was enabling systemtap support even without --with-systemtap (if the devel toolchain was present)
  * I added an initial check to test_systemtap to skip the tests unless Python was configured --with-systemtap
  * pysystemtap.h is not meant to be public, so I've moved the source pysystemtap.d and generated header pysystemtap.h from Include/ to Python/.  I also simplified pysystemtap.d (removed the #pragma lines, since I believe they're DTrace-specific).
  * I've introduced a Python/ceval_systemtap.h private header as suggested by Antoine, moving things in there to simplify the changes to Python/ceval.c

Changes to the devguide docs:
  * removed the ".. impl-detail" as this only seems to work (and be appropriate) in the cpython Doc build, not in devguide.
  * added "eu-readelf -n" example from Mark

The docs refer to the low-level way of doing things (using the markers directly), but don't yet spell out the higher-level way (creating a tapset).  I've left this out of the patches for now to keep the patches simpler.
msg160883 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-05-16 16:41
I tried the patch under Mageia 1 and got the following failure:

test_systemtap skipped -- Test systemtap script did not run; stderr was: b"Pass 1: parsed user script and 72 library script(s) using 56252virt/20964res/1828shr kb, in 70usr/0sys/82real ms.\nPass 2: analyzed script: 1 probe(s), 1 function(s), 0 embed(s), 0 global(s) using 56648virt/21492res/1900shr kb, in 10usr/0sys/3real ms.\nPass 3: using cached /home/antoine/.systemtap/cache/15/stap_155c3565481f113c929ad94e10c2b48e_779.c\nPass 4: using cached /home/antoine/.systemtap/cache/15/stap_155c3565481f113c929ad94e10c2b48e_779.ko\nPass 5: starting run.\nPass 5: run completed in 0usr/0sys/3real ms.\nPass 5: run failed.  Try again with another '--vp 00001' option.\n"


(I also had to chmod +s staprun - scary :-))
msg160905 - (view) Author: Dave Malcolm (dmalcolm) (Python committer) Date: 2012-05-16 18:15
> I tried the patch under Mageia 1 and got the following failure:
Thanks.  I'm attaching an updated patch which reworks test_systemtap so that
  (a) it turns up the verbosity of stap invocations from "-v" to "-vv"
  (b) it tests the "hello world" stap script on a trivial binary ("true") before attempting to invoke sys.executable under stap

What does the error message look like with the extra verbosity?

> (I also had to chmod +s staprun - scary :-))
The chmod invocation sounds like an issue with how Mageia have packaged SystemTap, assuming that you're using a packaged build of systemtap.
msg160907 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-05-16 18:29
Still fails. So I tried the script by hand and got:


$ stap - -vv -c true --vp 00001
Created temporary directory "/tmp/stapBvo9zS"
SystemTap translator/driver (version 1.3/0.152 non-git sources)
Copyright (C) 2005-2010 Red Hat, Inc. and others
This is free software; see the source for copying conditions.
enabled features: AVAHI TR1_UNORDERED_MAP
Session arch: x86_64 release: 2.6.38.8-desktop-10.mga
probe begin { println("hello world") exit () }
Searched "/usr/share/systemtap/tapset/x86_64/*.stp", found 3
Searched "/usr/share/systemtap/tapset/*.stp", found 69
Pass 1: parsed user script and 72 library script(s) using 56252virt/20976res/1840shr kb, in 80usr/0sys/5091real ms.
Pass 2: analyzed script: 1 probe(s), 1 function(s), 0 embed(s), 0 global(s) using 56648virt/21504res/1912shr kb, in 0usr/0sys/3real ms.
Pass 3: using cached /home/antoine/.systemtap/cache/15/stap_155c3565481f113c929ad94e10c2b48e_779.c
Pass 4: using cached /home/antoine/.systemtap/cache/15/stap_155c3565481f113c929ad94e10c2b48e_779.ko
Pass 5: starting run.
Running /usr/bin/staprun -v -v -c 'true' /tmp/stapBvo9zS/stap_155c3565481f113c929ad94e10c2b48e_779.ko
staprun:main:268 modpath="/tmp/stapBvo9zS/stap_155c3565481f113c929ad94e10c2b48e_779.ko", modname="stap_155c3565481f113c929ad94e10c2b48e_779"
staprun:init_staprun:206 init_staprun
staprun:insert_module:60 inserting module
staprun:insert_module:79 module options: _stp_bufsize=0
Spawn waitpid result (0x8652802): 255
Pass 5: run completed in 0usr/0sys/3real ms.
Pass 5: run failed.  Try again with another '--vp 00001' option.
Running rm -rf /tmp/stapBvo9zS
Spawn waitpid result (0x802): 0


No obvious error message. Who the hell designs such UIs? This seems as stupidly unfriendly as dtrace...
msg160909 - (view) Author: Frank Ch. Eigler (fche) Date: 2012-05-16 18:35
Hi -

On Wed, May 16, 2012 at 06:29:09PM +0000, Antoine Pitrou wrote:
> [...]
> No obvious error message. Who the hell designs such UIs? This seems
> as stupidly unfriendly as dtrace...

Sorry about that.  You're running a near-two-year-old version of
systemtap (1.3); error messages (and of course much functionality) has
improved since.

- FChE
msg160911 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-05-16 18:38
> Sorry about that.  You're running a near-two-year-old version of
> systemtap (1.3); error messages (and of course much functionality) has
> improved since.

Fair enough. I think the main patch looks go to go. It's clean and
doesn't obscure core parts of the interpreter, which is enough as far as
I'm concerned :-)

As for the devguide patch, I wonder if the devguide's the right place:
do we want to document it for developers of Python, or for everyone?
msg161012 - (view) Author: Dave Malcolm (dmalcolm) (Python committer) Date: 2012-05-17 20:13
Thanks.   I too am wondering where the documentation should go.  We currently have several locations for documentation:
  (1) the man page
  (2) the "Doc" subdirectory aka docs.python.org
  (3) various text files in the source tree, such as Misc/SpecialBuilds.txt
  (4) the devguide
  (5) www.python.org, and other web sites

This feature is about instrumentation of CPython, which is of use to several audiences, such as:
  (a) people working on CPython's internals
  (b) Python developers seeking to understand performance issues
  (c) sysadmins trying to tweak the performance of, say, a deployed web app

As I understand it, the devguide is targetting (a), whereas the Doc subdirectory of the cpython sources is targetting (b) when they're writing the code, but not necessarily when tuning it, and (c) doesn't seem to get covered.

So perhaps there should be a new top-level section of documentation within the main Python Doc directory covering instrumentation and performance?  (e.g. "instrumentation" or "debugging"). If so, these docs could be an entry within that section.  Though I don't want to block this feature on writing up a full guide to debugging/deploying Python :)

Less ambitiously, we could just add it to the "howto" section (which appears to be something of a dumping ground for docs that didn't fit neatly into the other categories).  Would that be acceptable in lieu of a better location?

Or is this something for the python-dev mailing list?
Cheers!
msg161013 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-05-17 20:17
I think it could go into "python setup and usage" (http://docs.python.org/dev/using/index.html), but that seems target at first-time users; or perhaps a HOWTO (http://docs.python.org/dev/howto/index.html), but it might make it less visible. Or a "debug and instrumentation" section indeed (we could also put the python-gdb doc there).

Perhaps you could ask python-dev indeed.
msg161034 - (view) Author: Éric Araujo (merwok) * (Python committer) Date: 2012-05-18 04:49
> We currently have several locations for documentation:
Don’t forget the wiki <wink>.  It is much less curated than the other docs.

> Though I don't want to block this feature on writing up a full guide to debugging
I started work on a debugging howto: #12913.  Maybe your patch could create a file in Doc/howto/debugging.rst to be filled later by my patch?

> Less ambitiously, we could just add it to the "howto" section (which appears to be something of
> a dumping ground for docs that didn't fit neatly into the other categories).
On the contrary, the howtos are user-focused documents that forsake the dry reference style of the main docs and provide much needed examples and walkthroughs on some topics.
msg163357 - (view) Author: Dave Malcolm (dmalcolm) (Python committer) Date: 2012-06-21 18:22
I'm attaching an updated version of the patch; I hope there's still time before Saturday to get it into 3.3

I found a bug in the configure script in the old patch: /usr/bin/dtrace was required, even without requesting systemtap.  I've fixed it by adding a new "SYSTEMTAPDEPS" to the configure.a/Makefile.pre.in  A minor subtlety: within the configure.ac's --with-systemtap branch it is set up like this:
   SYSTEMTAPDEPS="\$(srcdir)/Python/pysystemtap.h"
where the leading backslash is needed so that $(srcdir) doesn't get interpreted by the shell when running configure as "run the command named 'srcdir'".

I also added removal of $(srcdir)/Python/pysystemtap.h to the "clean" target

I made a slight change to the static markers themselves: they now pass a 4th argument, the PyFrameObject*, since it's possible to make use of this to inspect locals etc from systemtap, which might be of use to people.  This doesn't introduce any further complexity to the ceval.c code.

I moved the documentation from the devguide back to the cpython source tree, rewrote it as a HOWTO (Doc/howto/instrumentation.rst), adding some extra material (e.g. about tapsets).   I also added a NEWS item.

Tested on a Fedora 15 box with these configurations:
  * --with-pydebug (implicit --without-systemtap)
  * explicitly --without-systemtap
  * --with-pydebug --with-systemtap
  * --with-pydebug --enable-shared --with-systemtap
  * --enable-shared --with-systemtap
(note that because of issue 14774 I had to rebuild _sysconfigdata.py each time, or the tests fail due to being confused about whether we were configured with --enable-shared)

How is this looking?
msg191560 - (view) Author: Robert Buchholz (Robert.Buchholz) Date: 2013-06-21 09:24
It's been a year and the 3.4 alpha is approaching. What's the status of upstream inclusion of this patch?
msg191561 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-06-21 09:27
Note that feature freeze happens after the first /beta/, so the alpha release isn't a showstopper.
msg218788 - (view) Author: Bohuslav "Slavek" Kabrda (bkabrda) * Date: 2014-05-19 11:10
Hi,
I'd like to take this over after Dave Malcolm. I don't see any issues that haven't been resolved, so my question is: What else can I do to make this patch acceptable?
I'm attaching a rebased version of this patch that applies to current default branch (FWIW, we use this patch downstream in Fedora in our python3-debug build and it works well)
msg218793 - (view) Author: Dave Malcolm (dmalcolm) (Python committer) Date: 2014-05-19 14:15
On Mon, 2014-05-19 at 11:10 +0000, Bohuslav "Slavek" Kabrda wrote:
> Bohuslav "Slavek" Kabrda added the comment:
> 
> Hi,
> I'd like to take this over after Dave Malcolm. I don't see any issues that haven't been resolved, so my question is: What else can I do to make this patch acceptable?
> I'm attaching a rebased version of this patch that applies to current default branch (FWIW, we use this patch downstream in Fedora in our python3-debug build and it works well)
> 
> ----------
> Added file: http://bugs.python.org/file35293/cpython-systemtap-2014-05-19.patch

Is this attachment missing the "instrumentation.rst"?

There are also a couple of example scripts we ship in the RPMs, iirc.
msg218794 - (view) Author: Dave Malcolm (dmalcolm) (Python committer) Date: 2014-05-19 14:25
On Mon, 2014-05-19 at 14:15 +0000, Dave Malcolm wrote:
> Dave Malcolm added the comment:
> 
> On Mon, 2014-05-19 at 11:10 +0000, Bohuslav "Slavek" Kabrda wrote:
> > Bohuslav "Slavek" Kabrda added the comment:
> > 
> > Hi,
> > I'd like to take this over after Dave Malcolm. I don't see any issues that haven't been resolved, so my question is: What else can I do to make this patch acceptable?
> > I'm attaching a rebased version of this patch that applies to current default branch (FWIW, we use this patch downstream in Fedora in our python3-debug build and it works well)
> > 
> > ----------
> > Added file: http://bugs.python.org/file35293/cpython-systemtap-2014-05-19.patch
> 
> Is this attachment missing the "instrumentation.rst"?

FWIW I see it within
http://bugs.python.org/file26074/cpython-systemtap-2012-06-21-001.patch

> There are also a couple of example scripts we ship in the RPMs, iirc.
msg218796 - (view) Author: Bohuslav "Slavek" Kabrda (bkabrda) * Date: 2014-05-19 15:05
As usual, I forgot to hg add the new files before running hg diff, so the newly created files didn't get added to the patch. Attaching a fixed version that hopefully has everything.
AFAICS all the scripts that Fedora has are 1:1 copy of documentation in instrumentation.rst, so I don't think it's necessary to add them here (there are enough files attached here already ;)).
msg218815 - (view) Author: Jesús Cea Avión (jcea) * (Python committer) Date: 2014-05-19 20:30
I would like to point out that if this patch gets accepted, maybe issue #13405 (updated, I keep an up to date version in my mercurial repo) should too.
msg218817 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2014-05-19 20:49
I still wish there was a patch that combined both dtrace and systemtap support, und used as much code sharing between them as feasible. I'd be +1 on such a patch, and -0 on two separate patches that achieve the same functionality, but on different code paths.

For example, the systemtap version has a helper function get_frame_marker_info that covers more cases than the dtrace version; OTOH, the dtrace version has more trace points.

If consensus on functionality is not easily achieved, I propose to have the intersection on functionality first, i.e. only use the function-entry/exit trace points even in the dtrace version. Or else you agree on what trace points both systems ought to provide.
msg218864 - (view) Author: Bohuslav "Slavek" Kabrda (bkabrda) * Date: 2014-05-21 07:47
I'd be fine with adapting the patch to support both systemtap and dtrace, however I have very little knowledge of dtrace and I don't have a machine to test it on.
@jcea would you be willing to work on such patch with me? I'm sure we could work on this together and reuse each others' code as much as possible.
msg218871 - (view) Author: Jesús Cea Avión (jcea) * (Python committer) Date: 2014-05-21 12:31
I would. Just to point out that some dtrace weirdness like the stack walker is needed because dtrace probes are executed inside the OS kernel :-).

Can we analyze case by case?.
msg219214 - (view) Author: Bohuslav "Slavek" Kabrda (bkabrda) * Date: 2014-05-27 10:26
@jcea: So here is my proposal for dealing with this: let's take what I currently have (e.g. tracepoints for function entry/function exit) and extend my patch to also work with DTrace in a similar fashion. Then, when we have a working patch for both Systemtap and DTrace, we can start adding more tracepoints as needed/as we see fit. Sounds good?
msg219215 - (view) Author: Jesús Cea Avión (jcea) * (Python committer) Date: 2014-05-27 12:09
Could you possibly create a new issue and add me to its NOSY?
msg219220 - (view) Author: Bohuslav "Slavek" Kabrda (bkabrda) * Date: 2014-05-27 13:07
I opened a bug for tracking progress of development of the combined dtrace and systemtap support: http://bugs.python.org/issue21590
History
Date User Action Args
2016-09-10 02:15:33lukasz.langasetstatus: open -> closed
superseder: Systemtap and DTrace support
resolution: out of date
versions: + Python 3.6, - Python 3.5
2015-09-02 19:19:36cburroughssetnosy: + cburroughs
2014-12-27 04:41:11beneschsetnosy: + benesch
2014-06-26 08:52:09xmorelsetnosy: + xmorel
2014-05-27 13:07:23bkabrdasetmessages: + msg219220
2014-05-27 12:09:31jceasetmessages: + msg219215
2014-05-27 10:26:33bkabrdasetmessages: + msg219214
2014-05-21 12:31:09jceasetmessages: + msg218871
2014-05-21 07:47:08bkabrdasetmessages: + msg218864
2014-05-19 20:49:58loewissetmessages: + msg218817
2014-05-19 20:30:37jceasetmessages: + msg218815
2014-05-19 15:16:58terry.reedysetversions: + Python 3.5, - Python 3.4
2014-05-19 15:05:15bkabrdasetfiles: + cpython-systemtap-2014-05-19-all-files.patch

messages: + msg218796
2014-05-19 14:25:13dmalcolmsetmessages: + msg218794
2014-05-19 14:15:01dmalcolmsetmessages: + msg218793
2014-05-19 11:10:26bkabrdasetfiles: + cpython-systemtap-2014-05-19.patch

messages: + msg218788
2013-06-21 11:01:34christian.heimessetnosy: + christian.heimes

versions: + Python 3.4, - Python 3.3
2013-06-21 09:27:46pitrousetmessages: + msg191561
2013-06-21 09:24:36Robert.Buchholzsetmessages: + msg191560
2013-06-21 09:23:11Robert.Buchholzsetnosy: + Robert.Buchholz
2012-06-21 18:23:03dmalcolmsetfiles: + cpython-systemtap-2012-06-21-001.patch

messages: + msg163357
2012-05-18 04:49:13merwoksetmessages: + msg161034
2012-05-17 20:17:40pitrousetmessages: + msg161013
2012-05-17 20:13:28dmalcolmsetmessages: + msg161012
2012-05-16 18:59:00dmalcolmsetfiles: - cpython-systemtap-2012-05-16-001.patch
2012-05-16 18:57:07dmalcolmsetmessages: - msg160914
2012-05-16 18:56:56dmalcolmsetmessages: - msg160913
2012-05-16 18:48:54dmalcolmsetmessages: + msg160914
2012-05-16 18:46:40dmalcolmsetfiles: + cpython-systemtap-2012-05-16-001.patch

messages: + msg160913
2012-05-16 18:38:37pitrousetmessages: + msg160911
2012-05-16 18:35:24fchesetmessages: + msg160909
2012-05-16 18:29:08pitrousetmessages: + msg160907
2012-05-16 18:15:14dmalcolmsetfiles: + cpython-systemtap-2012-05-16-001.patch

messages: + msg160905
2012-05-16 16:41:34pitrousetmessages: + msg160883
2012-05-16 16:22:36dmalcolmsetnosy: + fche, scox
2012-05-14 05:58:37bkabrdasetnosy: + bkabrda
2012-05-11 20:50:26dmalcolmsetfiles: + devguide-systemtap-2012-05-11-001.patch
2012-05-11 20:49:54dmalcolmsetfiles: + cpython-systemtap-2012-05-11-001.patch

messages: + msg160448
2012-05-11 15:02:35mjwsetmessages: + msg160414
2012-05-11 14:55:00mjwsetnosy: + mjw
2012-05-11 04:53:31pitrousetnosy: + loewis, pitrou, jcea
messages: + msg160400
2012-05-10 23:51:17merwoksetmessages: + msg160389
2012-05-10 23:49:06merwoksetnosy: + merwok
messages: + msg160388
2012-05-10 21:05:15dmalcolmcreate