classification
Title: Port of the gdb7 debugging hooks to the "py3k" branch
Type: Stage: patch review
Components: Demos and Tools Versions: Python 3.2, Python 3.3
process
Status: closed Resolution: accepted
Dependencies: Superseder:
Assigned To: loewis Nosy List: dmalcolm, loewis, pitrou, vstinner
Priority: normal Keywords: patch

Created on 2010-04-12 22:09 by dmalcolm, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description
port-gdb7-hooks-to-py3k.patch dmalcolm, 2010-04-12 22:09 Patch against py3k branch (r80008) to port the gdb7 hooks to Python 3
port-gdb7-hooks-to-py3k-002.patch dmalcolm, 2010-04-21 21:29
Messages (8)
msg102980 - Author: Dave Malcolm (dmalcolm) (Python committer) Date: 2010-04-12 22:09
I'm attaching a patch for the py3k branch to port the gdb hooks to Python 3.

The libpython.py code installed to python-gdb.py "knows" about the internal details of the Python within the tree.  This patch makes the necessary changes to that code for the internals of Python 3, rather than Python 2.

Note that libpython.py is intended to run inside a gdb linked against libpython2, and so libpython.py is still Python 2 code; however I've updated it to expect the so-called "inferior process" to be Python 3:
* Py_TPFLAGS_STRING_SUBCLASS becomes Py_TPFLAGS_BYTES_SUBCLASS
* PyStringObjectPtr becomes PyBytesObjectPtr
* change PyObjectPtr to correctly locate "ob_size": with Python 2 variable-sized subclasses we could simply look it up as a field of the subclass struct, but for Python 3 the field may be in an ob_base member, or in an ob_base.ob_base member.  We have to cast to a PyVarObject* to find it
* PyIntObject went away; PyBoolObject is now a subclass of PyLongObject
* writing out frames needed a slight rewrite, now that co_filename and co_name have changed from PyStringObject* to PyUnicodeObject*
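
The ob_size lookup described in the list above can be modelled outside gdb. This is only a minimal sketch using ctypes, not the code from the patch: the struct layouts are simplified, and FakeLong and ob_size_of are hypothetical names made up for illustration.

```python
import ctypes


class PyObject(ctypes.Structure):
    # Simplified object header (a --with-pydebug build has extra fields).
    _fields_ = [("ob_refcnt", ctypes.c_ssize_t),
                ("ob_type", ctypes.c_void_p)]


class PyVarObject(ctypes.Structure):
    _fields_ = [("ob_base", PyObject),
                ("ob_size", ctypes.c_ssize_t)]


class FakeLong(ctypes.Structure):
    # Hypothetical stand-in for a Python 3 variable-sized object: the
    # header is nested in ob_base, so ob_size is not a direct field.
    _fields_ = [("ob_base", PyVarObject),
                ("ob_digit", ctypes.c_uint32 * 1)]


def ob_size_of(obj):
    """Read ob_size by reinterpreting obj's address as a PyVarObject*."""
    as_var = ctypes.cast(ctypes.pointer(obj), ctypes.POINTER(PyVarObject))
    return as_var.contents.ob_size


value = FakeLong()
value.ob_base.ob_size = 3
print(hasattr(value, "ob_size"))  # the subclass struct has no such field
print(ob_size_of(value))          # the cast finds it regardless of nesting
```

The cast works because the header always sits at the start of the object, however many ob_base levels the subclass struct wraps around it.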

This makes the "proxy values" concept a bit awkward; for example a "str" in the inferior Python 3 process looks like a "unicode" to the gdb Python 2 process.  This and the int->long change required a lot of minor updating to expected values in the selftests.

The test_gdb.py and gdb_sample.py code _are_ for Python 3.  I've assumed that all output is in UTF-8 for now.

For Python 2, I was testing the code by putting a breakpoint on PyObject_Print, and printing objects, as a convenient hook for scraping gdb's stdout.

PyObject_Print still exists in Python 3, but isn't used by the "print" implementation, so I needed a handy function that I could put a breakpoint on, and invoke from the Python side: I looked for something with METH_O that isn't called by site.py and doesn't require an import.  I chose the "id" builtin, which corresponds to Python/bltinmodule.c:builtin_id()
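
To make the breakpoint approach concrete, here is a rough sketch of how a test could drive gdb in batch mode and stop in builtin_id. It is not the code in test_gdb.py: build_gdb_argv is a hypothetical helper, and it only builds the command line, so it runs without gdb installed.

```python
import sys


def build_gdb_argv(source, breakpoint="builtin_id", python=sys.executable):
    """Return an argv that runs `source` under gdb and stops at `breakpoint`."""
    return [
        "gdb", "--batch",
        "--eval-command=break %s" % breakpoint,
        "--eval-command=run",
        "--eval-command=backtrace",
        # Everything after --args is the inferior process and its arguments.
        "--args", python, "-c", source,
    ]


argv = build_gdb_argv("id([1, 2, 3])")
print(" ".join(argv))
```

A test would pass this argv to subprocess, capture stdout, and compare the rendering of the argument in the backtrace against the expected value.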

Some minor 2to3-style fixes were also needed in the test code

All of the selftests are currently commented out to keep the buildbot clean (apparently from a merge from trunk), and I've kept them commented out.  They all pass on my machine with:
  make -j4 ; ./python -Wd ./Lib/test/regrtest.py -v test_gdb

This also contains a port of the (partial?) fix for issue 8330 from trunk's r79986 that doesn't seem to have been merged to py3k (it needed a fair amount of rewriting).
msg102982 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-04-12 22:14
Should we call it libpython3.py, in order to distinguish it from the 2.x version?
msg102986 - Author: Dave Malcolm (dmalcolm) (Python committer) Date: 2010-04-12 22:44
> Should we call it libpython3.py, in order to distinguish it from the 2.x
> version?

We could; it gets copied to python-gdb.py by the Makefile though.

The code is intended to track the low-level implementation details of the tree that it's in, so I'd expect the Python 2 and Python 3 versions to drift apart over time, FWIW.
msg103855 - Author: Dave Malcolm (dmalcolm) (Python committer) Date: 2010-04-21 15:27
Looking at issue 8480, it looks like a partial fix was applied, which will mean this patch will no longer apply.  Should I regenerate a patch against what's now in SVN, or should we use my patch?
msg103858 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-04-21 15:35
> Looking at issue 8480, it looks like a partial fix was applied,

Martin fixed calls to assertListing(). I renamed PyStringObjectStr to PyBytesStringPtr and used a breakpoint on textiowrapper_write() instead of PyObject_Dump(). Your patch uses a breakpoint on builtin_id(): I don't know what's best, I only tried to fix tests.

Sorry, I didn't know about this issue.

> which will mean this patch will no longer apply.  Should I regenerate
> a patch against what's now in SVN, or should we use my patch?

I think that it would be easier to update your patch to SVN (py3k).

--

I don't understand why the tests contain the long type "L" suffix and the unicode "u" prefix (e.g. "1L"). Is it because gdb is linked to Python 2?
msg103859 - Author: Dave Malcolm (dmalcolm) (Python committer) Date: 2010-04-21 15:53
Thanks; I'm working on a newer version of the patch based on what's in SVN.  

I prefer your choice of breakpoint, and I've changed my mind about the python2 vs python3 proxyval handling.  Hope to have a fresh patch later today.
msg103913 - Author: Dave Malcolm (dmalcolm) (Python committer) Date: 2010-04-21 21:29
I'm attaching a new version of the patch, for the py3k branch.

I changed my mind back about the breakpoint, using "id" and "builtin_id" as in my original patch.  I prefer it since it has a single argument, which makes it very convenient to work with in the various tests - textiowrapper_write takes an args tuple, which makes things like corrupting the pointer slightly more tricky.

The big change here is that I've changed the output format throughout to try to emulate Python 3 literals: a PyLongObject instance is now printed as digits, without a trailing "L".  I feel that the fact that gdb is running python 2 is really just an implementation detail, and that the pretty-printer ought to print in a format reflecting the language being debugged.

This also removes the 'u' prefix from strings, and I've added tests for 'bytes' (which get a "b" prefix).  I've also (I believe) correctly implemented Python 3's literal representation for empty and non-empty sets and frozensets (e.g. "{1, 2, 3}", as opposed to Python 2's "set([1, 2, 3])").
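
As an illustration of the set and frozenset formatting rules mentioned above, here is a small sketch. It is not taken from the patch: py3_set_literal is a hypothetical helper, and it assumes the members have already been rendered to strings.

```python
def py3_set_literal(type_name, members):
    """Render pre-formatted member strings in Python 3 literal style."""
    if not members:
        # "{}" would read as an empty dict, so the empty case names the type.
        return "%s()" % type_name
    body = "{" + ", ".join(members) + "}"
    if type_name == "set":
        return body
    # frozenset has no literal syntax of its own; wrap the set display.
    return "%s(%s)" % (type_name, body)


print(py3_set_literal("set", ["1", "2", "3"]))
print(py3_set_literal("set", []))
print(py3_set_literal("frozenset", ["1", "2"]))
```

The empty case is the one that is easy to get wrong, which is why it is handled first.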

More controversially, a PyUnicodeObject instance is printed using an emulation of Python 3's unicode_repr algorithm.  As a result, gdb may write non-ASCII characters to sys.stdout, using the encoding of sys.stdout.  This will only work if gdb's encoding is set to something that can cope with said characters:

Python 3.2a0 (py3k:80312M, Apr 21 2010, 17:00:02) 
[GCC 4.4.3 20100127 (Red Hat 4.4.3-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> id('文字化け')

Breakpoint 1, builtin_id (self=<module at remote 0x7ffff7fd7df8>, v='文字化け') at Python/bltinmodule.c:912
912		return PyLong_FromVoidPtr(v);

Note the unicode characters in the rendering of "v" in the breakpoint.

I suspect that this is a change too far (for example, I'm assuming a UTF-8 locale).

Any suggestions on what the output should look like for the unicode case?  

Would it be better if I coerce everything back to an escaped literal syntax that's encodable as ASCII?  That would probably avoid encoding and locale issues, but lose immediate readability for people able to read non-ASCII scripts.
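
One possible middle ground, sketched here purely for discussion and not what the attached patch does, is to keep the readable form when the output encoding can represent it and fall back to ASCII escapes otherwise. render_str is a hypothetical name, and the encoding is passed in explicitly rather than read from sys.stdout.

```python
def render_str(text, encoding):
    """Return a quoted form of `text` that is encodable in `encoding`."""
    quoted = repr(text)
    try:
        quoted.encode(encoding)
    except UnicodeEncodeError:
        # Fall back to backslash escapes, which are always plain ASCII.
        return ascii(text)
    return quoted


print(render_str("文字化け", "utf-8"))
print(render_str("文字化け", "ascii"))
```

The fallback keeps the output valid as a Python 3 literal in both cases, at the cost of readability under a non-UTF-8 locale.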

All tests pass with both UCS2 and UCS4 builds on this Fedora 12 x86_64 box, building with --with-pydebug in both cases.
msg103918 - Author: Martin v. Löwis (loewis) * (Python committer) Date: 2010-04-21 22:40
Thanks for the patch; applied as r80324. As it is an improvement over the status quo, I've applied it; I'll also be closing this issue.

I still get test failures, which I'll report as a separate bug.

As for displaying strings: I think it should do what gdb normally does in such cases, although I didn't investigate what that actually is.
History
Date User Action Args
2022-04-11 14:56:59 admin set github: 52627
2010-04-21 22:40:46 loewis set status: open -> closed; resolution: accepted; messages: + msg103918
2010-04-21 21:29:09 dmalcolm set files: + port-gdb7-hooks-to-py3k-002.patch; messages: + msg103913
2010-04-21 15:53:54 dmalcolm set messages: + msg103859
2010-04-21 15:35:30 vstinner set messages: + msg103858
2010-04-21 15:28:31 dmalcolm set nosy: + vstinner
2010-04-21 15:27:39 dmalcolm set messages: + msg103855
2010-04-21 15:23:09 dmalcolm link issue8479 superseder
2010-04-12 22:44:18 dmalcolm set messages: + msg102986
2010-04-12 22:14:59 pitrou set nosy: + pitrou; messages: + msg102982
2010-04-12 22:09:31 dmalcolm create