
Author dmalcolm
Recipients dmalcolm, pitrou
Date 2010-07-07.18:23:58
Message-id <1278527044.12.0.754366993337.issue9188@psf.upfronthosting.co.za>
Content
The traceback is:
Traceback (most recent call last):
  File "/home/antoine/cpython/27/python-gdb.py", line 1084, in to_string
    return pyop.get_truncated_repr(MAX_OUTPUT_LEN)
  File "/home/antoine/cpython/27/python-gdb.py", line 183, in get_truncated_repr
    self.write_repr(out, set())
  File "/home/antoine/cpython/27/python-gdb.py", line 1054, in write_repr
    proxy2.append(unichr(code))
ValueError: unichr() arg not in range(0x10000) (narrow Python build)

This occurs within the gdb process, while it is trying to pretty-print a PyUnicodeObject as the inferior runs "print u'\\U0001d121'\n"

It looks like the gdb you're using has been built against a Python built with narrow unicode, and the Python being debugged is also built with narrow unicode.
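One way to check which kind of build a given interpreter is (for instance, the one embedded in gdb) is to look at sys.maxunicode. The helper name below is made up for illustration; it is a rough sketch, not something from python-gdb.py:

```python
import sys

def is_wide_build():
    """Return True if this interpreter can hold code points above 0xFFFF.

    Hypothetical helper for illustration only.  On a narrow (UCS2) build
    sys.maxunicode is 0xFFFF; on a wide (UCS4) build it is 0x10FFFF.
    Python 3.3 and later always report 0x10FFFF.
    """
    return sys.maxunicode > 0xFFFF

print(is_wide_build())
```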

The code in question was introduced in r81377 and replaces surrogate pairs with their UCS4 equivalent, converting UCS2 characters from the inferior process into UCS4 characters for use in the gdb process.  It appears to assume that the gdb executable was linked against a UCS4-build of python.

The attached patch (against release27-maint) tries to turn off this joining of surrogates, except in the case where the inferior is UCS2 and gdb is UCS4.
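To make the intended behaviour concrete, here is a rough sketch of the idea, not the patch itself. The function name and the two boolean flags are invented for illustration, and it works on plain lists of integer code units rather than on gdb values:

```python
def join_surrogates(code_units, inferior_is_narrow, host_is_wide):
    """Return a list of code points built from the inferior's code units.

    Illustrative only: surrogate pairs are combined solely when the
    inferior stores UCS2 code units and the host (gdb's Python) can
    represent code points above 0xFFFF.  Otherwise the units are
    passed through unchanged.
    """
    if not (inferior_is_narrow and host_is_wide):
        return list(code_units)

    out = []
    i = 0
    while i < len(code_units):
        ch = code_units[i]
        i += 1
        # A high surrogate followed by a low surrogate encodes one
        # code point in the range 0x10000..0x10FFFF.
        if 0xD800 <= ch < 0xDC00 and i < len(code_units):
            ch2 = code_units[i]
            if 0xDC00 <= ch2 < 0xE000:
                ch = 0x10000 + ((ch - 0xD800) << 10) + (ch2 - 0xDC00)
                i += 1
        out.append(ch)
    return out

# U+1D121 is stored as the pair 0xD834, 0xDD21 in a UCS2 inferior.
print([hex(c) for c in join_surrogates([0xD834, 0xDD21], True, True)])
print([hex(c) for c in join_surrogates([0xD834, 0xDD21], True, False)])
```

With a UCS2 gdb the pair is left as two code units, so each value stays within what unichr() accepts on a narrow build.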

Tested successfully with:
 - inferior:UCS2  gdb:UCS4
 - inferior:UCS4  gdb:UCS4

Not yet tested with a UCS2 gdb.

Antoine, does the above sound sane?  Can you test this on your UCS2 gdb please?
History
Date                 User      Action  Args
2010-07-07 18:24:04  dmalcolm  set     recipients: + dmalcolm, pitrou
2010-07-07 18:24:04  dmalcolm  set     messageid: <1278527044.12.0.754366993337.issue9188@psf.upfronthosting.co.za>
2010-07-07 18:24:01  dmalcolm  link    issue9188 messages
2010-07-07 18:23:59  dmalcolm  create