
Title: test_gdb: use utf8+surrogateescape charset?
Type:
Stage:
Components: Tests
Versions: Python 3.2
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: dmalcolm, loewis, vstinner
Priority: normal
Keywords: patch

Created on 2010-04-22 00:43 by vstinner, last changed 2022-04-11 14:57 by admin. This issue is now closed.

File name Uploaded Description
gdb_bug.txt vstinner, 2010-04-22 00:43
test_gdb_surrogates.patch vstinner, 2010-04-22 00:44
Messages (4)
msg103929 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-04-22 00:43
Because of a strange bug, gdb writes random bytes to stdout. test_gdb decodes output as utf8, but these random bytes cause a UnicodeDecodeError:

ERROR: test_int (__main__.PrettyPrintTests)
Verify the pretty-printing of various "int"/long values
Traceback (most recent call last):
  File "Lib/test/", line 188, in test_int
  File "Lib/test/", line 176, in assertGdbRepr
  File "Lib/test/", line 144, in get_gdb_repr
  File "Lib/test/", line 120, in get_stack_trace
    out, err = self.run_gdb(*args)
  File "Lib/test/", line 62, in run_gdb
    return out.decode('utf-8'), err.decode('utf-8')
UnicodeDecodeError: 'utf8' codec can't decode bytes in position 1882-1887: unsupported Unicode code range

surrogateescape should be used to escape the invalid sequences using surrogates.


See attached file for the strange gdb bug.

command is the byte string "id(1000000000000)\n\0" (19 bytes, strlen=18), but gdb prints bytes after the \0. Stranger still: print (*command)@15 also prints these random bytes, whereas print (*command)@14 doesn't.
msg103930 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-04-22 00:45
py3k tests pass on Debian Sid (gdb 7.1) without the patch, and pass on Ubuntu 9.10 (gdb 7.0) with the patch.
msg103938 - Author: Martin v. Löwis (loewis) * (Python committer) Date: 2010-04-22 05:20
I think the "replace" handler would be more appropriate here.
msg104048 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-04-23 20:35
@loewis: I think the "replace" handler would be more appropriate here.

Right. Fixed by r80416 (py3k, blocked in 3.1: r80417).
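
For illustration, here is a minimal sketch of how the three error handlers discussed in this issue behave on bytes that are not valid UTF-8. It is not the code from r80416. The sample bytes in raw and the helper name decode_gdb_output are hypothetical, chosen only to mimic gdb output followed by stray bytes.

```python
# Hypothetical stand-in for gdb output followed by bytes that are not valid UTF-8.
raw = b"$1 = 1000000000000\n\xff\xfe"


def decode_gdb_output(data, errors="replace"):
    """Decode bytes as UTF-8 using the given error handler (hypothetical helper)."""
    return data.decode("utf-8", errors)


# "strict" (the default handler) raises on the first invalid byte.
try:
    decode_gdb_output(raw, "strict")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# "surrogateescape" maps each undecodable byte 0xNN to the lone surrogate U+DCNN,
# so the original bytes can be recovered by encoding with the same handler.
escaped = decode_gdb_output(raw, "surrogateescape")
roundtrip = escaped.encode("utf-8", "surrogateescape")

# "replace" substitutes U+FFFD for undecodable bytes; the original bytes are lost,
# but the result is an ordinary string that can be printed and compared safely.
replaced = decode_gdb_output(raw, "replace")

print(strict_failed)
print(ascii(escaped))
print(ascii(replaced))
```

Since a test only needs to search the decoded text for the expected repr and never has to turn it back into bytes, the lossy "replace" handler is enough, and it avoids lone surrogates that could fail again when the text is printed.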
Date                 User      Action  Args
2022-04-11 14:57:00  admin     set     github: 52741
2010-04-23 20:35:38  vstinner  set     status: open -> closed
                                       resolution: fixed
                                       messages: + msg104048
2010-04-22 05:20:08  loewis    set     messages: + msg103938
2010-04-22 00:45:52  vstinner  set     nosy: + loewis, dmalcolm
                                       messages: + msg103930
2010-04-22 00:44:06  vstinner  set     files: + test_gdb_surrogates.patch
                                       keywords: + patch
2010-04-22 00:43:06  vstinner  create