classification
Title: test_gdb fails in s390x SLES buildbots
Type: behavior Stage: resolved
Components: Tests Versions: Python 3.8, Python 3.7, Python 3.6
process
Status: closed Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: miss-islington, pablogsal, vstinner, xtreak
Priority: normal Keywords: patch

Created on 2018-06-30 00:24 by pablogsal, last changed 2018-09-07 22:10 by pablogsal. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 9018 merged pablogsal, 2018-08-31 21:16
PR 9019 closed miss-islington, 2018-08-31 22:21
PR 9020 merged miss-islington, 2018-08-31 22:21
PR 9021 merged miss-islington, 2018-08-31 22:21
PR 9022 merged pablogsal, 2018-08-31 22:53
Messages (14)
msg320747 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2018-06-30 00:24
test_gdb is currently failing on the following buildbots:

http://buildbot.python.org/all/#/builders/54/builds/465
http://buildbot.python.org/all/#/builders/66/builds/178
http://buildbot.python.org/all/#/builders/16/builds/1279
http://buildbot.python.org/all/#/builders/122/builds/481

Sample error lines:


======================================================================
FAIL: test_threads (test.test_gdb.PyBtTests)
Verify that "py-bt" indicates threads that are waiting for the GIL
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/dje/cpython-buildarea/3.7.edelsohn-sles-z/build/Lib/test/test_gdb.py", line 776, in test_threads
    cmds_after_breakpoint=['thread apply all py-bt'])
  File "/home/dje/cpython-buildarea/3.7.edelsohn-sles-z/build/Lib/test/test_gdb.py", line 213, in get_stack_trace
    self.assertEqual(unexpected_errlines, [])
AssertionError: Lists differ: ["Python Exception <class 'gdb.error'> PC [58 chars]ved'] != []
First list contains 2 additional elements.
First extra element 0:
"Python Exception <class 'gdb.error'> PC not saved: "
+ []
- ["Python Exception <class 'gdb.error'> PC not saved: ",
-  'Error occurred in Python command: PC not saved']
----------------------------------------------------------------------
Ran 46 tests in 19.888s
FAILED (failures=1)
1 test failed again:
    test_gdb
msg320766 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-06-30 10:10
In gdb source code, frame_unwind_pc(), I see:

else if (this_frame->prev_pc.status == CC_NOT_SAVED)
    throw_error (OPTIMIZED_OUT_ERROR, _("PC not saved"));
msg321158 - (view) Author: Karthikeyan Singaravelan (xtreak) * (Python committer) Date: 2018-07-06 08:24
All the builds fail consistently after http://buildbot.python.org/all/#/builders/54/builds/465 and things were passing before 465. Maybe is this something to do with the commit that happened there that we could revert and test it in the machine by doing a git-bisect? But I couldn't see anything on GDB related code being changed in Python on #465. My initial hunch was that it had something to do with GDB 8.0+ but Ubuntu 18.04 LTS also runs on 8.0+ fine.

Tests pass : 00e05242ae650bc29e8052a5fb79cbb83224a657
Tests fail : d6a283b37b66ecd2d0ff43602ddc8e91b54a51c5

➜  cpython git:(master) git log 00e05242ae650bc29e8052a5fb79cbb83224a657~1..d6a283b37b66ecd2d0ff43602ddc8e91b54a51c5

commit d6a283b37b66ecd2d0ff43602ddc8e91b54a51c5
Author: Zackery Spytz <zspytz@gmail.com>
Date:   Fri Jun 29 13:30:07 2018 -0600

    [3.6] bpo-25862: Fix assertion failures in io.TextIOWrapper.tell(). (GH-3918). (GH-8012)

    (cherry picked from commit 23db935bcf258657682e66464bf8512def8af830)

    Co-authored-by: Zackery Spytz <zspytz@gmail.com>

commit a73027918a659ca69480a7e023d3838e2601ce6d
Author: Miss Islington (bot) <31488909+miss-islington@users.noreply.github.com>
Date:   Fri Jun 29 03:17:43 2018 -0700

    Fix the indentation in the documentation of bin() and hash() (GH-7998)

    (cherry picked from commit bda9c3eae3ad16e43145599ac6359bfdcaf1fd4a)

    Co-authored-by: Andrés Delfino <adelfino@gmail.com>

commit 00e05242ae650bc29e8052a5fb79cbb83224a657
Author: Miss Islington (bot) <31488909+miss-islington@users.noreply.github.com>
Date:   Thu Jun 28 22:32:16 2018 -0700

    bpo-14117: Make minor tweaks to turtledemo (GH-8002)


    The 'wikipedia' example is now 'rosette', describing what it draws.
    The 'penrose' print output is reduced.  The 'tree' '1024'
    output is eliminated.
    (cherry picked from commit 891a1f86d415779cf67ca23e626a868e586feb05)

    Co-authored-by: Terry Jan Reedy <tjreedy@udel.edu>
msg322805 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2018-07-31 18:08
New failure on s390x SLES 3.7: 

https://buildbot.python.org/all/#/builders/122/builds/540
msg323808 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2018-08-20 19:46
This failure seems related to the issue:

https://buildbot.python.org/all/#/builders/139/builds/74

=====================================================================
FAIL: test_threads (test.test_gdb.PyBtTests)
Verify that "py-bt" indicates threads that are waiting for the GIL
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/srv/buildbot/buildarea/2.7.bolen-ubuntu/build/Lib/test/test_gdb.py", line 808, in test_threads
    self.assertIn('Waiting for the GIL', gdb_output)
test test_gdb failed -- Traceback (most recent call last):
  File "/srv/buildbot/buildarea/2.7.bolen-ubuntu/build/Lib/test/test_gdb.py", line 808, in test_threads
    self.assertIn('Waiting for the GIL', gdb_output)
AssertionError: 'Waiting for the GIL' not found in 'Breakpoint 1 (PyObject_Print) pending.\n.....
msg324194 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2018-08-27 20:21
I requested a SLES VM in https://linuxone20.cloud.marist.edu as indicated by @David Edelsohn (the builtbot maintainer) and I can reproduce the issue on master, 3.7, 3.6 and 3.5. So it seems that a change in gdb itself caused this. Interestingly, in Python 3.5 the error is different:

======================================================================
FAIL: test_threads (test.test_gdb.PyBtTests)
Verify that "py-bt" indicates threads that are waiting for the GIL
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/linux1/cpython/Lib/test/test_gdb.py", line 771, in test_threads
    self.assertIn('Waiting for the GIL', gdb_output)
AssertionError: 'Waiting for the GIL' not found in 'Breakpoint 1 at 0x11751d8: file Python/bltinmodule.c, line 1093.\n[Thread debugging using libthread_db enabled]\nUsing host libthread_db library "/lib64/libthread_db.so.1".\n[New Thread 0x3fffd2ff910 (LWP 33928)]\n[New Thread 0x3fffcaff910 (LWP 33929)]\n'

----------------------------------------------------------------------
Ran 45 tests in 19.522s
msg324200 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2018-08-27 21:48
I think I understand what is happening. The SLES buildbots do not have debugging symbols for glibc:

(gdb) set verbose on
(gdb) break builtin_id
Reading in symbols for Python/bltinmodule.c...done.
Breakpoint 1 at 0x139403c: file Python/bltinmodule.c, line 1182.
(gdb) r
Starting program: /home/linux1/cpython/python tester.py
Reading symbols from /lib/ld64.so.1...(no debugging symbols found)...done.
Reading symbols from system-supplied DSO at 0x3fffdffe000...(no debugging symbols found)...done.
Missing separate debuginfos, use: zypper install glibc-debuginfo-2.22-62.13.2.s390x
Reading symbols from /lib64/libpthread.so.0...(no debugging symbols found)...done.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Reading symbols from /lib64/libdl.so.2...(no debugging symbols found)...done.
Reading symbols from /lib64/libutil.so.1...(no debugging symbols found)...done.
Reading symbols from /lib64/libm.so.6...(no debugging symbols found)...done.
Reading symbols from /lib64/libc.so.6...(no debugging symbols found)...done.

And when requesting the stacktrace of a thread it cannot go into libc because the program counter is not available:

(gdb) thread apply 2 where

#19 0x0000000001324f02 in method_call (method=<method at remote 0x3fffdeee258>, args=(), kwargs=0x0) at Objects/classobject.c:306
#20 0x0000000001047396 in PyObject_Call (callable=<method at remote 0x3fffdeee258>, args=(), kwargs=0x0) at Objects/call.c:245
#21 0x00000000012b91be in t_bootstrap (boot_raw=0x3fffd9fb118) at ./Modules/_threadmodule.c:992
#22 0x000003fffde88b22 in start_thread () from /lib64/libpthread.so.0
#23 0x000003fffdb7164a in thread_start () from /lib64/libc.so.6
PC not saved

This get triggered when printing the python stack trace with py-bt and the exception is getting printed on stderr and this fails the test.
msg324204 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2018-08-27 23:03
I need to do some more experimenting because those symbols are not available on other systems with the test passing. Something is corrupting/not saving the program counter but I do not know what.
msg324205 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2018-08-27 23:08
Walking the stack up one by one only triggers the "PC not saved" when the stack goes into libc:

(gdb) break builtin_id
Breakpoint 1 at 0x139403c: file Python/bltinmodule.c, line 1182.
(gdb) r
Starting program: /home/linux1/cpython/python tester.py
Missing separate debuginfos, use: zypper install glibc-debuginfo-2.22-62.13.2.s390x
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
[New Thread 0x3fffd2ff910 (LWP 9951)]
[New Thread 0x3fffcaff910 (LWP 9952)]
[New Thread 0x3fff7fff910 (LWP 9953)]
[New Thread 0x3fff77ff910 (LWP 9954)]

Thread 1 "python" hit Breakpoint 1, builtin_id (self=0x3fffdf5c0c8, v=0x14f7690 <small_ints+2256>) at Python/bltinmodule.c:1182
1182        return PyLong_FromVoidPtr(v);
(gdb) thread 2
[Switching to thread 2 (Thread 0x3fffd2ff910 (LWP 9951))]
#0  0x000003fffde8da0a in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
(gdb) up
#1  0x0000000001186a2e in PyCOND_TIMEDWAIT (cond=0x15139b0 <_PyRuntime+1192>, mut=0x15139e0 <_PyRuntime+1240>, us=5000) at Python/condvar.h:90
90          r = pthread_cond_timedwait((cond), (mut), &ts);
(gdb)
#2  0x0000000001186f3a in take_gil (tstate=0x1538620) at Python/ceval_gil.h:208
208             COND_TIMED_WAIT(_PyRuntime.ceval.gil.cond, _PyRuntime.ceval.gil.mutex,
(gdb)
#3  0x00000000011884a6 in _PyEval_EvalFrameDefault (f=0x3fffd844d00, throwflag=0) at Python/ceval.c:980
980                     take_gil(tstate);
(gdb)
#4  0x0000000001187db8 in PyEval_EvalFrameEx (f=0x3fffd844d00, throwflag=0) at Python/ceval.c:536
536         return interp->eval_frame(f, throwflag);
(gdb)
#5  0x0000000001047596 in function_code_fastcall (co=0x3fffdd76640, args=0x3fff8000a98, nargs=1, globals=0x3fffddcff30) at Objects/call.c:283
283         result = PyEval_EvalFrameEx(f,0);
(gdb)
#6  0x0000000001048144 in _PyFunction_FastCallKeywords (func=0x3fffd7d8680, stack=0x3fff8000a90, nargs=1, kwnames=0x0) at Objects/call.c:408
408                 return function_code_fastcall(co, stack, nargs, globals);
(gdb)
#7  0x00000000011a48b6 in call_function (pp_stack=0x3fffd2fb5f8, oparg=1, kwnames=0x0) at Python/ceval.c:4623
4623                x = _PyFunction_FastCallKeywords(func, stack, nargs, kwnames);
(gdb)
#8  0x000000000119c088 in _PyEval_EvalFrameDefault (f=0x3fff80008f0, throwflag=0) at Python/ceval.c:3186
3186                    res = call_function(&sp, oparg + 1, NULL);
(gdb)
#9  0x0000000001187db8 in PyEval_EvalFrameEx (f=0x3fff80008f0, throwflag=0) at Python/ceval.c:536
536         return interp->eval_frame(f, throwflag);
(gdb)
#10 0x0000000001047596 in function_code_fastcall (co=0x3fffda6e580, args=0x3fffd87e980, nargs=1, globals=0x3fffdd717c0) at Objects/call.c:283
283         result = PyEval_EvalFrameEx(f,0);
(gdb)
#11 0x0000000001048144 in _PyFunction_FastCallKeywords (func=0x3fffd7d61b0, stack=0x3fffd87e978, nargs=1, kwnames=0x0) at Objects/call.c:408
408                 return function_code_fastcall(co, stack, nargs, globals);
(gdb)
#12 0x00000000011a48b6 in call_function (pp_stack=0x3fffd2fd1b8, oparg=1, kwnames=0x0) at Python/ceval.c:4623
4623                x = _PyFunction_FastCallKeywords(func, stack, nargs, kwnames);
(gdb)
#13 0x000000000119c088 in _PyEval_EvalFrameDefault (f=0x3fffd87e7f0, throwflag=0) at Python/ceval.c:3186
3186                    res = call_function(&sp, oparg + 1, NULL);
(gdb)
#14 0x0000000001187db8 in PyEval_EvalFrameEx (f=0x3fffd87e7f0, throwflag=0) at Python/ceval.c:536
536         return interp->eval_frame(f, throwflag);
(gdb)
#15 0x0000000001047596 in function_code_fastcall (co=0x3fffda6e340, args=0x3fffd2fed28, nargs=1, globals=0x3fffdd717c0) at Objects/call.c:283
283         result = PyEval_EvalFrameEx(f,0);
(gdb)
#16 0x0000000001047a2e in _PyFunction_FastCallDict (func=0x3fffd7d3f70, args=0x3fffd2fed20, nargs=1, kwargs=0x0) at Objects/call.c:322
322                 return function_code_fastcall(co, args, nargs, globals);
(gdb)
#17 0x00000000010467d6 in _PyObject_FastCallDict (callable=0x3fffd7d3f70, args=0x3fffd2fed20, nargs=1, kwargs=0x0) at Objects/call.c:98
98              return _PyFunction_FastCallDict(callable, args, nargs, kwargs);
(gdb)
#18 0x0000000001049c5c in _PyObject_Call_Prepend (callable=0x3fffd7d3f70, obj=0x3fffdc70650, args=0x3fffdfb9050, kwargs=0x0)
    at Objects/call.c:904
904         result = _PyObject_FastCallDict(callable,
(gdb)
#19 0x0000000001324f02 in method_call (method=0x3fffdeee258, args=0x3fffdfb9050, kwargs=0x0) at Objects/classobject.c:306
306         return _PyObject_Call_Prepend(func, self, args, kwargs);
(gdb)
#20 0x0000000001047396 in PyObject_Call (callable=0x3fffdeee258, args=0x3fffdfb9050, kwargs=0x0) at Objects/call.c:245
245             result = (*call)(callable, args, kwargs);
(gdb)
#21 0x00000000012b91be in t_bootstrap (boot_raw=0x3fffd9fb118) at ./Modules/_threadmodule.c:992
992         res = PyObject_Call(boot->func, boot->args, boot->keyw);
(gdb)
#22 0x000003fffde88b22 in start_thread () from /lib64/libpthread.so.0
(gdb)
#23 0x000003fffdb7164a in thread_start () from /lib64/libc.so.6
(gdb)
PC not saved
msg324350 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2018-08-29 21:22
Interestingly, everything points to something special in the glibc version of the buildbot. Usually the end of the stack for threads is:

(gdb)
#22 0x00007ffff7f82a9d in start_thread () from /usr/lib/libpthread.so.0
(gdb)
#23 0x00007ffff7d23a43 in clone () from /usr/lib/libc.so.6

but in this buildbot we get:

#24 0x000003fffde88b22 in start_thread () from /lib64/libpthread.so.0
(gdb)
#25 0x000003fffdb7164a in thread_start () from /lib64/libc.so.6
(gdb)
PC not saved

I think that the combination of glibc + gdb is producing this problem probably because the optimization level of glibc.
msg324446 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2018-08-31 22:04
New changeset f2ef51f8bec525b21e52988880c8a029642795ed by Pablo Galindo in branch 'master':
bpo-34007: Skip traceback tests if the Program Counter is not available. (GH-9018)
https://github.com/python/cpython/commit/f2ef51f8bec525b21e52988880c8a029642795ed
msg324450 - (view) Author: miss-islington (miss-islington) Date: 2018-08-31 22:43
New changeset 5d594f3106aff6cea00234c88051427ae511cdd8 by Miss Islington (bot) in branch '2.7':
bpo-34007: Skip traceback tests if the Program Counter is not available. (GH-9018)
https://github.com/python/cpython/commit/5d594f3106aff6cea00234c88051427ae511cdd8
msg324451 - (view) Author: miss-islington (miss-islington) Date: 2018-08-31 22:47
New changeset 4da71814b327cb2ad47a01710360cd21ba636352 by Miss Islington (bot) in branch '3.7':
bpo-34007: Skip traceback tests if the Program Counter is not available. (GH-9018)
https://github.com/python/cpython/commit/4da71814b327cb2ad47a01710360cd21ba636352
msg324807 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2018-09-07 22:10
New changeset 963bcc8b71e4ab8c9ee9a91ed1300b6e39219821 by Pablo Galindo in branch '3.6':
[3.6] bpo-34007: Skip traceback tests if the Program Counter is not available. (GH-9022)
https://github.com/python/cpython/commit/963bcc8b71e4ab8c9ee9a91ed1300b6e39219821
History
Date User Action Args
2018-09-07 22:10:31pablogsalsetmessages: + msg324807
2018-08-31 22:53:50pablogsalsetpull_requests: + pull_request8490
2018-08-31 22:47:24miss-islingtonsetmessages: + msg324451
2018-08-31 22:43:30miss-islingtonsetnosy: + miss-islington
messages: + msg324450
2018-08-31 22:23:30pablogsalsetstatus: open -> closed
stage: patch review -> resolved
2018-08-31 22:21:50miss-islingtonsetpull_requests: + pull_request8489
2018-08-31 22:21:44miss-islingtonsetpull_requests: + pull_request8488
2018-08-31 22:21:38miss-islingtonsetpull_requests: + pull_request8487
2018-08-31 22:04:50pablogsalsetmessages: + msg324446
2018-08-31 21:16:09pablogsalsetkeywords: + patch
stage: patch review
pull_requests: + pull_request8486
2018-08-29 21:22:58pablogsalsetmessages: + msg324350
2018-08-27 23:08:56pablogsalsetmessages: + msg324205
2018-08-27 23:03:41pablogsalsetmessages: + msg324204
2018-08-27 21:48:40pablogsalsetmessages: + msg324200
2018-08-27 20:21:23pablogsalsetmessages: + msg324194
2018-08-20 19:46:20pablogsalsetmessages: + msg323808
2018-07-31 18:08:08pablogsalsetmessages: + msg322805
2018-07-06 08:24:16xtreaksetnosy: + xtreak
messages: + msg321158
2018-06-30 10:10:47vstinnersetnosy: + vstinner
messages: + msg320766
2018-06-30 00:24:59pablogsalcreate