classification
Title: curses crash on FreeBSD
Type: behavior Stage: resolved
Components: Extension Modules Versions: Python 3.1, Python 3.2, Python 2.7, Python 2.6
process
Status: closed Resolution: accepted
Dependencies: Superseder:
Assigned To: Nosy List: Arfrever, akuchling, asmodai, mark.dickinson, r.david.murray, rpetrov, skrah, vstinner
Priority: normal Keywords: buildbot, patch

Created on 2009-11-23 20:03 by mark.dickinson, last changed 2010-07-17 12:41 by skrah. This issue is now closed.

Files
File name Uploaded Description
freebsd-curses.diff akuchling, 2010-04-15 21:57 Possible fix
issue7384.patch skrah, 2010-04-17 20:19
issue7384-2.patch skrah, 2010-04-18 15:18
issue7384-3-py3k.patch skrah, 2010-04-21 12:16
issue7384-4-py3k.patch skrah, 2010-04-23 09:34
issue7384-5-py3k.patch skrah, 2010-04-24 09:24
issue7384-5-trunk.patch skrah, 2010-06-03 11:21
ldd-retval-py3k.patch skrah, 2010-07-14 12:05
Messages (52)
msg95652 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-11-23 20:03
test_curses is currently causing the test runs to abort on the FreeBSD 6.4 
and 7.2 buildbots.

I can reproduce this on a FreeBSD 7.2/amd64 machine by doing

./python Lib/test/regrtest.py -uall test___all__ test_curses

This dumps core, and the traceback points at the call to delwin() in 
PyCursesWindow_Dealloc, but it's far from obvious (to me) what's going 
wrong.  wo->win is not NULL here, and appears to point to a valid WINDOW.  
However, stdscr is NULL!  As far as I can tell, this shouldn't happen.

test_curses by itself doesn't crash, unless I add an 'import readline' or 
'import rlcompleter' to the top of test_curses.py.

I expect to have access to the FreeBSD machine for a few more days.  Any 
hints about what to try next would be appreciated.
msg97709 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-01-13 12:21
I've not had any success tracking the cause of this failure down, and no longer have the resources to do so.  It does appear that curses itself is broken on FreeBSD:  it's not just a problem with the tests.

Adding Andrew Kuchling to the nosy in case he has any ideas what's wrong here.

Since the test_curses crash is currently aborting the test run, and so preventing us from getting feedback from the other tests on the FreeBSD buildbots, I propose that test_curses be skipped with a "the curses module is broken on FreeBSD" message.
msg97722 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010-01-13 14:59
Given your diagnosis so far, +1 on the skip.
msg99657 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-02-21 13:33
> It does appear that curses itself is broken on FreeBSD

Rereading this, it doesn't say what I meant it to say:  I meant that the Python curses module seems to be broken, not that the system-level curses library is broken (though that seems possible too).
msg99658 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-02-21 13:40
Applied the test_curses skip in r78281 (trunk);  will merge to the other branches.

Leaving this issue open, since the root cause isn't fixed.
msg99659 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-02-21 13:45
Merged to the other 3 branches in revisions r78282 (release26-maint),  r78283 (py3k), r78284 (release31-maint).
msg103231 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-04-15 16:51
I'm looking at this again, after installing FreeBSD 8.0/amd64 in a VM.

I've reduced Lib/test/test_curses.py to the following 9 lines:

import rlcompleter
import curses
f = open('mytempfile', 'w+b')
stdscr = curses.initscr()
stdscr.putwin(f)
f.seek(0)
curses.getwin(f)
f.close()
curses.endwin()

I then get:

$ ./python Lib/test/regrtest.py test_curses
test_curses
Bus error (core dumped)

From looking at the core dump, and tracing through with gdb, the core dump  occurs when delwin is called (from PyCursesWindow_Dealloc) on the result of curses.getwin(f), as a result of garbage collection.

The 'import rlcompleter' line appears to be necessary to cause this;  I've no idea why.
msg103256 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-04-15 21:03
Here's the top of the backtrace.  (Thanks asmodai for helping me out with working out how to build a FreeBSD system ncurses with debugging information.)

#0  0x0000000801460714 in cannot_delete (win=0x80116b1d0)
    at /usr/src/lib/ncurses/ncursesw/../../../contrib/ncurses/ncurses/base/lib_delwin.c:54
        p = (struct _win_list *) 0xdbdbdbdbdbdbdbdb
        result = false
#1  0x0000000801460773 in delwin (win=0x80116b1d0)
    at /usr/src/lib/ncurses/ncursesw/../../../contrib/ncurses/ncurses/base/lib_delwin.c:71
        result = -1
#2  0x000000080170d140 in PyCursesWindow_Dealloc (wo=0x800eb74c0)
    at /usr/home/dickinsm/python/svn/trunk/Modules/_cursesmodule.c:357
No locals.
#3  0x000000000046325f in _Py_Dealloc (op=0x800eb74c0) at Objects/object.c:2211
        dealloc = 0x80170d110 <PyCursesWindow_Dealloc>
#4  0x00000000004578d8 in PyDict_DelItem (op=0x800f121b0, key=0x8011062e0)
    at Objects/dictobject.c:829
        mp = (PyDictObject *) 0x800f121b0
        hash = -3668919459648339544
        ep = (PyDictEntry *) 0x8010cb5a8
        old_value = (PyObject *) 0x800eb74c0
        old_key = (PyObject *) 0x8011062e0
        __func__ = "PyDict_DelItem"
#5  0x0000000000458a48 in dict_ass_sub (mp=0x800f121b0, v=0x8011062e0, w=0x0)
    at Objects/dictobject.c:1184
No locals.
#6  0x000000000041aadd in PyObject_DelItem (o=0x800f121b0, key=0x8011062e0)
    at Objects/abstract.c:205
        m = (PyMappingMethods *) 0x6c2960
msg103261 - (view) Author: A.M. Kuchling (akuchling) * (Python committer) Date: 2010-04-15 21:49
Could I get a login on the buildbot to make a fix?

I bet the problem is with the stdscr object.  PyCurses_InitScr()
does 'return (PyObject *)PyCursesWindow_New(stdscr);'.

PyCursesWindow_Dealloc() does:
  if (wo->win != stdscr) delwin(wo->win);

I bet FreeBSD is clearing contents of the stdscr global variable.  The condition in PyCursesWindow_Dealloc() is then true, and it tries to delwin() the old value, which is in wo->win.

One fix might be to keep a reference to that PyCursesWindow object holding stdscr, and change dealloc to 'if (wo != saved_stdscr_object)'.  Or maybe, since multiple calls to initscr() will create multiple window objects holding the value of stdscr, window objects should have a 'do_not_delwin' flag.
msg103263 - (view) Author: A.M. Kuchling (akuchling) * (Python committer) Date: 2010-04-15 21:57
Here's a possible patch; it at least doesn't seem to break the module on MacOS, though MacOS doesn't crash with the current code either.
msg103264 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-04-15 21:58
> Could I get a login on the buildbot to make a fix?

I think David Bolen (db3l) is the maintainer.  David?
msg103265 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-04-15 22:00
> Here's a possible patch

Thanks.  I'll give it a try on my FreeBSD VM and report back.
BTW, did you mean to include the threading change in that patch?
msg103267 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-04-15 22:20
With that patch, I'm still getting the core dump (with the traceback looking pretty much as it did before).

When I traced through this with gdb, I didn't see stdscr getting set to 0 at any point.  Unless I missed any, the only curses library calls made (in sequence) were:

1. initscr() -> new window win  (=stdscr, presumably)
2. putwin(file, win)
3. getwin(file) -> new window win2, with win2 != win
4. delwin(win2) -> segfault
---
and presumably without the segfault, there would have been calls
to delwin(win) and endwin() too.


And I'm at a complete loss to explain why importing rlcompleter makes a difference.  (importing readline also causes the segfault).  I don't think it's just to do with random memory changes, since if I replace the readline or rlcompleter import by any other randomly chosen python module then there's no segfault.
msg103295 - (view) Author: Jeroen Ruigrok van der Werven (asmodai) * (Python committer) Date: 2010-04-16 06:09
For the record, this happens on FreeBSD 8 as well.

It seems it is still the same bug as what I reported back in March 2009 on the Python-dev list.

If you run the test stand-alone with ./python Lib/test/regrtest.py -uall test_curses it passes and prints "1 test OK".

If you add something like test___all__ before it, it will crash with a SIGSEGV: segmentation fault (core dumped).

Mark's condensed test case switches to a SIGBUS, which is a bit different.

Mark, did your initial backtrace look like this:

#0  0x282e115e in memcpy () from /lib/libc.so.7
#1  0x282de375 in fwrite () from /lib/libc.so.7
#2  0x282de132 in fwrite () from /lib/libc.so.7
#3  0x28b7a1ca in putwin (win=0x28409640, filep=0x282f39f8)
    at /newusr/src/lib/ncurses/ncursesw/../../../contrib/ncurses/ncurses/base/lib_screen.c:132
#4  0x28d9b361 in PyCursesWindow_PutWin (self=0x28442ef0, args=0x2867f80c)
    at /home/asmodai/projects/python/Modules/_cursesmodule.c:1351
#5  0x080da60d in PyEval_EvalFrameEx (f=0x296d760c, throwflag=0)
    at Python/ceval.c:4013
#6  0x080db10e in PyEval_EvalFrameEx (f=0x296a948c, throwflag=0)
    at Python/ceval.c:4099
#7  0x080db10e in PyEval_EvalFrameEx (f=0x29692d8c, throwflag=0)
    at Python/ceval.c:4099
#8  0x080dc68b in PyEval_EvalCodeEx (co=0x297675c0, globals=0x2866bbdc,
    locals=0x2866bbdc, args=0x0, argcount=0, kws=0x0, kwcount=0, defs=0x0,
    defcount=0, closure=0x0) at Python/ceval.c:3253
#9  0x080dc7d7 in PyEval_EvalCode (co=0x297675c0, globals=0x2866bbdc,
    locals=0x2866bbdc) at Python/ceval.c:666
#10 0x080ef70c in PyImport_ExecCodeModuleEx (
    name=0xbfbfd683 "test.test_curses", co=0x297675c0,
    pathname=0xbfbfd223 "/home/asmodai/projects/python/Lib/test/test_curses.py")
msg103307 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-04-16 08:55
> Mark, did your initial backtrace look like this:

No;  the segfault was definitely happening in delwin rather than putwin.  But I did see something like your backtrace when I tried to use ncurses from ports (installed in /usr/local) rather than the system ncurses.  This was all on FreeBSD 8.0/amd64, by the way, running in a VM on Parallels.  I got the same results both when working directly within the VM terminal, and when ssh'ing to the VM from an OS X Terminal.

Maybe running this through Valgrind or something similar might show what's going on.  (Though it's not clear from a quick google whether Valgrind works on FreeBSD.)
msg103308 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2010-04-16 09:37
Valgrind can be installed by:

cd /usr/ports/devel/valgrind && make install


Then you can do (curses_test.py is your short test program):

1) valgrind --db-attach=yes --suppressions=Misc/valgrind-python.supp ./python curses_test.py

2) valgrind --suppressions=Misc/valgrind-python.supp ./python curses_test.py


Valgrind finds invalid writes. The problem with 1) is that the
terminal is in an unusable state, so controlling gdb isn't possible.

The best thing is probably to use 2) and wade through the unformatted
output starting here:

 ==12043== Invalid write of size 8
 ==12043==    at 0x27A71B7: getwin (in /lib/libncursesw.so.8)
 ==12043==    by 0x2A3EAAB: PyCurses_GetWin (_cursesmodule.c:1902)
 ==12043==    by 0x4573FB: PyEval_EvalFrameEx (ceval.c:3833)
 ==12043==    by 0x457DF9: PyEval_EvalCodeEx (ceval.c:3282)


(I don't have time to do that right now, I might do it later.)
msg103393 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2010-04-17 09:37
One oddity: In Mark's test case, the error only shows if readline
is imported _before_ curses. The other way around it's fine.


On FreeBSD 8.0 amd64, with the _default_ libcurses, the Valgrind output
for py3k looks like this:

[...]
==31089== Invalid write of size 8
==31089==    at 0x284F1AE: getwin (in /lib/libncursesw.so.8)
==31089==    by 0x2AE8532: PyCurses_GetWin (_cursesmodule.c:1903)
==31089==    by 0x47FBC7: call_function (ceval.c:3833)
==31089==    by 0x47AAC0: PyEval_EvalFrameEx (ceval.c:2645)
==31089==    by 0x47DF41: PyEval_EvalCodeEx (ceval.c:3282)
==31089==    by 0x47189F: PyEval_EvalCode (ceval.c:721)
==31089==    by 0x4B31AA: run_mod (pythonrun.c:1692)
==31089==    by 0x4B2FC3: PyRun_FileExFlags (pythonrun.c:1649)
==31089==    by 0x4B1734: PyRun_SimpleFileExFlags (pythonrun.c:1177)
==31089==    by 0x4B0C75: PyRun_AnyFileExFlags (pythonrun.c:963)
==31089==    by 0x4CB029: Py_Main (main.c:650)
==31089==    by 0x4150E4: main (python.c:152)
==31089==  Address 0x25c71e0 is 0 bytes after a block of size 112 alloc'd
==31089==    at 0x25A8AE: calloc (in /usr/local/lib/valgrind/vgpreload_memcheck-amd64-freebsd.so)
==31089==    by 0x29C518A: _nc_makenew (in /lib/libncurses.so.8)
==31089==    by 0x29C569F: newwin (in /lib/libncurses.so.8)
==31089==    by 0x284F2EE: getwin (in /lib/libncursesw.so.8)
==31089==    by 0x2AE8532: PyCurses_GetWin (_cursesmodule.c:1903)
==31089==    by 0x47FBC7: call_function (ceval.c:3833)
==31089==    by 0x47AAC0: PyEval_EvalFrameEx (ceval.c:2645)
==31089==    by 0x47DF41: PyEval_EvalCodeEx (ceval.c:3282)
==31089==    by 0x47189F: PyEval_EvalCode (ceval.c:721)
==31089==    by 0x4B31AA: run_mod (pythonrun.c:1692)
==31089==    by 0x4B2FC3: PyRun_FileExFlags (pythonrun.c:1649)
==31089==    by 0x4B1734: PyRun_SimpleFileExFlags (pythonrun.c:1177)
==31089==
[...]


Then I installed the curses from /usr/ports/devel/ncurses, and the
error didn't show up any more. I'm inclined to think that the bug is
in the system ncurses. Still, it would be nice to know why the import
order matters.
msg103394 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2010-04-17 09:45
I take that back. With the curses from /usr/ports/devel/ncurses,
Mark's test case is fine, but

  ./python Lib/test/regrtest.py -uall  test_curses

fails again.
msg103395 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2010-04-17 10:31
Alas, after installing curses from /usr/ports/devel/ncurses I did not
recompile Modules/_curses_panel.c.

So, after a proper build

  ./python Lib/test/regrtest.py -uall  test_curses

shows no errors.
msg103429 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2010-04-17 20:19
It seems that FreeBSD has problems with the fact that readline.so is
linked with -lreadline and -lncursesw (why?).

With issue7384.patch I get no more errors using either Mark's test case
or test_curses.py.
msg103432 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-04-17 21:12
That patch works for me, too.  Nice!

> It seems that FreeBSD has problems with the fact that readline.so is
> linked with -lreadline and -lncursesw (why?).

Good question...
msg103497 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2010-04-18 15:18
To clarify a couple of things:

On some systems (Redhat?) readline is not linked against ncurses in order to give the user the possibility to choose. This is why setup.py
has to select an ncurses version.

However, things can go wrong if readline is already linked against
a specific ncurses version. On FreeBSD-8.0 this version is ncurses,
but setup.py selects ncursesw:


stefan@freebsd-amd64:~> ldd /lib/libreadline.so.8 
/lib/libreadline.so.8:
        libncurses.so.8 => /lib/libncurses.so.8 (0x800b3e000)
        libc.so.7 => /lib/libc.so.7 (0x800648000)
stefan@freebsd-amd64:~> ls /lib/libncurses*
/lib/libncurses.so.8    /lib/libncursesw.so.8


issue7384.patch suppresses the selection, but is a little primitive.

I've created a new patch, which does the following:

  1) Detect if readline is already linked against ncurses and
     if so, skip any further selection. This must be done.

  2) Use the same version of ncurses for readline.so and _curses.so.


I'm not sure if 2) is necessary. With the previous patch, readline.so
was linked against ncurses and _curses.so against ncursesw. All tests
passed, though.



Any thoughts whether readline.so and _curses.so should link against
the same curses library?
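
For illustration, the detection in step 1) could look roughly like the minimal Python sketch below. It is not the code from the attached patch: the function name curses_lib_from_ldd is hypothetical, and the sample text is simply the ldd output for /lib/libreadline.so.8 quoted above.

```python
import re

def curses_lib_from_ldd(ldd_output):
    """Return 'ncursesw' or 'ncurses' if ldd output names one, else None."""
    for line in ldd_output.splitlines():
        # Dependency lines look like
        # "libncurses.so.8 => /lib/libncurses.so.8 (0x...)".
        # 'ncursesw' is tried first so the longer name is not cut short.
        match = re.match(r"\s*lib(ncursesw|ncurses)\.so", line)
        if match:
            return match.group(1)
    return None

# ldd output for readline on FreeBSD 8.0, as quoted in this message.
FREEBSD_LDD = """\
/lib/libreadline.so.8:
        libncurses.so.8 => /lib/libncurses.so.8 (0x800b3e000)
        libc.so.7 => /lib/libc.so.7 (0x800648000)
"""

print(curses_lib_from_ldd(FREEBSD_LDD))  # prints: ncurses
```

A result of None would mean readline is not linked against a curses library, in which case the usual selection can proceed.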
msg103503 - (view) Author: Jeroen Ruigrok van der Werven (asmodai) * (Python committer) Date: 2010-04-18 16:27
Just to state the obvious: ncursesw is needed for wide character support (i.e. Unicode).

Also, have you tried asking Thomas Dickey (dickey@invisible-island.net) about this? He might be able to give some clue about it since he's the main curses maintainer.
msg103828 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2010-04-21 12:16
Jeroen, thanks for the idea. I asked Thomas Dickey and he said that
one should not load both libncurses.so and libncursesw.so.

I think this means that if libreadline.so is already linked against
libncurses.so, we are stuck with libncurses.so for the curses module.


If this affects users who want the wide character version, they could
file a bug report with their distro:

Thomas Dickey pointed out that there are two ways for a distro to
deal with this problem:

  1) Link libreadline against ncursesw.

  2) Split out the termcap interface (which readline uses) as
     libtinfo. This is a configure option for ncurses and SuSE
     and Redhat are doing this.


I'm attaching a new patch against py3k that makes sure that the
readline and curses modules use the same curses library.

(This does not apply to Darwin, but I don't want to touch that logic.)
  

I'm going to test the patch on py3k-cdecimal to see if it works on
the buildbots.
msg103838 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-04-21 13:08
This patch looks good to me, assuming that the buildbots are happy.  I agree that this seems like a sensible solution for now, even if it means limiting users to ncurses rather than ncursesw.

I was initially a bit surprised that it works on OS X, since OS X doesn't have 'ldd';  but in that case the os.system call simply outputs "sh: ldd: command not found" to stderr and (presumably) nothing to stdout;  no Python exception is raised, so it's all okay.  It might be worth adding code to avoid the os.system('ldd ...') call on OS X, just to avoid the unnecessary error message on the console.  Apart from this, I say +1 to applying the patch.
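
A guard of that kind might be sketched as follows. This is only an illustration under assumptions, not what the patch does: the name ldd_lines and its tool parameter are hypothetical, and the sketch checks whether the command exists rather than testing for a particular platform.

```python
import shutil
import subprocess

def ldd_lines(library_path, tool="ldd"):
    """Return the tool's output lines, or [] if the tool is not installed."""
    # shutil.which() returns None when the command is not on PATH, so no
    # "command not found" message ever reaches the console.
    if shutil.which(tool) is None:
        return []
    proc = subprocess.run([tool, library_path],
                          stdout=subprocess.PIPE,
                          stderr=subprocess.DEVNULL)
    # Decode leniently; only library names are of interest here.
    return proc.stdout.decode("utf-8", errors="replace").splitlines()

# A command name that should not exist anywhere (hypothetical).
print(ldd_lines("/lib/libreadline.so.8", tool="no-such-tool-for-this-sketch"))
```

With a missing tool the caller sees an empty list, which is the same situation as "nothing on stdout" in the os.system case.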

Many thanks for all the detective work!
msg103980 - (view) Author: Roumen Petrov (rpetrov) * Date: 2010-04-22 21:42
Instead of testing in setup.py, we could use the result from the configure script: just uncomment the line and use it.
msg103996 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2010-04-23 09:34
Mark, thanks for reviewing the patch. In the new patch, I added a skip
for OS X.

Buildbot testing looks good. In particular, one FreeBSD bot passes
test_curses now (the other one is hanging in multiprocessing).

For most bots nothing changes. The solaris bot has the same unrelated
failures as before. Ubuntu sparc previously did the same weird linking
(readline already linked with ncurses, but using -lncursesw) and now
uses ncurses throughout. Tests pass. Debian sparc did the same, tests
give the same failures as before ("getmouse returned ERR", almost certainly
unrelated.)


Roumen, I do not see a line in configure.in that tests for the
libraries that readline is linked against.
msg103997 - (view) Author: Jeroen Ruigrok van der Werven (asmodai) * (Python committer) Date: 2010-04-23 10:02
I did some digging on my side: the fact that you see ncurses referenced from readline is due to the build linking readline against libtermcap:

cc  -fstack-protector -shared -Wl,-x  -o libreadline.so.8 -Wl,-soname,libreadline.so.8  `lorder readline.So vi_mode.So funmap.So keymaps.So parens.So search.So rltty.So complete.So bind.So isearch.So display.So signals.So util.So kill.So undo.So macro.So input.So callback.So terminal.So text.So nls.So misc.So compat.So xmalloc.So history.So histexpand.So histfile.So histsearch.So shell.So mbutil.So tilde.So | tsort -q` -ltermcap

And libtermcap is:

% ll /usr/lib/libtermcap.so*
0 lrwxr-xr-x  1 root  wheel  -   13B 18 apr 08:29 /usr/lib/libtermcap.so@ -> libncurses.so


That configuration option you referenced, Stefan, is that --with-termlib (generate separate terminfo library)?
msg104000 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2010-04-23 10:22
Yes, readline uses only the termcap part of ncurses. I think that
--with-termlib is the correct option, see:

http://www.mail-archive.com/util-linux-ng@vger.kernel.org/msg00273.html
msg104002 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2010-04-23 10:31
Actually this means that we should also look for -ltinfo in the ldd
check (A Redhat buildbot would be nice).
msg104054 - (view) Author: Roumen Petrov (rpetrov) * Date: 2010-04-23 21:40
> Roumen, I do not see a line in configure.in that tests for the
> libraries that readline is linked against.

The test in configure determines how to link an application against the readline libraries.

Platforms that support linking shared libraries with unresolved
symbols should not link readline against a termcap-compatible library if
they offer more than one. I think doing so is a bug in the package build
on those systems, as it prevents applications from using the other termcap libraries.

Not all Linux distributions link readline against a termcap-compatible library:
- SuSE (checked on 11.0): linked against ncurses :(
- Fedora (verified on version 12) and Slackware: not linked. So there was
no issue (before) on those platforms, as an application can link against any
termcap-compatible library and Python will select ncursesw. On those platforms I
expect Stefan's patch to return an empty string and Python to fail to build
the readline module.

Since configure detects how to link readline, we could uncomment
READLINE_LIBS, add it as a makefile macro, and use it from setup.py. If
READLINE_LIBS contains only -lreadline, then on this platform readline is
already linked against a termcap-compatible library.

Also, detecting dependent libraries with ldd is limited to
platforms that have this command, i.e. it is not portable.
If distutils supported a method that returns the dependent libraries, we
could use that.

I'm not familiar enough with the Python curses module to propose a patch.
Is it possible to run a sample program to detect the curses library that readline uses?

Or maybe try to link a sample "int main() { readline(); }" and ask the
compiler/linker to warn about duplicate symbols. Something like:
$ gcc -Wl,--warn-common test-readline.c -lreadline -lncursesw  -lncursesw
$ gcc -Wl,--warn-common test-readline.c -lreadline -ltermcap -lncurses
.../libncurses.so: warning: common of `ospeed' overridden by larger common
.../libtermcap.so: warning: larger common is here
$ gcc -Wl,--warn-common test-readline.c -lreadline -ltermcap -lncursesw
..../libncursesw.so: warning: common of `ospeed' overridden by larger common
..../../libtermcap.so: warning: larger common is here
FIXME: replace with a more portable and more correct command.

Roumen
msg104057 - (view) Author: Roumen Petrov (rpetrov) * Date: 2010-04-23 21:52
Stefan Krah wrote:
>
> Stefan Krah<stefan-usenet@bytereef.org>  added the comment:
>
> Actually this means that we should also look for -ltinfo in the ldd
> check (A Redhat buildbot would be nice).

Or maybe this means that we should add a test with -ltinfo to configure, and
if the readline link succeeds, then it is safe to link the Python curses module
with the first curses library found.

ldd: what about platforms without GNU libc?

Roumen
msg104070 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2010-04-24 09:24
I included the test for libtinfo in the latest patch. The patch is tested
on Fedora and correctly links the curses module with -lncursesw.

This means that the ldd method works on all buildbots, OpenBSD, OpenSolaris
and Fedora.
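
The selection rule this amounts to can be sketched as below. It is an illustration, not the code in issue7384-5-py3k.patch: choose_curses_library and its arguments are hypothetical names, and the fallback order (ncursesw, then ncurses, then curses) is an assumption.

```python
def choose_curses_library(readline_termcap_lib, available):
    """Pick the library to link the curses module against.

    readline_termcap_lib: 'ncursesw', 'ncurses', 'tinfo' or None, i.e. what
        readline itself is linked against (hypothetical input).
    available: set of curses library names found on the system.
    """
    # readline already pulls in a full curses library: reuse it, so that
    # libncurses and libncursesw are never loaded into the same process.
    if readline_termcap_lib in ("ncursesw", "ncurses"):
        return readline_termcap_lib
    # readline uses only libtinfo (or nothing): either curses library is
    # safe, so prefer the wide-character one.
    for candidate in ("ncursesw", "ncurses", "curses"):
        if candidate in available:
            return candidate
    return None

both = {"ncurses", "ncursesw"}
print(choose_curses_library("ncurses", both))  # FreeBSD 8.0 case: ncurses
print(choose_curses_library("tinfo", both))    # Fedora case: ncursesw
```

The first branch is what limits FreeBSD to ncurses; the second is why Fedora still gets -lncursesw.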
msg104071 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2010-04-24 09:33
I'm not against sorting things out in configure.in, but I'm not quite
sure that it will be more portable than ldd:

On FreeBSD (the problem system!) I can't get this to work:

[stefan@freebsd-i386 ~]$ echo 'int main() { readline(); }' > test_readline.c
[stefan@freebsd-i386 ~]$ gcc -Wl,--warn-common xxx.c -lreadline -ltermcap -lncurses -lncursesw
[stefan@freebsd-i386 ~]$ gcc -Wl,--warn-common xxx.c -lreadline -lncurses -lncursesw
[stefan@freebsd-i386 ~]$ gcc -Wl,--warn-common xxx.c -lreadline -lncursesw


On OpenSolaris with suncc, ld does not have -warn-common.
msg104074 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2010-04-24 11:19
Sigh. xxx.c == test_readline.c in the previous comment.
msg104283 - (view) Author: Roumen Petrov (rpetrov) * Date: 2010-04-26 22:20
Yes, I understand.
For the record, did gcc on FreeBSD warn if the library order is -lncursesw
  -lreadline?
Forget for

Also, I'm not able to write a C test case, similar to the Python one in msg103231 by
Mark Dickinson, that fails on a system where the readline library is not linked
against ncurses. The program always works and does not core dump (= bus error),
regardless of the order of the ncurses (with and without the w suffix) and readline
libraries.

So if there is no way to write a C test program that fails, I cannot see
another way to detect the issue except to parse the output of programs that
list library dependencies. Also, I expect this to fail for a static build
(--disable-shared).
I'm not sure that the readline library works well with static builds, but
this is another issue and my time machine has stopped working :) .

Writing a script that checks the platform (if it is FreeBSD or SuSE, link with a,
b, c; if the OS is XX, link with d, e, f) would work with both shared and static
builds, but it is not a reasonable solution :(

P.S. The issue with the readline library being linked against a termcap-compatible
library on systems that distribute more than one termcap-compatible library is about
10 years old.

Roumen
msg104302 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2010-04-27 09:17
Roumen Petrov <report@bugs.python.org> wrote:
> Yes, I understand.
> For the record, did gcc on FreeBSD warn if the library order is -lncursesw
>   -lreadline?

No.

> P.S. The issue with the readline library being linked against a termcap-compatible
> library on systems that distribute more than one termcap-compatible library is about
> 10 years old.

I didn't want to touch the termcap logic. There's potential for breakage,
and a real investigation would be time consuming.

(There's a needless warning on Tiger about /usr/lib/termcap that could
be fixed in another issue.)
msg104311 - (view) Author: Jeroen Ruigrok van der Werven (asmodai) * (Python committer) Date: 2010-04-27 11:04
Stefan, I was emailing with Rong-En Fan, a FreeBSD committer, about this issue and he asked:

"Basically, this is caused by

  a) our readline.so is linked against ncurses.so (via -ltermcap which is the same lib)
  b) wide-character enabled ncurses, ncursesw.so, is also loaded in the same process

To solve that, we need to have a separate termcap.so, do I understand the issue correctly?"

He also mentioned that "[a]nother more aggressive way is to make only ncursesw installed into the system which requires a recompilation of all ports that use ncurses (ncurses and ncursesw are source compatible, but in most cases they are binary compatible as long as application don't assume size of ncurses structures)."

Which I fully support, it's something that I did on DragonFly BSD a long time ago already (for all I can remember).

Your opinion?
msg104315 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2010-04-27 12:43
Jeroen Ruigrok van der Werven <report@bugs.python.org> wrote:
> Stefan, I was emailing with Rong-En Fan, a FreeBSD committer, about this issue and he asked:
>
> "Basically, this is caused by
>
>   a) our readline.so is linked against ncurses.so (via -ltermcap which is the same lib)
>   b) wide-character enabled ncurses, ncursesw.so, is also loaded in the same process
>
> To solve that, we need to have a separate termcap.so, do I understand the issue correctly?"

Yes, only that the separate termcap is called libtinfo.so. The approach of
splitting out libtinfo from ncurses (used by Fedora) is the most flexible
and allows the user to choose ncurses or ncursesw.

[stefan@fedora-amd64 ~]$ ldd /lib64/libreadline.so.6.0
        linux-vdso.so.1 =>  (0x00007fff725ff000)
        libtinfo.so.5 => /lib64/libtinfo.so.5 (0x00000036e4a00000)
        libc.so.6 => /lib64/libc.so.6 (0x00000036d9600000)
        /lib64/ld-linux-x86-64.so.2 (0x00000036d9200000)

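> He also mentioned that "[a]nother more aggressive way is to make only ncursesw installed into the system which requires a recompilation of all
> ports that use ncurses (ncurses and ncursesw are source compatible, but in most cases they are binary compatible as long as application don't
> assume size of ncurses structures)."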
>
> Which I fully support, it's something that I did on DragonFly BSD a long time ago already (for all I can remember).
>
> Your opinion?

I think the libtinfo approach is more flexible, and I'm not aware of any drawbacks.
So, for FreeBSD, I'd use it.

Stefan Krah
msg106199 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-05-20 22:17
I tested issue7384-5-py3k.patch on FreeBSD 8.0: it fixes the crash.
msg106939 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2010-06-03 11:21
I think it would be nice to get this into 2.7. I don't expect buildbot
failures, since the 2.7 patch is essentially the same as the py3k version,
which has been tested extensively.
msg106940 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-06-03 11:35
> I think it would be nice to get this into 2.7.

Agreed.  I think you should go ahead and commit it.
msg106948 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2010-06-03 12:57
Mark, thanks. Committed in r81669; I'll keep an eye on the buildbots.
msg107323 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2010-06-08 14:36
Committed in r81669, r81672, r81683 (trunk) and r81830, r81831 (py3k).


What to do with the releases? To recap, the fix is:

  1) Detect if readline is already linked against ncurses and
     if so, skip any further selection. This must be done.

  2) Use the same version of ncurses for readline.so and _curses.so.


1) should be done in any case. 2) could change the behavior for
users who previously had readline/ncurses, cursesmodule/ncursesw,
but only use the cursesmodule in an application.
msg107999 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2010-06-17 10:07
Committed a conservative version implementing part 1) in r82017 (2.6) and
r82019 (3.1). Part 2) can be enabled by uncommenting a couple of lines in
setup.py.

The buildbots look good, but I'm setting this to 'pending' in case
someone would like part 2) of the fix in the releases.
msg110222 - (view) Author: Arfrever Frehtes Taifersar Arahesis (Arfrever) * (Python triager) Date: 2010-07-13 19:40
These changes break building of Python 3.* in some locales in Gentoo.

running build
running build_ext
Traceback (most recent call last):
  File "./setup.py", line 1812, in <module>
    main()
  File "./setup.py", line 1807, in main
    "Tools/scripts/2to3"]
  File "/var/tmp/portage/dev-lang/python-3.2_pre20100711/work/Python-3.2_pre20100711/Lib/distutils/core.py", line 152, in setup
    dist.run_commands()
  File "/var/tmp/portage/dev-lang/python-3.2_pre20100711/work/Python-3.2_pre20100711/Lib/distutils/dist.py", line 946, in run_commands
    self.run_command(cmd)
  File "/var/tmp/portage/dev-lang/python-3.2_pre20100711/work/Python-3.2_pre20100711/Lib/distutils/dist.py", line 965, in run_command
    cmd_obj.run()
  File "/var/tmp/portage/dev-lang/python-3.2_pre20100711/work/Python-3.2_pre20100711/Lib/distutils/command/build.py", line 127, in run
    self.run_command(cmd_name)
  File "/var/tmp/portage/dev-lang/python-3.2_pre20100711/work/Python-3.2_pre20100711/Lib/distutils/cmd.py", line 315, in run_command
    self.distribution.run_command(command)
  File "/var/tmp/portage/dev-lang/python-3.2_pre20100711/work/Python-3.2_pre20100711/Lib/distutils/dist.py", line 965, in run_command
    cmd_obj.run()
  File "/var/tmp/portage/dev-lang/python-3.2_pre20100711/work/Python-3.2_pre20100711/Lib/distutils/command/build_ext.py", line 393, in run
    self.build_extensions()
  File "./setup.py", line 151, in build_extensions
    missing = self.detect_modules()
  File "./setup.py", line 539, in detect_modules
    for ln in fp:
  File "/var/tmp/portage/dev-lang/python-3.2_pre20100711/work/Python-3.2_pre20100711/Lib/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc5 in position 20: ordinal not in range(128)
make: *** [sharedmods] Error 1

In lt_LT.UTF-8 locale, readline_termcap_lib file contains:
	ne dinaminis paleidžiamasis failas

In en_US.UTF-8 locale, this file would contain:
	not a dynamic executable

do_readline is "/usr/lib64/libreadline.so".

/usr/lib64/libreadline.so is a linker script with the following content:
/* GNU ld script
   Since Gentoo has critical dynamic libraries in /lib, and the static versions
   in /usr/lib, we need to have a "fake" dynamic lib in /usr/lib, otherwise we
   run into linking problems.  This "fake" dynamic lib is a linker script that
   redirects the linker to the real lib.  And yes, this works in the cross-
   compiling scenario as the sysroot-ed linker will prepend the real path.

   See bug http://bugs.gentoo.org/4411 for more info.
 */
OUTPUT_FORMAT ( elf64-x86-64 )
GROUP ( /lib64/libreadline.so.6 )

I think that using ldd is the wrong approach.
msg110224 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2010-07-13 20:07
In Ubuntu I can build just fine with lt_LT.UTF-8. So perhaps this problem
should be addressed in Gentoo.
msg110225 - (view) Author: Arfrever Frehtes Taifersar Arahesis (Arfrever) * (Python triager) Date: 2010-07-13 20:44
You shouldn't use ldd. I suggest that setup.py try to link a small executable that uses a function from libcurses and is linked against libreadline, but not against libcurses. If linking succeeds, then your libreadline is linked against libcurses. If linking fails, then repeat this procedure with libcursesw, libncurses, libncursesw, libtinfo.
msg110238 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-07-13 22:13
> "In lt_LT.UTF-8 locale, readline_termcap_lib file contains: 
> ne dinaminis paleidžiamasis failas"

You can run ldd without the LANG variable to get the original (English, ASCII-only) message.
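
A minimal sketch of that idea, assuming the helper names `c_locale_env` and `run_ldd` (both hypothetical). It goes slightly further than unsetting LANG: it also sets LC_ALL=C, since LC_ALL overrides LANG and the other LC_* variables.

```python
import os
import subprocess


def c_locale_env(base_env=None):
    """Return a copy of the environment that forces untranslated messages."""
    env = dict(os.environ if base_env is None else base_env)
    # Drop the variables that select a message language ...
    for var in ("LANG", "LANGUAGE", "LC_MESSAGES"):
        env.pop(var, None)
    # ... and force the C locale, which takes precedence over the rest.
    env["LC_ALL"] = "C"
    return env


def run_ldd(path):
    """Run ldd on path under the C locale (hypothetical helper)."""
    return subprocess.run(
        ["ldd", path],
        env=c_locale_env(),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
```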
msg110271 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2010-07-14 12:05
So you have garbage from stderr in readline_termcap_lib. Since that's
useless anyway (no matter what locale is set), let's check the return
value of os.system().

The attached patch skips readline linkage detection if ldd fails. In
that case, linking will be done in the same manner as before r81830.
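
The idea of the return value check, as a rough sketch only: the attached patch works with os.system() and a temporary file, whereas this illustration uses subprocess, and the name `ldd_lines` and its `runner` parameter are assumptions made for the example.

```python
import subprocess


def ldd_lines(path, runner=subprocess.run):
    """Return ldd's stdout lines for path, or None if ldd failed.

    A None result means "skip readline linkage detection" and fall
    back to the previous library selection.
    """
    try:
        proc = runner(
            ["ldd", path],
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,  # error text is never parsed
        )
    except OSError:
        # ldd is not installed or could not be executed.
        return None
    if proc.returncode != 0:
        # e.g. the path is a linker script, not a dynamic object.
        return None
    return proc.stdout.decode("ascii", "replace").splitlines()
```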


Please report if the patch allows you to build py3k in the problematic
locale.


Your method of detecting readline linkage looks interesting, but I
doubt that I'm going to implement it: These cross platform issues
take an *immense* amount of time, since you have to test on all
buildbot platforms (+ OpenBSD and OpenSolaris), with different
compilers (icc, suncc).

If you want that done, the best way is to open another issue, submit a
patch (probably for configure.in) _and_ do all the testing.
msg110378 - (view) Author: Arfrever Frehtes Taifersar Arahesis (Arfrever) * (Python triager) Date: 2010-07-15 16:14
This patch allows Python 3.* to be built in this locale.

It might be safer to open tmpfile in binary mode to avoid potential problems with non-ASCII characters in paths to libraries.
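
A small sketch of what reading in binary mode could look like; the function name `file_mentions` is hypothetical and this is not the code in setup.py. Because the lines are compared as bytes, no decoding takes place and a non-ASCII path or message cannot raise UnicodeDecodeError.

```python
def file_mentions(path, needle):
    """Return True if any line of the file at path contains needle.

    The file is opened in binary mode and needle must be bytes, so
    the locale's encoding never comes into play.
    """
    with open(path, "rb") as fp:
        return any(needle in line for line in fp)
```

For example, `file_mentions(tmpfile, b"curses")` would work the same way whether or not the file also contains UTF-8 encoded text.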
msg110550 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2010-07-17 12:41
ldd return value check committed in r82927, r82928, r82929 and r82930.


Thanks for reporting this!
History
Date                 User            Action  Args
2010-07-17 12:41:55  skrah           set     status: open -> closed; messages: + msg110550
2010-07-15 16:14:09  Arfrever        set     messages: + msg110378
2010-07-14 12:05:14  skrah           set     files: + ldd-retval-py3k.patch; messages: + msg110271
2010-07-13 22:13:21  vstinner        set     messages: + msg110238
2010-07-13 20:44:03  Arfrever        set     messages: + msg110225
2010-07-13 20:07:30  skrah           set     messages: + msg110224
2010-07-13 19:40:13  Arfrever        set     status: pending -> open; nosy: + Arfrever; messages: + msg110222
2010-06-17 10:07:20  skrah           set     status: open -> pending; resolution: accepted; messages: + msg107999; stage: patch review -> resolved
2010-06-08 14:36:37  skrah           set     messages: + msg107323
2010-06-03 12:57:13  skrah           set     messages: + msg106948
2010-06-03 11:35:30  mark.dickinson  set     messages: + msg106940
2010-06-03 11:21:37  skrah           set     files: + issue7384-5-trunk.patch; messages: + msg106939
2010-05-20 22:17:05  vstinner        set     nosy: + vstinner; messages: + msg106199
2010-04-27 12:43:15  skrah           set     messages: + msg104315
2010-04-27 11:04:21  asmodai         set     messages: + msg104311
2010-04-27 09:18:00  skrah           set     messages: + msg104302
2010-04-26 22:20:29  rpetrov         set     messages: + msg104283
2010-04-24 11:19:21  skrah           set     messages: + msg104074
2010-04-24 09:33:13  skrah           set     messages: + msg104071
2010-04-24 09:24:33  skrah           set     files: + issue7384-5-py3k.patch; messages: + msg104070
2010-04-23 21:52:35  rpetrov         set     messages: + msg104057
2010-04-23 21:40:28  rpetrov         set     messages: + msg104054
2010-04-23 10:31:13  skrah           set     messages: + msg104002
2010-04-23 10:22:30  skrah           set     messages: + msg104000
2010-04-23 10:02:06  asmodai         set     messages: + msg103997
2010-04-23 09:34:59  skrah           set     files: + issue7384-4-py3k.patch; messages: + msg103996
2010-04-22 21:42:28  rpetrov         set     nosy: + rpetrov; messages: + msg103980
2010-04-21 13:08:48  mark.dickinson  set     messages: + msg103838
2010-04-21 12:16:21  skrah           set     files: + issue7384-3-py3k.patch; messages: + msg103828; stage: needs patch -> patch review
2010-04-18 16:27:14  asmodai         set     messages: + msg103503
2010-04-18 15:19:01  skrah           set     files: + issue7384-2.patch; messages: + msg103497
2010-04-17 21:12:48  mark.dickinson  set     messages: + msg103432
2010-04-17 20:19:36  skrah           set     files: + issue7384.patch; messages: + msg103429
2010-04-17 10:31:09  skrah           set     messages: + msg103395
2010-04-17 09:45:34  skrah           set     messages: + msg103394
2010-04-17 09:37:22  skrah           set     messages: + msg103393
2010-04-16 09:38:01  skrah           set     nosy: + skrah; messages: + msg103308
2010-04-16 08:55:28  mark.dickinson  set     messages: + msg103307
2010-04-16 06:09:50  asmodai         set     messages: + msg103295
2010-04-15 22:20:09  mark.dickinson  set     messages: + msg103267
2010-04-15 22:11:53  asmodai         set     nosy: + asmodai
2010-04-15 22:00:23  mark.dickinson  set     messages: + msg103265
2010-04-15 21:58:18  mark.dickinson  set     messages: + msg103264
2010-04-15 21:57:47  akuchling       set     files: + freebsd-curses.diff; keywords: + patch; messages: + msg103263
2010-04-15 21:49:52  akuchling       set     messages: + msg103261
2010-04-15 21:03:09  mark.dickinson  set     messages: + msg103256
2010-04-15 16:51:13  mark.dickinson  set     messages: + msg103231
2010-02-21 13:45:02  mark.dickinson  set     messages: + msg99659
2010-02-21 13:40:29  mark.dickinson  set     assignee: mark.dickinson ->; type: behavior; messages: + msg99658; stage: needs patch
2010-02-21 13:33:11  mark.dickinson  set     messages: + msg99657
2010-01-13 14:59:49  r.david.murray  set     nosy: + r.david.murray; messages: + msg97722
2010-01-13 12:21:01  mark.dickinson  set     priority: normal; title: test_curses crash on FreeBSD buildbots -> curses crash on FreeBSD; nosy: + akuchling; messages: + msg97709; assignee: mark.dickinson
2009-11-23 20:03:47  mark.dickinson  create