Bus error in pybuilddir.txt 'python -m sysconfigure --generate-posix-vars' build step #65365

Closed
vstinner opened this issue Apr 7, 2014 · 10 comments
Labels
build The build process and cross-build

Comments

@vstinner
Member

vstinner commented Apr 7, 2014

BPO 21166
Nosy @doko42, @jcea, @vstinner, @ned-deily, @koobs
Files
  • gdb.log
  • issue21166_27.patch: 2.7 version
  • issue21166_3x.patch: 3.x version
  • python-buildbot-broken-debugging.txt: Bus Error debug & isolation
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = <Date 2014-08-22.20:41:06.457>
    created_at = <Date 2014-04-07.09:32:57.267>
    labels = ['build']
    title = "Bus error in pybuilddir.txt 'python -m sysconfigure --generate-posix-vars' build step"
    updated_at = <Date 2014-09-26.02:37:18.790>
    user = 'https://github.com/vstinner'

    bugs.python.org fields:

    activity = <Date 2014-09-26.02:37:18.790>
    actor = 'jcea'
    assignee = 'none'
    closed = True
    closed_date = <Date 2014-08-22.20:41:06.457>
    closer = 'ned.deily'
    components = ['Build']
    creation = <Date 2014-04-07.09:32:57.267>
    creator = 'vstinner'
    dependencies = []
    files = ['34749', '36353', '36354', '36356']
    hgrepos = []
    issue_num = 21166
    keywords = ['patch']
    message_count = 10.0
    messages = ['215683', '215699', '215702', '215704', '215782', '225217', '225220', '225706', '225707', '225779']
    nosy_count = 6.0
    nosy_names = ['doko', 'jcea', 'vstinner', 'ned.deily', 'python-dev', 'koobs']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue21166'
    versions = ['Python 2.7', 'Python 3.4', 'Python 3.5']

    @vstinner
    Member Author

    vstinner commented Apr 7, 2014

    @koobs

    koobs commented Apr 7, 2014

    Uploading gdb output at Victor's request.

    @koobs

    koobs commented Apr 7, 2014

    Interestingly, I note the following lines from the gdb log:

    #5 0x0000000801ae1e99 in PyModule_Create2 () from /usr/local/lib/libpython3.4m.so.1
    #6 0x0000000801840de8 in PyInit__heapq () from /usr/local/lib/python3.4/lib-dynload/_heapq.so

    I had installed Python 3.4 just prior to Victor reporting the issue.

    If it's at all relevant, Python 3.4 was built using clang (not gcc, which the buildbots use).

    Removing Python 3.4 from the system and rebuilding makes the issue go away.

    The question is, what is ./python from the buildbot build directory doing using, loading or otherwise interacting with the python installation on the system in the first place? Is a lack of isolation the root cause?
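One way to probe the isolation question is to list every loaded module whose file lives outside the build tree. The following is a minimal sketch, assuming it is run with the freshly built ./python from the build directory; the helper name `modules_outside` and its `build_root` argument are hypothetical and not part of any patch in this issue.

```python
import os
import sys


def modules_outside(build_root, modules=None):
    """Return {name: path} for loaded modules whose file is not under build_root."""
    if modules is None:
        modules = sys.modules
    root = os.path.realpath(build_root)
    outside = {}
    for name, mod in list(modules.items()):
        path = getattr(mod, "__file__", None)
        if not path:
            continue  # built-in modules have no file on disk
        real = os.path.realpath(path)
        if real != root and not real.startswith(root + os.sep):
            outside[name] = real
    return outside


if __name__ == "__main__":
    # Treat the current directory as the build tree (an assumption).
    for name, path in sorted(modules_outside(os.getcwd()).items()):
        print(name, path)
```

Any entry pointing into an installed lib-dynload directory would indicate that the build-tree interpreter picked up modules from a system installation.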

    @koobs

    koobs commented Apr 7, 2014

    Clarification:

    a) I had just installed Python 3.4 (at the system level, via ports).

    b) Removing Python 3.4 from the system (and forcing a rebuild of the buildbot) makes the issue go away.

    @vstinner
    Member Author

    vstinner commented Apr 9, 2014

    I still don't understand the issue but... it's now fixed (I don't understand why), so I'm closing it.

    @vstinner vstinner closed this as completed Apr 9, 2014
    @ned-deily
    Member

    This problem has reappeared on some of the FreeBSD buildbots, for example:

    http://buildbot.python.org/all/builders/AMD64%20FreeBSD%209.x%202.7/builds/507

    Thanks to a lot of good work by koobs in investigating and documenting the problems in irc, we have figured out what is going on here (and it's a lot more difficult to explain than to fix!).

    The root cause is a "bootstrap" issue with the pybuilddir.txt Makefile rule. This build step is the first one that uses the newly built Python executable; it creates the _sysconfigdata.py source that contains the compiled Makefile variables, and it also creates pybuilddir.txt, which contains a platform-dependent build directory name, primarily for the benefit of cross-compile builds. This support was added by the changes for bpo-13150 and bpo-17512. They added code in getpath.c that looks for pybuilddir.txt and uses the build directory name it contains to determine whether the interpreter is being started from a build directory rather than from an installed instance. In the former case, the code in getpath.c is supposed to set up both sys.prefix (for pure Python modules) and sys.exec_prefix (for C extension modules) so that standard library modules are loaded from the build/source directories.

    However, if pybuilddir.txt does not already exist when the pybuilddir.txt Makefile rule executes (as happens with a clean build directory), getpath.c gets confused: search_for_prefix correctly determines that python is running from a build directory but search_for_exec_prefix does not. This means that the sys.path that is created for this initial run of the newly built skeleton python will cause it to find the right pure Python modules in the source/build directories, but it will use the installed location (as set by --prefix, default /usr/local) to search for C standard library shared extension modules (.so's). Now, at this point, no shared .so's could have been built yet (in a clean build) and the -m sysconfig --generate-posix-vars step therefore cannot depend on any such modules. But, if sys.exec_prefix does get set (incorrectly) to an installed path (because pybuilddir.txt does not exist yet) *and* there happen to be .so's there from a previous installation, those .so's can get imported and used. One such case in Python 2.7.x builds is cStringIO.so, which is conditionally used by pprint if it is available, falling back to StringIO.py if not. It so happens that pprint is used by sysconfig --generate-posix-vars in that build step.
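The effect of the missing marker file can be illustrated with a deliberately simplified model. This is only a sketch of the behaviour described above, not a port of getpath.c (which also searches for landmark files and lib-dynload directories); the function name `model_exec_prefix` and the example paths are hypothetical.

```python
import os
import tempfile


def model_exec_prefix(argv0_dir, installed_exec_prefix):
    """Simplified model: where will C extension modules be searched for?"""
    marker = os.path.join(argv0_dir, "pybuilddir.txt")
    if os.path.isfile(marker):
        with open(marker) as f:
            rel = f.read().strip()
        # Build-directory case: stay inside the build tree.
        return os.path.join(argv0_dir, rel)
    # No marker file: fall back to the configured (installed) location.
    return installed_exec_prefix


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as build:
        # Clean build directory: the installed location leaks in.
        print(model_exec_prefix(build, "/usr/local"))
        with open(os.path.join(build, "pybuilddir.txt"), "w") as f:
            f.write("build/lib.example")
        # Marker present: the result stays inside the build tree.
        print(model_exec_prefix(build, "/usr/local"))
```

In this model the first call returns the installed location, which is the situation in which a stale cStringIO.so or _heapq.so could be picked up.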

    Now it seems that most of the time, the spurious import of incorrect extension modules at this point is harmless. However, there are configurations where that is not the case. One such scenario is that of koobs's FreeBSD buildbot. In this case, there was already an installed version of Python 2.7 built via the FreeBSD ports system with --enable-shared, a default prefix of /usr/local, and with a wide (ucs4) Unicode build. The buildbot is configured non-shared, with debug enabled, and defaulting to a narrow (ucs2) build and /usr/local prefix. Even though the buildbot build is never installed, whenever pybuilddir.txt did not already exist in the build directory (after a manual clean), getpath's search_for_exec_prefix ended up incorrectly adding /usr/local/lib/pythonx.x/lib-dynload to sys.path, causing cStringIO.so with a conflicting build ABI from the installed system Python to be imported and used, which can be seen in gdb traces to be the cause of the bus error. (With Python 3.x, there is a different scenario that can result in an installed _heapq.so being imported but the root cause is the same.)
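For reference, the narrow/wide distinction mentioned above can be checked from within an interpreter via sys.maxunicode. A small sketch follows; the helper name `unicode_build` is hypothetical. Note that on Python 3.3 and later sys.maxunicode is always 0x10FFFF, so the distinction only matters for 2.7 and early 3.x builds.

```python
import sys


def unicode_build(maxunicode=None):
    """Classify a build as narrow or wide from its sys.maxunicode value."""
    if maxunicode is None:
        maxunicode = sys.maxunicode
    # Narrow (ucs2) builds report 0xFFFF; wide (ucs4) builds report 0x10FFFF.
    return "narrow (ucs2)" if maxunicode == 0xFFFF else "wide (ucs4)"


if __name__ == "__main__":
    print(unicode_build())
```

Running this with both the installed interpreter and the build-tree interpreter would show whether the two were configured with different Unicode widths.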

    After finally isolating the scenario, I tried unsuccessfully to reproduce the bus error on some other platforms (e.g. OS X) but I was able to reproduce it on a FreeBSD 10 VM. While this may appear to be a rather obscure scenario, there is at least one other open issue (bpo-21412) which seems to be due to the same root cause so it is definitely worth fixing. Rather than adding to the complexity of getpath.c, I think the best way to deal with this is in the Makefile. The attached patches change the pybuilddir.txt rule's recipes to unconditionally create a pybuilddir.txt with a dummy path value which is sufficient to ensure that sys.exec_prefix does not point to the installed path location during this initial step. Further, the patches also cause ./configure to always delete an existing pybuilddir.txt so that it will be properly recreated in case the build environment has changed.
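The two steps described above can be sketched as follows. The actual patches make these changes in the Makefile rule and in configure; Python is used here purely for illustration, and the function names and the placeholder value "none" are assumptions rather than quotations from the patches.

```python
import os
import tempfile


def write_placeholder(build_dir, placeholder="none"):
    """Unconditionally create pybuilddir.txt with a dummy value before the first run."""
    path = os.path.join(build_dir, "pybuilddir.txt")
    with open(path, "w") as f:
        f.write(placeholder + "\n")
    return path


def remove_stale(build_dir):
    """Delete any existing pybuilddir.txt, as a reconfigure step would."""
    try:
        os.remove(os.path.join(build_dir, "pybuilddir.txt"))
    except FileNotFoundError:
        pass  # nothing to clean up


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as build:
        remove_stale(build)              # reconfigure: drop old state
        print(write_placeholder(build))  # bootstrap: marker exists before first run
```

The point of the placeholder is only that the file exists when the newly built interpreter starts for the first time; the real value is written afterwards by the --generate-posix-vars step.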

    I'm cc'ing Matthias here for a review for any cross-compile implications; AFAICT, there shouldn't be any.

    @ned-deily ned-deily added the build The build process and cross-build label Aug 12, 2014
    @ned-deily ned-deily reopened this Aug 12, 2014
    @ned-deily ned-deily changed the title from 'Bus error on "AMD64 FreeBSD 9.x 3.4" buildbot' to "Bus error in pybuilddir.txt 'python -m sysconfigure --generate-posix-vars' build step" Aug 12, 2014
    @koobs

    koobs commented Aug 12, 2014

    :DDD

    This was an awesome experience working with you Ned, thanks for all the help.

    Attaching my debugging & isolation steps for additional detail, posterity and reference.

    @python-dev
    Mannequin

    python-dev mannequin commented Aug 22, 2014

    New changeset edb6b282469e by Ned Deily in branch '2.7':
    Issue bpo-21166: Prevent possible segfaults and other random failures of
    http://hg.python.org/cpython/rev/edb6b282469e

    New changeset e52d85f2e284 by Ned Deily in branch '3.4':
    Issue bpo-21166: Prevent possible segfaults and other random failures of
    http://hg.python.org/cpython/rev/e52d85f2e284

    New changeset 599dc1304a70 by Ned Deily in branch 'default':
    Issue bpo-21166: merge from 3.4
    http://hg.python.org/cpython/rev/599dc1304a70

    @ned-deily
    Member

    Committed for release in 2.7.9, 3.4.2, and 3.5.0.

    @python-dev
    Mannequin

    python-dev mannequin commented Aug 24, 2014

    New changeset 5a157e3b3c47 by Ned Deily in branch '2.7':
    Issue bpo-21166: fix typo in comment
    http://hg.python.org/cpython/rev/5a157e3b3c47

    New changeset 9b1bd9d42cc7 by Ned Deily in branch '3.4':
    Issue bpo-21166: fix typo in comment
    http://hg.python.org/cpython/rev/9b1bd9d42cc7

    New changeset 5ee9c99a4ca3 by Ned Deily in branch 'default':
    Issue bpo-21166: fix typo in comment
    http://hg.python.org/cpython/rev/5ee9c99a4ca3

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022