Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add call trampoline to work around bad fpcasts on Emscripten #91318

Closed
tiran opened this issue Mar 30, 2022 · 7 comments
Closed

Add call trampoline to work around bad fpcasts on Emscripten #91318

tiran opened this issue Mar 30, 2022 · 7 comments
Labels
3.11 only security fixes build The build process and cross-build interpreter-core (Objects, Python, Grammar, and Parser dirs)

Comments

@tiran
Copy link
Member

tiran commented Mar 30, 2022

BPO 47162
Nosy @tiran, @miss-islington, @hoodmane
PRs
  • bpo-47162: Add call trampoline to mitigate bad fpcasts on Emscripten #32189
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2022-03-30.19:28:48.089>
    created_at = <Date 2022-03-30.11:11:40.210>
    labels = ['interpreter-core', 'build', '3.11']
    title = 'Add call trampoline to work around bad fpcasts on Emscripten'
    updated_at = <Date 2022-04-02.02:37:08.500>
    user = 'https://github.com/tiran'

    bugs.python.org fields:

    activity = <Date 2022-04-02.02:37:08.500>
    actor = 'hoodchatham'
    assignee = 'none'
    closed = True
    closed_date = <Date 2022-03-30.19:28:48.089>
    closer = 'christian.heimes'
    components = ['Interpreter Core']
    creation = <Date 2022-03-30.11:11:40.210>
    creator = 'christian.heimes'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 47162
    keywords = ['patch']
    message_count = 7.0
    messages = ['416339', '416394', '416515', '416520', '416521', '416522', '416530']
    nosy_count = 3.0
    nosy_names = ['christian.heimes', 'miss-islington', 'hoodchatham']
    pr_nums = ['32189']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'compile error'
    url = 'https://bugs.python.org/issue47162'
    versions = ['Python 3.11']

    @tiran
    Copy link
    Member Author

    tiran commented Mar 30, 2022

    WASM cannot handle function calls with bad function pointer casts yet. Hood Chatham explained the issue in Pyodide patch https://github.com/pyodide/pyodide/blob/0.19.1/cpython/patches/0001-call-trampolines-to-handle-fpcast-troubles.patch

    ---
    The wasm call_indirect instruction takes a function signature as an immediate
    argument (an immediate argument is one which is determined statically at compile
    time). The web assembly runtime checks at runtime that the function pointer we
    are attempting to call has a signature which matches the asserted function
    signature, if they don't match, the runtime traps with "indirect call signature
    mismatch". The codegen provided by default by the emscripten toolchain produces
    an ABI that also has this constraint.

    By contrast, typical native ABIs handle function pointer casts very gracefully:
    extra arguments are ignored, missing arguments are filled in as 0, even wrong
    return values are completely fine.

    The Python ecosystem is full of code which contain function pointer casts.
    Emscripten offers a second "EMULATE_FPCASTS" ABI that fixes many of these
    issues, but it shims in the function pointer cast support in a binaryen pass
    that happens on the fully linked wasm module. At this point, lots of information
    has been destroyed and as a result the only possible solutions are extremely
    costly in run time, stack space, and code size.

    We wish to avoid these costs. Patching the packages is prohibitively time
    consuming and boring, especially given our limited developer effort. I have
    explored making automated detection tools, and these work. However, getting
    these tools included into the CI of Python ecosystem packages would be
    prohibitively time consuming.

    There is a best of both worlds solution. It is possible to define an ABI which
    allows for specific types of function pointer casts with neglible costs. I hope
    to do this in future work. However, this will require custom llvm passes.
    Because llvm is implemented in C++ which has very poor ABI compatibility, this
    means our whole toolchain needs to be built against the same version of llvm,
    from llvm all the way down to binaryen and the wasm binary toolkit.

    In the meantime, most bad function pointer calls happen in a small number of
    locations. Calling a function pointer from Javascript has no restrictions on
    arguments. Extra arguments can be provided, arguments can be left off, etc with
    no issue (this mimics Javascript's typical function call semantics). So at the
    locations where the bad calls occur, we patch in a trampoline which calls out to
    Javascript to make the call for us. Like magic, the problem is gone. Research
    shows that the performance cost of this is surprisingly low too.
    ---

    Bad function pointer casts lead to fatal runtime errors like this:

    worker.js onmessage() captured an uncaught exception: RuntimeError: function signature mismatch
    RuntimeError: function signature mismatch
    at create_builtin (<anonymous>:wasm-function[3512]:0x14f6df)
    at _imp_create_builtin (<anonymous>:wasm-function[3520]:0x14fe1d)
    at cfunction_vectorcall_O (<anonymous>:wasm-function[1871]:0x91e84)
    at _PyVectorcall_Call (<anonymous>:wasm-function[839]:0x4e220)
    at _PyObject_Call (<anonymous>:wasm-function[842]:0x4e4f5)
    at PyObject_Call (<anonymous>:wasm-function[843]:0x4e595)
    at _PyEval_EvalFrameDefault (<anonymous>:wasm-function[3114]:0x120541)
    at _PyEval_Vector (<anonymous>:wasm-function[3115]:0x1229ab)
    at _PyFunction_Vectorcall (<anonymous>:wasm-function[845]:0x4e6ca)
    at object_vacall (<anonymous>:wasm-function[859]:0x4f3d0)
    worker sent an error! undefined:undefined: function signature mismatch

    I propose to include the downstream patch from Pyodide but make the call trampoline feature opt-in. An opt-in allows Pyodide to use the trampolines in production while we can test core and 3rd party modules without the trampoline easily.

    @tiran tiran added 3.11 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) build The build process and cross-build labels Mar 30, 2022
    @tiran tiran closed this as completed Mar 30, 2022
    @tiran tiran closed this as completed Mar 30, 2022
    @miss-islington
    Copy link
    Contributor

    New changeset 581c443 by Christian Heimes in branch 'main':
    bpo-47162: Add call trampoline to mitigate bad fpcasts on Emscripten (GH-32189)
    581c443

    @hoodmane
    Copy link
    Mannequin

    hoodmane mannequin commented Apr 1, 2022

    Note that there are still Python test suite failures in emscripten without these trampolines enabled, for instance test_imp fails. The only trampoline needed to pass the core test suite is _PyImport_InitFunc_TrampolineCall.

    @tiran
    Copy link
    Member Author

    tiran commented Apr 1, 2022

    Interesting, can you pin point the buggy function pointer? If you link with -gsource-map and deploy the map file, your browser should give you a decent stack trace.

    @hoodmane
    Copy link
    Mannequin

    hoodmane mannequin commented Apr 1, 2022

    I'm having trouble pinpointing the bad function pointer because chrome cuts off the end of the wasm file with:

    ;; .... text is truncated due to size
    

    Maybe I should follow the directions here, since this is a consistent nuisance.
    https://www.diverto.hr/en/blog/2020-08-15-WebAssembly-limit/
    I have a stack trace but the key info of what function is being called is missing. If chrome didn't truncate the wasm I could figure out what the callee is too.

    I looked into this before at some point and IIRC the issue is that one initialization code path hands the Init function a spec but the a different path doesn't give it the spec. It wasn't obvious how to fix the problem.

    @hoodmane
    Copy link
    Mannequin

    hoodmane mannequin commented Apr 1, 2022

    As an update, building the chrome devtools with the wasm limit set to 12Mb following that guide worked fantastic. Highly recommend it. (Though there are a few paths that are out of date, I had to consult google's devtools docs.)

    The problem is that PyInit_imp_dummy takes a spec argument which it ignores. Then the test calls it without a spec.

    PyInit_imp_dummy(PyObject *spec)

    Minimized trigger:

    import importlib
    import imp
    spec = importlib.util.find_spec('_testmultiphase')
    imp.load_dynamic("test.imp_dummy", spec.origin)
    

    Causes RuntimeError: null function or function signature mismatch

    @hoodmane
    Copy link
    Mannequin

    hoodmane mannequin commented Apr 2, 2022

    Actually, I think the _PyImport_InitFunc trampoline call is only there as a work around for the problematic test_imp and test_import Python tests. I don't know of any third party code with a problem there.

    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.11 only security fixes build The build process and cross-build interpreter-core (Objects, Python, Grammar, and Parser dirs)
    Projects
    None yet
    Development

    No branches or pull requests

    2 participants