Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify the interpreter's (type, val, tb) exception representation #89874

Closed
iritkatriel opened this issue Nov 4, 2021 · 31 comments
Closed
Labels
3.11 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage

Comments

@iritkatriel
Copy link
Member

BPO 45711
Nosy @gvanrossum, @terryjreedy, @scoder, @vstinner, @markshannon, @brandtbucher, @da-woods, @iritkatriel
PRs
  • bpo-45711: Use _PyErr_ClearExcState instead of setting only exc_value… #29404
  • bpo-45711: remove unnecessary DUP_TOP and POP in exception handling #29495
  • bpo-45711: assert that the type of exc_info is redundant #29518
  • bpo-45711: Re-bump the magic number and update doc #29528
  • bpo-45711: use exc_value instead of exc_type to determine if exc_info is valid. Add more assertions. #29627
  • bpo-45711: Change exc_info related APIs to derive type and traceback from the exception instance #29780
  • bpo-45711: [asyncio] Normalize exceptions immediately after Fetch, before they are stored as StackItem, which should be normalized #29890
  • bpo-45711: Remove unnecessary normalization of exc_info #29922
  • bpo-45711: Remove type and traceback from exc_info #30122
  • bpo-46219, 46221: simplify except* implementation following exc_info changes. Move helpers to exceptions.c. Do not assume that exception groups are truthy. #30289
  • bpo-45711: move whatsnew entries which are incorrectly listed under N… #30849
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2022-03-13.19:33:16.899>
    created_at = <Date 2021-11-04.11:29:47.242>
    labels = ['interpreter-core', '3.11', 'performance']
    title = "Simplify the interpreter's (type, val, tb) exception representation"
    updated_at = <Date 2022-03-13.19:33:16.899>
    user = 'https://github.com/iritkatriel'

    bugs.python.org fields:

    activity = <Date 2022-03-13.19:33:16.899>
    actor = 'iritkatriel'
    assignee = 'none'
    closed = True
    closed_date = <Date 2022-03-13.19:33:16.899>
    closer = 'iritkatriel'
    components = ['Interpreter Core']
    creation = <Date 2021-11-04.11:29:47.242>
    creator = 'iritkatriel'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 45711
    keywords = ['patch']
    message_count = 31.0
    messages = ['405681', '405839', '405849', '406114', '406120', '406209', '406238', '406983', '406984', '407402', '407597', '408043', '408768', '408786', '409453', '411497', '412264', '412265', '412270', '412272', '412273', '412275', '412281', '412289', '412290', '412292', '412303', '412304', '412316', '414482', '415086']
    nosy_count = 8.0
    nosy_names = ['gvanrossum', 'terry.reedy', 'scoder', 'vstinner', 'Mark.Shannon', 'brandtbucher', 'da-woods', 'iritkatriel']
    pr_nums = ['29404', '29495', '29518', '29528', '29627', '29780', '29890', '29922', '30122', '30289', '30849']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'performance'
    url = 'https://bugs.python.org/issue45711'
    versions = ['Python 3.11']

    @iritkatriel
    Copy link
    Member Author

    Exceptions are represented in the interpreter as (type, val, tb) triplets which most of the time contain redundant information (the type is the type of val and the tb is also on the exception). This complicates the code and is inefficient as opcodes that manage exceptions push and pop 3 items for each exception.

    We will change the internal representation to be (1) just the exception value if it is normalised and (2) a tuple of the 3 values for the uncommon case where they are all needed.

    See also faster-cpython/ideas#106.

    @iritkatriel iritkatriel added 3.11 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) labels Nov 4, 2021
    @terryjreedy
    Copy link
    Member

    Would there be any change at the Python level?

    @terryjreedy terryjreedy added the performance Performance or resource usage label Nov 6, 2021
    @iritkatriel
    Copy link
    Member Author

    Initially not, neither in python nor in the c api.

    It would be nice to replace PyErr_Fetch/Restore by a version that takes just an exception but that’s a long deprecation.

    @iritkatriel
    Copy link
    Member Author

    New changeset 05fbd60 by Irit Katriel in branch 'main':
    bpo-45711: Use _PyErr_ClearExcState instead of setting only exc_value to NULL (GH-29404)
    05fbd60

    @markshannon
    Copy link
    Member

    New changeset 4cdeee5 by Irit Katriel in branch 'main':
    bpo-45711: remove unnecessary DUP_TOP and POP in exception handling (GH-29495)
    4cdeee5

    @iritkatriel
    Copy link
    Member Author

    New changeset 8f1b71d by Brandt Bucher in branch 'main':
    bpo-45711: Re-bump the magic number and update doc (GH-29528)
    8f1b71d

    @iritkatriel
    Copy link
    Member Author

    New changeset de3db14 by Irit Katriel in branch 'main':
    bpo-45711: assert that the type of exc_info is redundant (GH-29518)
    de3db14

    @iritkatriel
    Copy link
    Member Author

    New changeset c456dfa by Irit Katriel in branch 'main':
    bpo-45711: use exc_value instead of exc_type to determine if exc_info is valid. Add more assertions. (GH-29627)
    c456dfa

    @iritkatriel
    Copy link
    Member Author

    Following the analysis/discussion on faster-cpython/ideas#106:

    • exc_info is always normalized, so we actually will never need to create a tuple of these three values (exc_curinfo, the exception raised but not yet caught can be unnormalized, but it is not pushed/popped on the stack).

    • We will reduce the interpreter's exc_info representation to just the exception instance.

    • There are two APIs that are impacted, both in non-documented edge cases:

    1. sys.exc_info()[2] can currently be different from sys.exc_info()[1].__traceback__ because changes to the latter (while an except clause is executing) don't show up in the former. This is arguably a bug, we will change it so that the type and traceback are always consistent with the exception.

    2. PyErr_SetExcInfo does no arg checking, and will set exc_info to an inconsistent triplet if you ask it to. However, the exc_value arg must be an exception instance so the only thing you can do is pass in nonsensical args where the type/traceback do not match the exception. This function's purpose is to save/restore exc_info. We will make it ignore the type and traceback and document that change.

    @iritkatriel
    Copy link
    Member Author

    New changeset 8a45ca5 by Irit Katriel in branch 'main':
    bpo-45711: Change exc_info related APIs to derive type and traceback from the exception instance (GH-29780)
    8a45ca5

    @iritkatriel
    Copy link
    Member Author

    New changeset 2ff758b by Irit Katriel in branch 'main':
    bpo-45711: [asyncio] Normalize exceptions immediately after Fetch, before they are stored as StackItem, which should be normalized (GH-29890)
    2ff758b

    @iritkatriel
    Copy link
    Member Author

    New changeset 2109f78 by Irit Katriel in branch 'main':
    bpo-45711: Remove unnecessary normalization of exc_info (GH-29922)
    2109f78

    @scoder
    Copy link
    Contributor

    scoder commented Dec 17, 2021

    FYI, we track the Cython side of this in
    cython/cython#4500

    @markshannon
    Copy link
    Member

    New changeset 396b583 by Irit Katriel in branch 'main':
    bpo-45711: Remove type and traceback from exc_info (GH-30122)
    396b583

    @iritkatriel
    Copy link
    Member Author

    This is done, there will be more cleanups that this enables but they can have their own issues.

    @iritkatriel
    Copy link
    Member Author

    New changeset 80e1def by Irit Katriel in branch 'main':
    bpo-45711: move whatsnew entries which are incorrectly listed under New Features (GH-30849)
    80e1def

    @vstinner
    Copy link
    Member

    vstinner commented Feb 1, 2022

    Would it be possible to revert the change until Cython and numpy are ready for it?

    bpo-45711: Remove type and traceback from exc_info (GH-30122)

    This change broke numpy which uses C code generated by Cython for coroutines in numpy/random/_mt19937.c (file generated from numpy/random/_mt19937.pyx).

    Example of Cython code which no fails with a compiler error since exc_info->exc_type has been removed:

    /* GetTopmostException */
    #if CYTHON_USE_EXC_INFO_STACK
    static _PyErr_StackItem *
    __Pyx_PyErr_GetTopmostException(PyThreadState *tstate)
    {
        _PyErr_StackItem *exc_info = tstate->exc_info;
        while ((exc_info->exc_type == NULL || exc_info->exc_type == Py_None) &&
               exc_info->previous_item != NULL)
        {
            exc_info = exc_info->previous_item;
        }
        return exc_info;
    }
    #endif

    I propose this plan:

    • (1) Revert this specific change
    • (2) Wait until a change is merged in Cython 0.29.x
    • (3) Wait until a new Cython 0.29.x version is released
    • (4) Wait until numpy is released with regenerated code compatible with this change
    • (5) Apply again the change

    It should increase the number of packages when Python 3.11 will be released.

    In Cython, the issue is tracked as: cython/cython#4500

    @vstinner vstinner reopened this Feb 1, 2022
    @iritkatriel
    Copy link
    Member Author

    I can probably just put back the two fields in _PyErr_StackItem and make sure they get updated when exc_value is set.

    Reverting the whole thing would include big changes in compile.c and ceval.c, so I'd rather avoid that if we can.

    @iritkatriel
    Copy link
    Member Author

    Does unsetting CYTHON_USE_EXC_INFO_STACK still work?

    #if CYTHON_USE_EXC_INFO_STACK
    // See  https://bugs.python.org/issue25612
    #define __Pyx_ExcInfoStruct  _PyErr_StackItem
    #else
    // Minimal replacement struct for Py<3.7, without the Py3.7 exception state stack.
    typedef struct {
        PyObject *exc_type;
        PyObject *exc_value;
        PyObject *exc_traceback;
    } __Pyx_ExcInfoStruct;
    #endif

    @vstinner
    Copy link
    Member

    vstinner commented Feb 1, 2022

    In Cython, the issue is tracked as: cython/cython#4500

    I don't understand the Cython status. The code was updated in the master branch:

    cython/cython#4500 was closed even if the change was not backported to 0.29.x.

    I understand that there is no released Cython 0.29.x version compatible with this change.

    @vstinner
    Copy link
    Member

    vstinner commented Feb 1, 2022

    I commented the Cython issue:
    cython/cython#4500 (comment)

    I also opened a discussion on python-dev: "Please update Cython before introcuding C API incompatible changes in Python".
    https://mail.python.org/archives/list/python-dev@python.org/thread/RS2C53LDZPXHRR2VCY2G2YSPDVA4LNQU/

    @vstinner
    Copy link
    Member

    vstinner commented Feb 1, 2022

    Irit Katriel:

    Reverting the whole thing would include big changes in compile.c and ceval.c, so I'd rather avoid that if we can.

    It was a good idea to split changes of this issue into multiple commits :-) I'm only proposing to revert the one which introduces the C API incompatible changes: the commit 396b583 "bpo-45711: Remove type and traceback from exc_info (GH-30122)".

    Maybe Cython can be quickly updated, but the situation seems to be complicated :-( cython/cython#4500

    @iritkatriel
    Copy link
    Member Author

    That commit has significant changes in ceval.c and compile.c. They don't need to be reverted to unbreak cython. I'm working on a PR for a simpler change.

    Have you tried with CYTHON_USE_EXC_INFO_STACK undefined?

    @iritkatriel
    Copy link
    Member Author

    If this is still the position of cython maintainers:

    cython/cython#4581 (comment)

    then I will need to revert the change until 3.12.

    @gvanrossum
    Copy link
    Member

    Time to insist on directly communicating with the Cython team (esp. @scoder) and broker some kind of compromise.

    @vstinner
    Copy link
    Member

    vstinner commented Feb 1, 2022

    __Pyx_PyErr_GetTopmostException(PyThreadState *tstate)

    Python provides a *private* _PyErr_GetTopmostException(tstate) function, but Cython reimplements its own function. I'm not sure why.

    #74716 proposes adding PyErr_GetActiveException() function which has no parameter, but Cython __Pyx_PyErr_GetTopmostException() has a tstate parameter.

    Simplified example numpy/random/_mt19937.c code:

    static CYTHON_INLINE void
    __Pyx__ExceptionSave(PyThreadState *tstate, PyObject **type,
                         PyObject **value, PyObject **tb)
    {
        _PyErr_StackItem *exc_info = __Pyx_PyErr_GetTopmostException(tstate);
        *type = exc_info->exc_type;
        *value = exc_info->exc_value;
        *tb = exc_info->exc_traceback;
        Py_XINCREF(*type);
        Py_XINCREF(*value);
        Py_XINCREF(*tb);
    }
    
    static CYTHON_INLINE void
    __Pyx__ExceptionReset(PyThreadState *tstate, PyObject *type,
                          PyObject *value, PyObject *tb)
    {
        PyObject *tmp_type, *tmp_value, *tmp_tb;
        _PyErr_StackItem *exc_info = tstate->exc_info;
        tmp_type = exc_info->exc_type;
        tmp_value = exc_info->exc_value;
        tmp_tb = exc_info->exc_traceback;
        exc_info->exc_type = type;
        exc_info->exc_value = value;
        exc_info->exc_traceback = tb;
        Py_XDECREF(tmp_type);
        Py_XDECREF(tmp_value);
        Py_XDECREF(tmp_tb);
    }

    Cython saves/restores the current exception of tstate. Maybe we need to provide a high-level API for that as well?

    @iritkatriel
    Copy link
    Member Author

    This is a backport of @scoder's patch to 0.29.x. (I don't know if this is helpful).

    https://github.com/cython/cython/compare/master...iritkatriel:exc_info?expand=1

    @iritkatriel
    Copy link
    Member Author

    #74716 proposes adding PyErr_GetActiveException() function which has no parameter, but Cython __Pyx_PyErr_GetTopmostException() has a tstate parameter.

    I've now updated it to follow the pattern of other functions, where the is a private function that takes tstate and the public function calls it.

    So it adds in Include/pyerrors.h

    PyAPI_FUNC(PyObject*) PyErr_GetActiveException(void);
    PyAPI_FUNC(void) PyErr_SetActiveException(PyObject *);

    and in Include/cpython/pyerrors.h

    PyAPI_FUNC(PyObject*) _PyErr_GetActiveException(PyThreadState *);
    PyAPI_FUNC(void) _PyErr_SetActiveException(PyThreadState *, PyObject *);

    @da-woods
    Copy link
    Mannequin

    da-woods mannequin commented Feb 1, 2022

    Just a quick comment on Cython and these changes:

    Cython 0.29 can build itself on Python 3.11a4 with CFLAGS="-DCYTHON_FAST_THREAD_STATE=0 -DCYTHON_USE_EXC_INFO_STACK=0" python3.11 setup.py build_ext.

    I think there's some coroutine code left that isn't fixed by those flags, but a large chunk of Cython libraries won't use coroutines and so will work fine with these flags. Hopefully that unblocks some stuff for you.

    @vstinner
    Copy link
    Member

    vstinner commented Mar 3, 2022

    lxml does crash on the current Cython 0.29.x development branch.

    @iritkatriel
    Copy link
    Member Author

    We have agreed on python-dev [1] that the cpython changes will not be reverted, and the issue will be fixed in cython. So I am closing this again.

    [1] https://mail.python.org/archives/list/python-dev@python.org/message/BHIQL4P6F7OPMCAP6U24XEZUPQKI62UT/

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    JukkaL pushed a commit to python/mypy that referenced this issue May 23, 2022
    Eta0 referenced this issue in coreweave/tensorizer Dec 8, 2023
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.11 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage
    Projects
    None yet
    Development

    No branches or pull requests

    6 participants